Message ID | 20221122161240.137570-1-pskocik@gmail.com |
---|---|
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp2304065wrr; Tue, 22 Nov 2022 08:17:38 -0800 (PST) X-Google-Smtp-Source: AA0mqf7lCwpiAIWQE7vDxjtIcEz0DM63eTwZyDxdWN2YGiimWIjtp583I4PCATBKrLez1GcUrjcD X-Received: by 2002:a63:d143:0:b0:476:c8c5:ba85 with SMTP id c3-20020a63d143000000b00476c8c5ba85mr22011963pgj.182.1669133858545; Tue, 22 Nov 2022 08:17:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669133858; cv=none; d=google.com; s=arc-20160816; b=mG1jDWPaZjEFQq8qiR6cU5zrSJgoBtKfxOSTBhrpqvo6OWquhAts3VOzwmItDuRVQf JYPPUGHKL9bgcSWjIn0X261rxO3Yt7hqH2iKR9eBB+0or4DRoF5T3mt0vfZbz/5BDlj8 84GDv4l9PsDqSFoYZTWaE5N2Slig0PxyiFJwriPaj2pr3MhPAHeKGGOweiWtxy+U9O4P f4JyEL4H+al04Ep4hZWeGeROffrqOpykj1TU/tiAtZ/howxz8nP0kUlYujCk6diNszmK B3JIy9gnjQ/3Tm2AFAL3gztawvqbLO3E2eeTvRxNYB8Z3+Vmrq+VZIpJwmAT9FBsrn+w 3qcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=mC90GwNhm6j7BlWHBWFQA6hS+M8xqG7H71ScqSgu+SE=; b=pdL6Se5tiLxGp5CTGCiZnorQE+3jW2TYmBg1Pktw1mKdMfy69zHLO/lOZA+rrLMsGf 4lWxujdaMcPpBPXKHRxqcAhhJvaetbVIFRbYoBM+MGAP/+2PTXJCYLh4SifagilWsIGX KxpX1oJrPt5BDZo95P0Ks6QZCqSgb+kFp+lytWh8Gk0qMQF98dZnvxD5aYmCtVIb9GaV zE20EVSrSd039r5hjD2aL54sjqV694lzHhTfW2MM0KUsCbMgmzEKXbhfGTH7U6ryPCYJ 08eFm6Tt9mgR/VEt/URNMmREh8g8AXEc5UivD2GIm9yXV7DjU4kH1m3B/QrJ8BSi6Dob GldQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=duED85v8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ja10-20020a170902efca00b00186ab03eebfsi3940544plb.418.2022.11.22.08.17.22; Tue, 22 Nov 2022 08:17:38 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=duED85v8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234376AbiKVQQS (ORCPT <rfc822;cjcooper78@gmail.com> + 99 others); Tue, 22 Nov 2022 11:16:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42990 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234288AbiKVQP7 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 22 Nov 2022 11:15:59 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7345D73431 for <linux-kernel@vger.kernel.org>; Tue, 22 Nov 2022 08:13:51 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id b12so11661860wrn.2 for <linux-kernel@vger.kernel.org>; Tue, 22 Nov 2022 08:13:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=mC90GwNhm6j7BlWHBWFQA6hS+M8xqG7H71ScqSgu+SE=; b=duED85v80csEaRq1YyX8RGCan5YsZObsDE+LxOFw2hRKLd4y74hJ6VQpH2h8i42z3a Clii+1EG793oZfMdmFqoOXa6/ceHR4g9KwtXM/2o4kfpKIHEwQaVTAIE3wvAVwoB/32u wpVuAXugU8a9n8vANtU1W1XTMegemYjVCPGXndxhV8l1D/UznxvyXgXQ+8jCYydRJjfy 6nXiMTWxIdUjqfx/wRrvMOfOyJbyEsQyeEm+M9vEFOoKFpge1FaGl+okkjKZoQYH3F7y R6KCyvD+KFF3Khy2rvirDlDJOuYh1eJzSja6BYJgs7j42ubvK8GD5SxEcVe+FADW1PHF IUAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=mC90GwNhm6j7BlWHBWFQA6hS+M8xqG7H71ScqSgu+SE=; b=Mw6FQgVD7ezAUM7+PXJ1mKe6M0K3IfgUG7LFXUckJXFY153TGkN1i/7heU7/jtqrxY G4U6Ok2t8w560G7CLgrbqZd5OWsi3eYV449fr6GTycvtDo67cIwUUyHlhJsg+SZXfnsd LtMq+4kTeG4z+n6G5wJ+GLbDr8KbRN4SAOL2btSYGd6zYby+HWS4aakImbW988TSZZdW Hta8wjkl9HJJy1lvWX8nRx11iQ52Gy5KNn8bfNCD8HAzghh/Qb9jvkCWhzbKpBoa1sse X2mLfia3WsQbUPBpv20Pxcw18PPiU4T1JLRho4RABIHSeMU0yOU3PgmiG6w/l0XSduNb tl/w== X-Gm-Message-State: ANoB5plEb5fuhB8bpa0aNiaPBhx4dWZ5lkXtxnwc7EgObXk9ayJyS4Eo Hpzkhu7QVSsNekJ4R0T7lYY= X-Received: by 2002:adf:fb4c:0:b0:236:5270:735e with SMTP id c12-20020adffb4c000000b002365270735emr3949996wrs.659.1669133629782; Tue, 22 Nov 2022 08:13:49 -0800 (PST) Received: from localhost.localdomain (2001-1ae9-2f0-fa00-5962-fdfb-2a9a-17bd.ip6.tmcz.cz. [2001:1ae9:2f0:fa00:5962:fdfb:2a9a:17bd]) by smtp.googlemail.com with ESMTPSA id j13-20020adff54d000000b0023655e51c33sm14610846wrp.4.2022.11.22.08.13.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Nov 2022 08:13:48 -0800 (PST) From: Petr Skocik <pskocik@gmail.com> To: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com>, Kees Cook <keescook@chromium.org>, Thomas Gleixner <tglx@linutronix.de>, Peter Zijlstra <peterz@infradead.org>, Marco Elver <elver@google.com>, linux-kernel@vger.kernel.org, Petr Skocik <pskocik@gmail.com> Subject: [PATCH 0/1] *** Fix kill(-1,s) returning 0 on 0 kills *** Date: Tue, 22 Nov 2022 17:12:40 +0100 Message-Id: <20221122161240.137570-1-pskocik@gmail.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1750213704756704316?= X-GMAIL-MSGID: =?utf-8?q?1750213704756704316?= |
Series |
*** Fix kill(-1,s) returning 0 on 0 kills ***
|
|
Message
Petr Skocik
Nov. 22, 2022, 4:12 p.m. UTC
Hi. I've never sent a kernel patch before but this one seemed trivial, so I thought I'd give it a shot. My issue: kill(-1,s) on Linux doesn't return -ESCHR when it has nothing to kill. The code sample below demonstrates the problem, which gets fixed by the patch: #define _GNU_SOURCE #include <assert.h> #include <errno.h> #include <signal.h> #include <stdio.h> #include <sys/wait.h> #include <unistd.h> #define VICTIM_UID 4200 //check these are safe to use on your system! #define UNUSED_UID 4300 int main(){ uid_t r,e,s; if(geteuid()) return 1; //requires root privileges //pipe to let the parent know when the child has changed ids int fds[2]; if(0>pipe(fds)) return 1; pid_t pid; if(0>(pid=fork())) return 1; else if(0==pid){ setreuid(VICTIM_UID,VICTIM_UID); getresuid(&r,&e,&s); printf("child: %u %u %u\n", r,e,s); close(fds[0]); close(fds[1]); //let the parent continue for(;;) pause(); } close(fds[1]); read(fds[0],&(char){0},1); //wait for uid change in the child #if 1 setreuid(VICTIM_UID,(uid_t)-1); seteuid(VICTIM_UID); #else setresuid(UNUSED_UID,VICTIM_UID,0); #endif getresuid(&r,&e,&s); printf("parent: %u %u %u\n", r,e,s); //4200 4200 0 int err = kill(-1,-111); (void)err; //test -EINVAL assert(err < 0 && errno == EINVAL); int rc = kill(-1,SIGTERM); //test 0 if(rc>=0) wait(0); int rc2 = kill(-1,SIGTERM); //test -ESCHR printf("1st kill ok==%d; 2nd kill ESRCH==%d\n", rc==0, rc2<0&& errno==ESRCH); } Thank you for considering the patch. Best regards, Petr S. Petr Skocik (1): Fix kill(-1,s) returning 0 on 0 kills kernel/signal.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
Comments
On Tue, Nov 22, 2022 at 05:12:40PM +0100, Petr Skocik wrote: > Hi. I've never sent a kernel patch before but this one seemed trivial, > so I thought I'd give it a shot. > > My issue: kill(-1,s) on Linux doesn't return -ESCHR when it has nothing > to kill. It looks like LTP already tests for this, and gets -ESRCH? https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/containers/pidns/pidns10.c Does it still pass with your change?
On 11/22/22 18:15, Kees Cook wrote: > On Tue, Nov 22, 2022 at 05:12:40PM +0100, Petr Skocik wrote: >> Hi. I've never sent a kernel patch before but this one seemed trivial, >> so I thought I'd give it a shot. >> >> My issue: kill(-1,s) on Linux doesn't return -ESCHR when it has nothing >> to kill. > It looks like LTP already tests for this, and gets -ESRCH? > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/containers/pidns/pidns10.c > > Does it still pass with your change? > I went ahead and ran it and it does pass with the change. But it should be obvious from the code alone too. It's only a few (and fewer after the patch) simple lines of code. The original: int retval = 0, count = 0; struct task_struct * p; for_each_process(p) { if (task_pid_vnr(p) > 1 && !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); ++count; if (err != -EPERM) retval = err; } } ret = count ? retval : -ESRCH; counts kills made after the `task_pid_vnr(p) > 1 && !same_thread_group(p, current)` check. Some, and possibly all, of those kills fail with -EPERM, but the the final line only sets -ESRCH if the count is zero (i.e., the initial check fails). It should be counting only kill attempts that have _not_ returned -EPERM (if all returned -EPERM, then no suitable target was found and a -ESRCH is would be in order -- but it won't be set with the original code!). So the change can be as minor as diff --git a/kernel/signal.c b/kernel/signal.c index d140672185a4..e42076e2332b 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1608,9 +1608,10 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid) !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); - ++count; - if (err != -EPERM) + if (err != -EPERM){ + ++count; retval = err; + } } } ret = count ? retval : -ESRCH; But since the count variable isn't used other than for the zeroness check, I simplified it further into - int retval = 0, count = 0; struct task_struct * p; + ret = -ESRCH; for_each_process(p) { if (task_pid_vnr(p) > 1 && !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); - ++count; if (err != -EPERM) - retval = err; + ret = err; /*either all 0 or all -EINVAL*/ } } - ret = count ? retval : -ESRCH; adding a comment explaining the apparent implicit assumption of the original code that the non-EPERM returns from group_send_sig_info in this context must be either all -EINVAL (bad signal number) or all 0, i.e., there can't be a signal allocation failure (that would be susceptible to being overshadowed by a 0 returned from a later kill) because none of those kills in this context (kill not sigqueue) should require any memory allocation. It's a tiny patch. Cheers, Petr Skocik P.S.: Apologies if the code formatting is off. Sent this one with Thunderbird. Need to work on my CLI mailsending skills.
Hi. Is there anything else I can do to help get this (or some other equivalent change that results in kill(-1,s) returning -ESRCH when it has nothing to kill (like it does on the BSDs), as opposed to the current return value of 0 in that case) incorporated into mainline Linux? It would rather help some of the user software I'm developing, and the slightly new semantics are IMO definitely reasonable (BSDs have them). Basically, the current code: int retval = 0, count = 0; struct task_struct * p; for_each_process(p) { if (task_pid_vnr(p) > 1 && !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); ++count; if (err != -EPERM) retval = err; } } ret = count ? retval : -ESRCH; counts kill attempts at non-1, other-process pids and sets hardcoded -ESRCH only if no such attempts are made, which will almost never happen for a nonroot EUID, because there will typically be non-pid-1 processes unkillable by the nonroot EUID, but the code will still count those kill attempts, and thus not return the hardcoded -ESRCH even if ALL of those kill attemtpts return -EPERM, in which case -ESRCH would be in order too, because there were no processes that the current EUID had permission to kill (BDSs indeed return ESRCH in such a case). (The kernel shouldn't need to concern itself with possible racy creation of new EUID-killable processes during the kill(-1,s) walk. Either the system can be known not to have running superuser code that could racily create such EUID-killable processes and then such a kill-returned -ESRCH would be useful, or it cannot be known not to have such running superuser code, in which case the -ESRCH is transient and should be droped by the user). The current code also implicitly assumes either all non-EPERM kill attempts return -EINVAL (invalid signal) or they all return 0 (success). This assumption should be valid because either the signal number is invalid and stays invalid, or it is valid and the only possible error is -EPERM (this isn't sigqueue so the kill shouldn't ever fail with -ENOMEM). If the assumption were not valid, then the current code could overshadow a previous failed attempt with a later succesful one, returning success even if there were some non-EPERM failures. My change proposes: struct task_struct * p; ret = -ESRCH; for_each_process(p) { if (task_pid_vnr(p) > 1 && !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); if (err != -EPERM) ret = err; /*either all 0 or all -EINVAL*/ } } i.e., start with -ESRCH (nothing to kill) and any non-EPERM kill attempts change it to the last return value --either all 0 or all -EINVAL as per the implicit assumption of the original code. It passes the tests put forth by Kees Cook. More defensively, the implicit assumption of the original code could be made explicit: struct task_struct * p; int has_last_err = 0; ret = -ESRCH; for_each_process(p) { if (task_pid_vnr(p) > 1 && !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); if (err != -EPERM){ if (has_last_err) BUG_ON(ret != err); /*either all 0 or all -EINVAL*/ has_last_err = 1; ret = err; } } } or dropped; struct task_struct * p; int has_last_err = 0; ret = -ESRCH; for_each_process(p) { if (task_pid_vnr(p) > 1 && !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); if (err != -EPERM){ if (has_last_err){ if (err >= 0) continue; /*don't mask previous failure with later success*/ } has_last_err = 1; ret = err; } } } Thanks again for consideration. Criticism welcome. Regards, Petr Skocik On 11/22/22 18:15, Kees Cook wrote: > On Tue, Nov 22, 2022 at 05:12:40PM +0100, Petr Skocik wrote: >> Hi. I've never sent a kernel patch before but this one seemed trivial, >> so I thought I'd give it a shot. >> >> My issue: kill(-1,s) on Linux doesn't return -ESCHR when it has nothing >> to kill. > It looks like LTP already tests for this, and gets -ESRCH? > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/containers/pidns/pidns10.c > > Does it still pass with your change? >
Petr Skocik <pskocik@gmail.com> writes: > Hi. > > Is there anything else I can do to help get this (or some other equivalent > change that results in kill(-1,s) returning -ESRCH when it has nothing to kill > (like it does on the BSDs), > as opposed to the current return value of 0 in that case) incorporated into > mainline Linux? I think you are talking about a rare enough case that we can safely change the error handling behavior without real risk of trouble. I think there is room for cleanup here. I don't think we can change the set of processes signaled. The linux man page should be updated to note that we skip sending a signal to ourselves in the event of -1. Reading the code the error handling logic is dubious. POSIX provides some guidance it says: If pid is -1, sig shall be sent to all processes (excluding an unspecified set of system processes) for which the process has permission to send that signal. [EINVAL] The value of the sig argument is an invalid or unsupported signal number. [EPERM] The process does not have permission to send the signal to any receiving process. [ESRCH] No process or process group can be found corresponding to that specified by pid. > if (err != -EPERM) > ret = err; /*either all 0 or all -EINVAL*/ The comment in your proposed patch is wrong: -EAGAIN an be returned in the case of real time signals. -ENOMEM can be returned due to linux audit. -EINVAL can be returned, but arguably it should be caught before we even go into the loop. Given that the comment is wrong I don't like what you have done with the error handling logic. It just adds confusion. My question: What would a good and carefully implemented version of kill(2) return? Today for -pgrp we return 0 if any signal delivery succeeds and the error from the last process in the signal group otherwise. For -1 we return -EINVAL if the signal is invalid. For -1 we return -ESRCH only if we are the init process and there are no other processes in the system, aka never except when we are the init process in a pid namespace. For -1 otherwise we return the return value of the last process signaled. I would argue that what needs to happen for -1 is: - Return 0 if the signal was sent to any process successfully. - Return -EINVAL for invalid signals. - When everything errors return some error value and not 0. What error value should we return when everything errors? Especially what error value should we return when everything says -EPERM? Should we follow BSD and return ESRCH? Should we follow Posix and return EPERM? Should we follow SYSV unix? Looking at FreeBSD aka: https://cgit.freebsd.org/src/tree/sys/kern/kern_sig.c?id=9e283cf9a2fe2b3aa6e4a47a012fd43b4e49ebec kill(2) aka killpg1 only returns 0 or ESRCH when sending a signal to multiple processes (after passing the EINVAL) check. The man pages for AIX and Solaris suggest that -EPERM is returned when things don't work. So what should Linux do? Historically Linux signaling is very SysV unix with a some compatibility flags for BSD where it matters. So I am not convinced that return ESRCH in this case is the right answer. Basing the logic off of __kill_pgrp_info I would do: diff --git a/kernel/signal.c b/kernel/signal.c index b5370fe5c198..369499ebb8e2 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1602,7 +1602,8 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid) ret = __kill_pgrp_info(sig, info, pid ? find_vpid(-pid) : task_pgrp(current)); } else { - int retval = 0, count = 0; + bool found = false, success = false; + int retval = 0; struct task_struct * p; for_each_process(p) { @@ -1610,12 +1611,12 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid) !same_thread_group(p, current)) { int err = group_send_sig_info(sig, info, p, PIDTYPE_MAX); - ++count; - if (err != -EPERM) - retval = err; + found = true; + success |= !err; + retval = err; } } - ret = count ? retval : -ESRCH; + ret = success ? 0 : (found ? retval : -ESRCH); } read_unlock(&tasklist_lock); I think that fits in better with Linux, and doesn't have any surprising behavior. > It would rather help some of the user software I'm developing, and the slightly > new semantics are IMO definitely reasonable (BSDs have them). Would my proposal above work for the software you are developing? The behavior your patch was implementing was: ret = success ? 0 : ((retval == -EINVAL)? -EINVAL : -ESRCH); Which gives less information. So I am not thrilled by it. Eric
On 8/10/23 18:16, Eric W. Biederman wrote: > Petr Skocik <pskocik@gmail.com> writes: > >> Hi. >> >> Is there anything else I can do to help get this (or some other equivalent >> change that results in kill(-1,s) returning -ESRCH when it has nothing to kill >> (like it does on the BSDs), >> as opposed to the current return value of 0 in that case) incorporated into >> mainline Linux? > I think you are talking about a rare enough case that we can safely > change the error handling behavior without real risk of trouble. > > I think there is room for cleanup here. > > I don't think we can change the set of processes signaled. The linux > man page should be updated to note that we skip sending a signal > to ourselves in the event of -1. > > Reading the code the error handling logic is dubious. > > POSIX provides some guidance it says: > > If pid is -1, sig shall be sent to all processes (excluding an > unspecified set of system processes) for which the process has > permission to send that signal. > > [EINVAL] > The value of the sig argument is an invalid or unsupported signal number. > [EPERM] > The process does not have permission to send the signal to any receiving process. > [ESRCH] > No process or process group can be found corresponding to that specified by pid. > >> if (err != -EPERM) >> ret = err; /*either all 0 or all -EINVAL*/ > The comment in your proposed patch is wrong: > -EAGAIN an be returned in the case of real time signals. > -ENOMEM can be returned due to linux audit. > -EINVAL can be returned, but arguably it should be caught > before we even go into the loop. > > Given that the comment is wrong I don't like what you have done with the > error handling logic. It just adds confusion. > > My question: What would a good and carefully implemented version > of kill(2) return? > > Today for -pgrp we return 0 if any signal delivery succeeds and > the error from the last process in the signal group otherwise. > > For -1 we return -EINVAL if the signal is invalid. > For -1 we return -ESRCH only if we are the init process and > there are no other processes in the system, aka never except > when we are the init process in a pid namespace. > For -1 otherwise we return the return value of the last > process signaled. > > I would argue that what needs to happen for -1 is: > - Return 0 if the signal was sent to any process successfully. > - Return -EINVAL for invalid signals. > - When everything errors return some error value and not 0. > > What error value should we return when everything errors? > Especially what error value should we return when everything > says -EPERM? > > Should we follow BSD and return ESRCH? > Should we follow Posix and return EPERM? > Should we follow SYSV unix? > > Looking at FreeBSD aka: > https://cgit.freebsd.org/src/tree/sys/kern/kern_sig.c?id=9e283cf9a2fe2b3aa6e4a47a012fd43b4e49ebec > kill(2) aka killpg1 only returns 0 or ESRCH when sending a signal > to multiple processes (after passing the EINVAL) check. > > The man pages for AIX and Solaris suggest that -EPERM is returned when > things don't work. > > So what should Linux do? > > Historically Linux signaling is very SysV unix with a some compatibility > flags for BSD where it matters. So I am not convinced that return > ESRCH in this case is the right answer. > > Basing the logic off of __kill_pgrp_info I would do: > > diff --git a/kernel/signal.c b/kernel/signal.c > index b5370fe5c198..369499ebb8e2 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1602,7 +1602,8 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid) > ret = __kill_pgrp_info(sig, info, > pid ? find_vpid(-pid) : task_pgrp(current)); > } else { > - int retval = 0, count = 0; > + bool found = false, success = false; > + int retval = 0; > struct task_struct * p; > > for_each_process(p) { > @@ -1610,12 +1611,12 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid) > !same_thread_group(p, current)) { > int err = group_send_sig_info(sig, info, p, > PIDTYPE_MAX); > - ++count; > - if (err != -EPERM) > - retval = err; > + found = true; > + success |= !err; > + retval = err; > } > } > - ret = count ? retval : -ESRCH; > + ret = success ? 0 : (found ? retval : -ESRCH); > } > read_unlock(&tasklist_lock); > > I think that fits in better with Linux, and doesn't have any surprising > behavior. > >> It would rather help some of the user software I'm developing, and the slightly >> new semantics are IMO definitely reasonable (BSDs have them). > Would my proposal above work for the software you are developing? > > The behavior your patch was implementing was: > ret = success ? 0 : ((retval == -EINVAL)? -EINVAL : -ESRCH); > > Which gives less information. So I am not thrilled by it. > > Eric > > > Thanks for the detailed analysis, Eric W. Biederman. All my software really cares about is that I get some indication that a kill(-1,s) run from a non-root pid no longer had anything left to kill, which on Linux is currently being masked by a return value of 0 whereas BDSs nicely provide an ESRCH. -EPERM would work too (and would still be more useful to me than the current behavior), but I will still object to it because I'm convinced you're misreading POSIX here and ESRCH, not EPERM, is the error that should be returned here. You see, while the POSIX guidance for EPERM-returns from kill [EPERM] The process does not have permission to send the signal to any receiving process. does indeed seem to suggest that EPERM might be right here, the issue is that the receiving processes that returned -EPERM in the loop were formally NOT the receiving processes of kill(-1,s) at all If pid is -1, sig shall be sent to all processes (excluding an unspecified set of system processes) for which the process has permission to send that signal. The -EPERM-returning internal kills (group_send_sig_info), according to the POSIX rules for kill(-1,s), *never* should have happened. It's harmless that they did happen as part of the implementation, given that those attempts aren't externally observable anyway, but this implementation detail should not leak out. Since all targets that for kill(-1,s) internally returned -EPERM formally shouldn't have been tried in the first place, then if all tried processes returned -EPERM, there was no process to try to kill and an -ESRCH is in order. No need to diverge from the BSDs here. That is why the original code had a branch to disregard internal EPERM returns and why this branch should be preserved in any patches so that kill(-1,s) should continue to NEVER return -EPERM. Returning it would contradict the spec (kill(-1,s) kills all it has permission to kill so it's nonsensical for it to report that it lacks that permission). As I said in previous messages, especially my latest one, I don't object to dropping the apparent implicit assumption of the current code that there can be no masking of a previous non-EPERM error by a later success, or to making it explicit with some assert/BUG_ON statement. Please see the code examples for both of these other alternatives in my previous message. However, any other implementation (including the one you suggested) is welcome and thank you very much for your analysis and willingness to pick this. Best regards, Petr Skocik
Petr Skocik <pskocik@gmail.com> writes: > Thanks for the detailed analysis, Eric W. Biederman. > > All my software really cares about is that I get some indication that a > kill(-1,s) run from a non-root pid no longer had anything left to kill, > which on Linux is currently being masked by a return value of 0 whereas BDSs > nicely provide an ESRCH. -EPERM would work too (and would still be more useful > to me than the current behavior), but I will still object to it because I'm > convinced you're misreading POSIX here and ESRCH, not EPERM, is the error that > should be returned here. Thank you for saying any error return is good enough for your application. It is definitely a bug that Linux reports success when no signal has been delivered. I dug into this a little bit more and found that Illumos and it's ancestor OpenSolaris can return EPERM, even when sending to all processes, by reading the Illumos source code. Reading through the rational of kill it says that it is sometimes desirable to hide the existence of one process from another so that the existence of a process will not be an information leak. To accommodate that POSIX allows ESRCH instead of EPERM as an error code. If you want you can read it for yourself here: https://pubs.opengroup.org/onlinepubs/9699919799/functions/kill.html To sum up. The function kill(2) should always report success when it has delivered a signal and not otherwise. The Linux version of kill(2) is buggy because it reports success when it has not delivered a signal. Different implementations of kill(2) do different things in this situation and POSIX appears to allow the variation, so there is no strong argument for any specific behavior (other than returning an error) from a compatibility standpoint. From my perspective making the implementation of sending a signal to all processes (-1) handle errors the same as sending a signal to a process group (-pgrp) seems like the most sensible way to fix this bug in Linux. I can see an argument for hiding the existence of processes and returning ESRCH but if/when we go down that road I would just ask that we be consistent and update all of the signal sending functions at the same time. I will see about writing a commit message and posting a final patch in just a little bit. Eric
Thanks. I appreciate your patch and your researching of this. I still think returning -EPERM for kill(-1,s) (unlike for kill(-pgrp,s), where it *can* make sense) is nonsensical because of how POSIX specifies kill(-1,sig) specifically ("sig shall be sent to all processes (excluding an unspecified set of system processes) for which the process has permission to send that signal"). But as I said, any error will do for me, so I am still grateful for your patch. (The way I see it, the POSIX-mentioned possible hiding of processes via ESRCH is a completely different matter. In kill(-1,sig) specifically, targets that would return -EPERM are excluded/hidden by virtue of the definition of kill(-1,sig), which makes it different from other types of kills for which there's no generic need to hide EPERMs (only optional specific need, hence the paragraph in the POSIX spec on processes with a different security label)). Best regards, Petr Skocik