From patchwork Fri Nov 11 07:31:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrei Vagin
X-Patchwork-Id: 18515
Date: Thu, 10 Nov 2022 23:31:50 -0800
In-Reply-To: <20221111073154.784261-1-avagin@google.com>
References: <20221111073154.784261-1-avagin@google.com>
X-Mailer: git-send-email 2.38.1.493.g58b659f92b-goog
Message-ID: <20221111073154.784261-2-avagin@google.com>
Subject: [PATCH 1/5] seccomp: don't use semaphore and wait_queue together
From: Andrei Vagin
To: Kees Cook, Peter Zijlstra, Christian Brauner
Cc: linux-kernel@vger.kernel.org, Andrei Vagin, Andy Lutomirski,
 Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Oskolkov,
 Tycho Andersen, Will Drewry, Vincent Guittot
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1749184133404312401?=
X-GMAIL-MSGID: =?utf-8?q?1749184133404312401?=

From: Andrei Vagin

The main reason is to use new wake_up helpers that will be added in the
following patches. But here are a few other reasons:

* if we use two different ways, we always need to call them both. This
  patch fixes seccomp_notify_recv where we forgot to call wake_up_poll
  in the error path.

* If we use one primitive, we can control how many waiters are woken up
  for each request. Our goal is to wake up just one that will handle a
  request. Right now, wake_up_poll can wake up one waiter and
  up(&match->notif->request) can wake up one more.

Signed-off-by: Andrei Vagin
---
 kernel/seccomp.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index e9852d1b4a5e..876022e9c88c 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -145,7 +145,7 @@ struct seccomp_kaddfd {
  * @notifications: A list of struct seccomp_knotif elements.
  */
 struct notification {
-	struct semaphore request;
+	atomic_t requests;
 	u64 next_id;
 	struct list_head notifications;
 };
@@ -1116,7 +1116,7 @@ static int seccomp_do_user_notification(int this_syscall,
 	list_add_tail(&n.list, &match->notif->notifications);
 	INIT_LIST_HEAD(&n.addfd);

-	up(&match->notif->request);
+	atomic_add(1, &match->notif->requests);
 	wake_up_poll(&match->wqh, EPOLLIN | EPOLLRDNORM);

 	/*
@@ -1450,6 +1450,37 @@ find_notification(struct seccomp_filter *filter, u64 id)
 	return NULL;
 }

+static int recv_wake_function(wait_queue_entry_t *wait, unsigned int mode, int sync,
+				  void *key)
+{
+	/* Avoid a wakeup if event not interesting for us. */
+	if (key && !(key_to_poll(key) & (EPOLLIN | EPOLLERR)))
+		return 0;
+	return autoremove_wake_function(wait, mode, sync, key);
+}
+
+static int recv_wait_event(struct seccomp_filter *filter)
+{
+	DEFINE_WAIT_FUNC(wait, recv_wake_function);
+	int ret;
+
+	if (atomic_add_unless(&filter->notif->requests, -1, 0) != 0)
+		return 0;
+
+	for (;;) {
+		ret = prepare_to_wait_event(&filter->wqh, &wait, TASK_INTERRUPTIBLE);
+
+		if (atomic_add_unless(&filter->notif->requests, -1, 0) != 0)
+			break;
+
+		if (ret)
+			return ret;
+
+		schedule();
+	}
+	finish_wait(&filter->wqh, &wait);
+	return 0;
+}

 static long seccomp_notify_recv(struct seccomp_filter *filter,
 				void __user *buf)
@@ -1467,7 +1498,7 @@ static long seccomp_notify_recv(struct seccomp_filter *filter,

 	memset(&unotif, 0, sizeof(unotif));

-	ret = down_interruptible(&filter->notif->request);
+	ret = recv_wait_event(filter);
 	if (ret < 0)
 		return ret;

@@ -1515,7 +1546,8 @@ static long seccomp_notify_recv(struct seccomp_filter *filter,
 			if (should_sleep_killable(filter, knotif))
 				complete(&knotif->ready);
 			knotif->state = SECCOMP_NOTIFY_INIT;
-			up(&filter->notif->request);
+			atomic_add(1, &filter->notif->requests);
+			wake_up_poll(&filter->wqh, EPOLLIN | EPOLLRDNORM);
 		}
 		mutex_unlock(&filter->notify_lock);
 	}
@@ -1777,7 +1809,6 @@ static struct file *init_listener(struct seccomp_filter *filter)
 	if (!filter->notif)
 		goto out;

-	sema_init(&filter->notif->request, 0);
 	filter->notif->next_id = get_random_u64();
 	INIT_LIST_HEAD(&filter->notif->notifications);


From patchwork Fri Nov 11 07:31:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrei Vagin
X-Patchwork-Id: 18516
Date: Thu, 10 Nov 2022 23:31:51 -0800
In-Reply-To: <20221111073154.784261-1-avagin@google.com>
References: <20221111073154.784261-1-avagin@google.com>
X-Mailer: git-send-email 2.38.1.493.g58b659f92b-goog
Message-ID: <20221111073154.784261-3-avagin@google.com>
Subject: [PATCH 2/5] sched: add WF_CURRENT_CPU and externise ttwu
From: Andrei Vagin
To: Kees Cook, Peter Zijlstra, Christian Brauner
Cc: linux-kernel@vger.kernel.org, Andrei Vagin, Andy Lutomirski,
 Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Oskolkov,
 Tycho Andersen, Will Drewry, Vincent Guittot
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox:
 =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1749184148688802229?=
X-GMAIL-MSGID: =?utf-8?q?1749184148688802229?=

From: Peter Oskolkov

Add WF_CURRENT_CPU wake flag that advises the scheduler to move the
wakee to the current CPU. This is useful for fast on-CPU context
switching use cases such as UMCG.

In addition, make ttwu external rather than static so that the flag
can be passed to it from outside of sched/core.c.

Signed-off-by: Peter Oskolkov
Signed-off-by: Andrei Vagin
---
 kernel/sched/core.c  |  3 +--
 kernel/sched/fair.c  |  4 ++++
 kernel/sched/sched.h | 13 ++++++++-----
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cb2aa2b54c7a..4b591e7773fd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4039,8 +4039,7 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
  * Return: %true if @p->state changes (an actual wakeup was done),
  *	   %false otherwise.
  */
-static int
-try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
+int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 {
 	unsigned long flags;
 	int cpu, success = 0;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e4a0b8bd941c..4ebe7222664c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7204,6 +7204,10 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	if (wake_flags & WF_TTWU) {
 		record_wakee(p);

+		if ((wake_flags & WF_CURRENT_CPU) &&
+		    cpumask_test_cpu(cpu, p->cpus_ptr))
+			return cpu;
+
 		if (sched_energy_enabled()) {
 			new_cpu = find_energy_efficient_cpu(p, prev_cpu);
 			if (new_cpu >= 0)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a4a20046e586..4c275f41773c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2077,12 +2077,13 @@ static inline int task_on_rq_migrating(struct task_struct *p)
 }

 /* Wake flags. The first three directly map to some SD flag value */
-#define WF_EXEC     0x02 /* Wakeup after exec; maps to SD_BALANCE_EXEC */
-#define WF_FORK     0x04 /* Wakeup after fork; maps to SD_BALANCE_FORK */
-#define WF_TTWU     0x08 /* Wakeup;            maps to SD_BALANCE_WAKE */
+#define WF_EXEC         0x02 /* Wakeup after exec; maps to SD_BALANCE_EXEC */
+#define WF_FORK         0x04 /* Wakeup after fork; maps to SD_BALANCE_FORK */
+#define WF_TTWU         0x08 /* Wakeup;            maps to SD_BALANCE_WAKE */

-#define WF_SYNC     0x10 /* Waker goes to sleep after wakeup */
-#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
+#define WF_SYNC         0x10 /* Waker goes to sleep after wakeup */
+#define WF_MIGRATED     0x20 /* Internal use, task got migrated */
+#define WF_CURRENT_CPU  0x40 /* Prefer to move the wakee to the current CPU. */

 #ifdef CONFIG_SMP
 static_assert(WF_EXEC == SD_BALANCE_EXEC);
@@ -3167,6 +3168,8 @@ static inline bool is_per_cpu_kthread(struct task_struct *p)
 extern void swake_up_all_locked(struct swait_queue_head *q);
 extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);

+extern int try_to_wake_up(struct task_struct *tsk, unsigned int state, int wake_flags);
+
 #ifdef CONFIG_PREEMPT_DYNAMIC
 extern int preempt_dynamic_mode;
 extern int sched_dynamic_mode(const char *str);

From patchwork Fri Nov 11 07:31:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrei Vagin
X-Patchwork-Id: 18518
Date: Thu, 10 Nov 2022 23:31:52 -0800
In-Reply-To: <20221111073154.784261-1-avagin@google.com>
References: <20221111073154.784261-1-avagin@google.com>
X-Mailer: git-send-email 2.38.1.493.g58b659f92b-goog
Message-ID: <20221111073154.784261-4-avagin@google.com>
Subject: [PATCH 3/5] sched: add a few helpers to wake up tasks on the current cpu
From: Andrei Vagin
To: Kees Cook, Peter Zijlstra, Christian Brauner
Cc: linux-kernel@vger.kernel.org, Andrei Vagin, Andy Lutomirski,
 Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Oskolkov,
 Tycho Andersen, Will Drewry, Vincent Guittot
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1749184212752216885?=
X-GMAIL-MSGID: =?utf-8?q?1749184212752216885?=

From: Andrei Vagin

Add complete_on_current_cpu, wake_up_poll_on_current_cpu helpers to wake
up tasks on the current CPU.

These two helpers are useful when the task needs to make a synchronous
context switch to another task. In this context, synchronous means it
wakes up the target task and falls asleep right after that.

One example of such a workload is seccomp user notifications. This
mechanism allows the supervisor process to handle system calls on behalf
of a target process. While the supervisor is handling an intercepted
system call, the target process will be blocked in the kernel, waiting
for a response to come back.

On-CPU context switches are much faster than regular ones.

Signed-off-by: Andrei Vagin
---
 include/linux/completion.h |  1 +
 include/linux/swait.h      |  1 +
 include/linux/wait.h       |  3 +++
 kernel/sched/completion.c  | 12 ++++++++++++
 kernel/sched/core.c        |  2 +-
 kernel/sched/swait.c       | 11 +++++++++++
 kernel/sched/wait.c        |  5 +++++
 7 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 62b32b19e0a8..fb2915676574 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -116,6 +116,7 @@ extern bool try_wait_for_completion(struct completion *x);
 extern bool completion_done(struct completion *x);

 extern void complete(struct completion *);
+extern void complete_on_current_cpu(struct completion *x);
 extern void complete_all(struct completion *);

 #endif
diff --git a/include/linux/swait.h b/include/linux/swait.h
index 6a8c22b8c2a5..1f27b254adf5 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -147,6 +147,7 @@ static inline bool swq_has_sleeper(struct swait_queue_head *wq)
 extern void swake_up_one(struct swait_queue_head *q);
 extern void swake_up_all(struct swait_queue_head *q);
 extern void swake_up_locked(struct swait_queue_head *q);
+extern void swake_up_locked_on_current_cpu(struct swait_queue_head *q);

 extern void prepare_to_swait_exclusive(struct swait_queue_head *q, struct swait_queue *wait, int state);
 extern long prepare_to_swait_event(struct swait_queue_head *q, struct swait_queue *wait, int state);
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 7f5a51aae0a7..c7d3e78a500d 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -210,6 +210,7 @@ __remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq
 }

 void __wake_up(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key);
+void __wake_up_on_current_cpu(struct wait_queue_head *wq_head, unsigned int mode, void *key);
 void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key);
 void __wake_up_locked_key_bookmark(struct wait_queue_head *wq_head,
 		unsigned int mode, void *key, wait_queue_entry_t *bookmark);
@@ -237,6 +238,8 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head);
 #define key_to_poll(m) ((__force __poll_t)(uintptr_t)(void *)(m))
 #define wake_up_poll(x, m)							\
 	__wake_up(x, TASK_NORMAL, 1, poll_to_key(m))
+#define wake_up_poll_on_current_cpu(x, m)					\
+	__wake_up_on_current_cpu(x, TASK_NORMAL, poll_to_key(m))
 #define wake_up_locked_poll(x, m)						\
 	__wake_up_locked_key((x), TASK_NORMAL, poll_to_key(m))
 #define wake_up_interruptible_poll(x, m)					\
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index d57a5c1c1cd9..a1931a79c05a 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -38,6 +38,18 @@ void complete(struct completion *x)
 }
 EXPORT_SYMBOL(complete);

+void complete_on_current_cpu(struct completion *x)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&x->wait.lock, flags);
+
+	if (x->done != UINT_MAX)
+		x->done++;
+	swake_up_locked_on_current_cpu(&x->wait);
+	raw_spin_unlock_irqrestore(&x->wait.lock, flags);
+}
+
 /**
  * complete_all: - signals all threads waiting on this completion
  * @x:  holds the state of this particular completion
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4b591e7773fd..8125e02efd2c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6822,7 +6822,7 @@ asmlinkage __visible void __sched preempt_schedule_irq(void)
 int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
 			  void *key)
 {
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~(WF_SYNC|WF_CURRENT_CPU));
 	return try_to_wake_up(curr->private, mode, wake_flags);
 }
 EXPORT_SYMBOL(default_wake_function);
diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index 76b9b796e695..9ebe23868942 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -31,6 +31,17 @@ void swake_up_locked(struct swait_queue_head *q)
 }
 EXPORT_SYMBOL(swake_up_locked);

+void swake_up_locked_on_current_cpu(struct swait_queue_head *q)
+{
+	struct swait_queue *curr;
+
+	if (list_empty(&q->task_list))
+		return;
+
+	curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
+	try_to_wake_up(curr->task, TASK_NORMAL, WF_CURRENT_CPU);
+	list_del_init(&curr->task_list);
+}
 /*
  * Wake up all waiters. This is an interface which is solely exposed for
  * completions and not for general usage.
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 9860bb9a847c..9a78bca79419 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -157,6 +157,11 @@ void __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
 }
 EXPORT_SYMBOL(__wake_up);

+void __wake_up_on_current_cpu(struct wait_queue_head *wq_head, unsigned int mode, void *key)
+{
+	__wake_up_common_lock(wq_head, mode, 1, WF_CURRENT_CPU, key);
+}
+
 /*
  * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
  */

From patchwork Fri Nov 11 07:31:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andrei Vagin
X-Patchwork-Id: 18520
Date: Thu, 10 Nov 2022 23:31:53 -0800
In-Reply-To: <20221111073154.784261-1-avagin@google.com>
References: <20221111073154.784261-1-avagin@google.com>
X-Mailer: git-send-email 2.38.1.493.g58b659f92b-goog
Message-ID: <20221111073154.784261-5-avagin@google.com>
Subject: [PATCH 4/5] seccomp: add the synchronous mode for seccomp_unotify
From: Andrei Vagin
To: Kees Cook, Peter Zijlstra, Christian Brauner
Cc: linux-kernel@vger.kernel.org, Andrei Vagin, Andy Lutomirski,
 Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Oskolkov,
 Tycho Andersen, Will Drewry, Vincent Guittot
X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no
version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749184116788026087?= X-GMAIL-MSGID: =?utf-8?q?1749184272481871391?= From: Andrei Vagin seccomp_unotify allows more privileged processes do actions on behalf of less privileged processes. In many cases, the workflow is fully synchronous. It means a target process triggers a system call and passes controls to a supervisor process that handles the system call and returns controls to the target process. In this context, "synchronous" means that only one process is running and another one is waiting. There is the WF_CURRENT_CPU flag that is used to advise the scheduler to move the wakee to the current CPU. For such synchronous workflows, it makes context switches a few times faster. Right now, each interaction takes 12µs. With this patch, it takes about 3µs. This change introduce the SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP flag that it used to enable the sync mode. 
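
For illustration only, a supervisor might enable the sync mode roughly as
follows. This is a minimal sketch and not part of the patch: the helper
name enable_sync_wakeup() is hypothetical, and the fallback definitions
assume the uapi values added by this patch ('!' is SECCOMP_IOC_MAGIC).

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/seccomp.h>

/* Fallbacks for uapi headers that predate this patch; values mirror the diff. */
#ifndef SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP
#define SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP (1UL << 0)
#endif
#ifndef SECCOMP_IOCTL_NOTIF_SET_FLAGS
/* Same encoding as SECCOMP_IOW(4, __u64); '!' is SECCOMP_IOC_MAGIC. */
#define SECCOMP_IOCTL_NOTIF_SET_FLAGS _IOW('!', 4, __u64)
#endif

/*
 * Hypothetical helper: request current-CPU wake-ups for all notifications
 * that go through listener_fd. The flags are passed by value, matching how
 * seccomp_notify_set_flags() in this patch reads its argument.
 * Returns 0 on success, -1 with errno set on failure.
 */
int enable_sync_wakeup(int listener_fd)
{
	return ioctl(listener_fd, SECCOMP_IOCTL_NOTIF_SET_FLAGS,
		     SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP);
}
```

A supervisor would call this once on the fd obtained with
SECCOMP_FILTER_FLAG_NEW_LISTENER, before entering its
SECCOMP_IOCTL_NOTIF_RECV / SECCOMP_IOCTL_NOTIF_SEND loop. On a kernel
without this patch the call is expected to fail, in which case the
supervisor can simply continue with ordinary wake-ups.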

Signed-off-by: Andrei Vagin
---
 include/uapi/linux/seccomp.h |  4 ++++
 kernel/seccomp.c             | 31 +++++++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0fdc6ef02b94..dbfc9b37fcae 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -115,6 +115,8 @@ struct seccomp_notif_resp {
 	__u32 flags;
 };
 
+#define SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP (1UL << 0)
+
 /* valid flags for seccomp_notif_addfd */
 #define SECCOMP_ADDFD_FLAG_SETFD	(1UL << 0) /* Specify remote fd */
 #define SECCOMP_ADDFD_FLAG_SEND		(1UL << 1) /* Addfd and return it, atomically */
@@ -150,4 +152,6 @@ struct seccomp_notif_addfd {
 #define SECCOMP_IOCTL_NOTIF_ADDFD	SECCOMP_IOW(3, \
 						struct seccomp_notif_addfd)
 
+#define SECCOMP_IOCTL_NOTIF_SET_FLAGS	SECCOMP_IOW(4, __u64)
+
 #endif /* _UAPI_LINUX_SECCOMP_H */
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 876022e9c88c..0a62d44f4898 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -143,9 +143,12 @@ struct seccomp_kaddfd {
  *           filter->notify_lock.
  * @next_id: The id of the next request.
  * @notifications: A list of struct seccomp_knotif elements.
+ * @flags: A set of SECCOMP_USER_NOTIF_FD_* flags.
  */
+
 struct notification {
 	atomic_t requests;
+	u32 flags;
 	u64 next_id;
 	struct list_head notifications;
 };
@@ -1117,7 +1120,10 @@ static int seccomp_do_user_notification(int this_syscall,
 	INIT_LIST_HEAD(&n.addfd);
 
 	atomic_add(1, &match->notif->requests);
-	wake_up_poll(&match->wqh, EPOLLIN | EPOLLRDNORM);
+	if (match->notif->flags & SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP)
+		wake_up_poll_on_current_cpu(&match->wqh, EPOLLIN | EPOLLRDNORM);
+	else
+		wake_up_poll(&match->wqh, EPOLLIN | EPOLLRDNORM);
 
 	/*
 	 * This is where we wait for a reply from userspace.
@@ -1593,7 +1599,10 @@ static long seccomp_notify_send(struct seccomp_filter *filter,
 	knotif->error = resp.error;
 	knotif->val = resp.val;
 	knotif->flags = resp.flags;
-	complete(&knotif->ready);
+	if (filter->notif->flags & SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP)
+		complete_on_current_cpu(&knotif->ready);
+	else
+		complete(&knotif->ready);
 out:
 	mutex_unlock(&filter->notify_lock);
 	return ret;
@@ -1623,6 +1632,22 @@ static long seccomp_notify_id_valid(struct seccomp_filter *filter,
 	return ret;
 }
 
+static long seccomp_notify_set_flags(struct seccomp_filter *filter,
+				    unsigned long flags)
+{
+	long ret;
+
+	if (flags & ~SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP)
+		return -EINVAL;
+
+	ret = mutex_lock_interruptible(&filter->notify_lock);
+	if (ret < 0)
+		return ret;
+	filter->notif->flags = flags;
+	mutex_unlock(&filter->notify_lock);
+	return 0;
+}
+
 static long seccomp_notify_addfd(struct seccomp_filter *filter,
 				 struct seccomp_notif_addfd __user *uaddfd,
 				 unsigned int size)
@@ -1752,6 +1777,8 @@ static long seccomp_notify_ioctl(struct file *file, unsigned int cmd,
 	case SECCOMP_IOCTL_NOTIF_ID_VALID_WRONG_DIR:
 	case SECCOMP_IOCTL_NOTIF_ID_VALID:
 		return seccomp_notify_id_valid(filter, buf);
+	case SECCOMP_IOCTL_NOTIF_SET_FLAGS:
+		return seccomp_notify_set_flags(filter, arg);
 	}
 
 	/* Extensible Argument ioctls */

From patchwork Fri Nov 11 07:31:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrei Vagin
X-Patchwork-Id: 18521
Date: Thu, 10 Nov 2022 23:31:54 -0800
In-Reply-To: <20221111073154.784261-1-avagin@google.com>
References: <20221111073154.784261-1-avagin@google.com>
X-Mailer: git-send-email 2.38.1.493.g58b659f92b-goog
Message-ID: <20221111073154.784261-6-avagin@google.com>
Subject: [PATCH 5/5] selftest/seccomp: add a new test for the sync mode of
 seccomp_user_notify
From: Andrei Vagin
To: Kees Cook, Peter Zijlstra, Christian Brauner
Cc: linux-kernel@vger.kernel.org, Andrei Vagin, Andy Lutomirski,
 Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Oskolkov,
 Tycho Andersen, Will Drewry, Vincent Guittot

From: Andrei Vagin

Test output:
 RUN           global.user_notification_sync ...
 seccomp_bpf.c:4279:user_notification_sync:basic: 8655 nsec/syscall
 seccomp_bpf.c:4279:user_notification_sync:sync: 2919 nsec/syscall
           OK  global.user_notification_sync

Signed-off-by: Andrei Vagin
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 88 +++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 4ae6c8991307..605c120ba2c2 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -4241,6 +4241,94 @@ TEST(user_notification_addfd_rlimit)
 	close(memfd);
 }
 
+/* USER_NOTIF_BENCH_TIMEOUT is 100 milliseconds. */
+#define USER_NOTIF_BENCH_TIMEOUT  100000000ULL
+#define NSECS_PER_SEC		 1000000000ULL
+
+#ifndef SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP
+#define SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP (1UL << 0)
+#define SECCOMP_IOCTL_NOTIF_SET_FLAGS  SECCOMP_IOW(4, __u64)
+#endif
+
+static uint64_t user_notification_sync_loop(struct __test_metadata *_metadata,
+					    char *test_name, int listener)
+{
+	struct timespec ts;
+	uint64_t start, end, nr;
+	struct seccomp_notif req = {};
+	struct seccomp_notif_resp resp = {};
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+	start = ts.tv_nsec + ts.tv_sec * NSECS_PER_SEC;
+	for (end = start, nr = 0; end - start < USER_NOTIF_BENCH_TIMEOUT; nr++) {
+		memset(&req, 0, sizeof(req));
+		req.pid = 0;
+		ASSERT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req), 0);
+
+		ASSERT_EQ(req.data.nr, __NR_getppid);
+
+		resp.id = req.id;
+		resp.error = 0;
+		resp.val = USER_NOTIF_MAGIC;
+		resp.flags = 0;
+		ASSERT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp), 0);
+
+		clock_gettime(CLOCK_MONOTONIC, &ts);
+		end = ts.tv_nsec + ts.tv_sec * NSECS_PER_SEC;
+	}
+	TH_LOG("%s:\t%lld nsec/syscall", test_name, USER_NOTIF_BENCH_TIMEOUT / nr);
+	return nr;
+}
+
+TEST(user_notification_sync)
+{
+	pid_t pid;
+	long ret;
+	int status, listener;
+	unsigned long calls, sync_calls;
+
+	ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!");
+	}
+
+	listener = user_notif_syscall(__NR_getppid,
+				      SECCOMP_FILTER_FLAG_NEW_LISTENER);
+	ASSERT_GE(listener, 0);
+
+	pid = fork();
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		while (1) {
+			ret = syscall(__NR_getppid);
+			if (ret == USER_NOTIF_MAGIC)
+				continue;
+			break;
+		}
+		_exit(1);
+	}
+
+	calls = user_notification_sync_loop(_metadata, "basic", listener);
+
+	/* Try to set invalid flags. */
+	EXPECT_SYSCALL_RETURN(-EINVAL,
+		ioctl(listener, SECCOMP_IOCTL_NOTIF_SET_FLAGS, 0xffffffff, 0));
+
+	ASSERT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_SET_FLAGS,
+			SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP, 0), 0);
+
+	sync_calls = user_notification_sync_loop(_metadata, "sync", listener);
+
+	EXPECT_GT(sync_calls, calls);
+
+	kill(pid, SIGKILL);
+	ASSERT_EQ(waitpid(pid, &status, 0), pid);
+	ASSERT_EQ(true, WIFSIGNALED(status));
+	ASSERT_EQ(SIGKILL, WTERMSIG(status));
+}
+
+
 /* Make sure PTRACE_O_SUSPEND_SECCOMP requires CAP_SYS_ADMIN. */
 FIXTURE(O_SUSPEND_SECCOMP) {
 	pid_t pid;