From patchwork Tue Jan 24 23:41:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrei Vagin X-Patchwork-Id: 47956 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp2429580wrn; Tue, 24 Jan 2023 15:45:12 -0800 (PST) X-Google-Smtp-Source: AMrXdXu+Kjmo7AVerxHj1u/tlh7sepa144oDR+CRQleboysn2TvLoys96nyCYSqUggaZUs4J1GWQ X-Received: by 2002:a05:6a21:339a:b0:a7:345a:100f with SMTP id yy26-20020a056a21339a00b000a7345a100fmr40664417pzb.10.1674603911791; Tue, 24 Jan 2023 15:45:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674603911; cv=none; d=google.com; s=arc-20160816; b=CYASubl+KVepxdfBey0RrOXRTNVHnkv5sng8u4zOABKN4kgcXp/G9+wwmqYTZZYTJR zoON3F4IA8NUTpiCGYrYSnB6rJ3md+8ou7iK3j1/KPfnCXfzNtyF4pJISwnriTcdaJM6 j68ZGjgWu+kr73rBVHDqe2ecBeGR/cFVm3r9g7T8/WPfNY7Lz+r7wHeUIHm8Y5bfWcAr 8I8TzeBpCgeubV629vGO7n6o9lq/NMKwMM/KainVmwmfKj4bIZyzyH563C83gAObvIJy I3BmaIXHho1IuAB2ws+6epmnXlYT8Z1cP0+fmG9uHtaLFf7O0hL/FoFteaio1IevCQtN 3fVQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:from:subject :message-id:references:mime-version:in-reply-to:date:dkim-signature; bh=g6Z/L7bX9MFc/+bSE9Rh2MDjwE5MQawcR3+IL7tl120=; b=R/bhlBu4wdYAa3MnWGjkKjquP+/8JAl+rIIZuMctV2n50FxTZGav0J9whaXlINyD9D sVkYPRzUOnsm9i+VvR7szOZrUqUlF639Pf8rBL3ASZ2fxUv8OqDV8urJrjM+gOnqmWss jle56IRt0SVQwmaZMAoJMw2gF6qXMCd83L2yVuwaAiCitanmwL634YxqqHI7a/KGJxP4 cTiaqPvYy0PfaJ50YjYjzosggOXYX9Yfpdf0kNGUwK2xGMDnFF6gsWPPCSM7xjxG7IqJ 9FtsfOleyWY90PDQBt03h002duEKeiLtQSzqMYaTBGScS5wsTrZlW2phbcDc2jJ4SAvj KfhQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=jqgEvr3L; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t7-20020a637807000000b004d378558f31si3582123pgc.136.2023.01.24.15.44.57; Tue, 24 Jan 2023 15:45:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=jqgEvr3L; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232753AbjAXXmX (ORCPT + 99 others); Tue, 24 Jan 2023 18:42:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233999AbjAXXmM (ORCPT ); Tue, 24 Jan 2023 18:42:12 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29B314A21F for ; Tue, 24 Jan 2023 15:42:10 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 69-20020a630148000000b00478118684c4so7602142pgb.20 for ; Tue, 24 Jan 2023 15:42:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=g6Z/L7bX9MFc/+bSE9Rh2MDjwE5MQawcR3+IL7tl120=; b=jqgEvr3LLSFk3IpBhw0lkz7ygaaSYsrv9PmPLfWIjqQS5/6kHgpMdG2b3XIz6KOvrB fcHWrb+YGbCLFZO9QtB+/KXICKIjUl/DyBYphIVlYvFbNQcThhXjzaZDBP5ysEj1+fvg HYAZD+//ffgLjsg9uPIzBMRJuDW2FbC631nadBYc+jB2AcCjEJtbG6EI5OyS/R6Bu+NC z7NlSrJSg9mOw1FfgiF/OSwIZ5Ed41/lpzau0U5fK5jrceP0zUfUorq7YswjPFasqjVZ GmpUp/cFG0QCFMsFpp0JmEBDUhcnKdJZDO9a+aqbdpcXxBut6G3R+WpwAYlk6jwrnJpb 79YA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=g6Z/L7bX9MFc/+bSE9Rh2MDjwE5MQawcR3+IL7tl120=; b=QY18gfw1XgAH79y17WAmKTDloDy9WnZHyHohL58IPBkCeSKm1wCwhlSnHqtBnNTcMU yMVdUUqrzncsrYTaiS+LHSttNOgBTUQG+zijr4t8nul9t1pvHPEfRhmMqpK7T5p4WZGt vn42XsH+WD1NQkWycJzqAsf7/v6K1tw3HdxhJuCkNog+EL6pYc4fHhCwIhylvqKCaw5v YobbUNbhS2iO6Q1ElShXlHVBjQMnkqC1GqIAdh9aTKAYXg7qE+ldRVZNKr4W3iJeD1+A 5L8RaptX9aVLdKSbjCR/tMzEP0R3Gp1LVlteqb4dGJaDm0Ak4+Io6dDlcfCqa9rrwLT9 95/A== X-Gm-Message-State: AFqh2kozd8iCeyv2NYqPOT40ZQ63Ypg2VAq5xpV1BzgKU8gBfr2CsxH/ BSGDT8yQ3P1rPT4RKnmhgZNHVul0NEc= X-Received: from avagin.kir.corp.google.com ([2620:0:1008:11:cf1b:2f7f:3ca1:6488]) (user=avagin job=sendgmr) by 2002:a05:6a00:400c:b0:58b:c1a4:3c29 with SMTP id by12-20020a056a00400c00b0058bc1a43c29mr3151066pfb.32.1674603729454; Tue, 24 Jan 2023 15:42:09 -0800 (PST) Date: Tue, 24 Jan 2023 15:41:54 -0800 In-Reply-To: <20230124234156.211569-1-avagin@google.com> Mime-Version: 1.0 References: <20230124234156.211569-1-avagin@google.com> X-Mailer: git-send-email 2.39.1.405.gd4c25cc71f-goog Message-ID: <20230124234156.211569-5-avagin@google.com> Subject: [PATCH 4/6] seccomp: add the synchronous mode for seccomp_unotify From: Andrei Vagin To: Peter Zijlstra Cc: linux-kernel@vger.kernel.org, Kees Cook , Christian Brauner , Chen Yu , Andrei Vagin , Andy Lutomirski , Dietmar Eggemann , Ingo Molnar , Juri Lelli , Peter Oskolkov , Tycho Andersen , Will Drewry , Vincent Guittot X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1755949458165254428?= X-GMAIL-MSGID: =?utf-8?q?1755949471627886420?= From: Andrei Vagin seccomp_unotify allows more privileged processes do actions on behalf of less privileged processes. In many cases, the workflow is fully synchronous. It means a target process triggers a system call and passes controls to a supervisor process that handles the system call and returns controls to the target process. In this context, "synchronous" means that only one process is running and another one is waiting. There is the WF_CURRENT_CPU flag that is used to advise the scheduler to move the wakee to the current CPU. For such synchronous workflows, it makes context switches a few times faster. Right now, each interaction takes 12µs. With this patch, it takes about 3µs. This change introduce the SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP flag that it used to enable the sync mode. Signed-off-by: Andrei Vagin --- include/uapi/linux/seccomp.h | 4 ++++ kernel/seccomp.c | 31 +++++++++++++++++++++++++++++-- 2 files changed, 33 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h index 0fdc6ef02b94..dbfc9b37fcae 100644 --- a/include/uapi/linux/seccomp.h +++ b/include/uapi/linux/seccomp.h @@ -115,6 +115,8 @@ struct seccomp_notif_resp { __u32 flags; }; +#define SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP (1UL << 0) + /* valid flags for seccomp_notif_addfd */ #define SECCOMP_ADDFD_FLAG_SETFD (1UL << 0) /* Specify remote fd */ #define SECCOMP_ADDFD_FLAG_SEND (1UL << 1) /* Addfd and return it, atomically */ @@ -150,4 +152,6 @@ struct seccomp_notif_addfd { #define SECCOMP_IOCTL_NOTIF_ADDFD SECCOMP_IOW(3, \ struct seccomp_notif_addfd) +#define SECCOMP_IOCTL_NOTIF_SET_FLAGS SECCOMP_IOW(4, __u64) + #endif /* _UAPI_LINUX_SECCOMP_H */ diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 876022e9c88c..0a62d44f4898 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -143,9 +143,12 @@ struct seccomp_kaddfd { * filter->notify_lock. * @next_id: The id of the next request. * @notifications: A list of struct seccomp_knotif elements. + * @flags: A set of SECCOMP_USER_NOTIF_FD_* flags. */ + struct notification { atomic_t requests; + u32 flags; u64 next_id; struct list_head notifications; }; @@ -1117,7 +1120,10 @@ static int seccomp_do_user_notification(int this_syscall, INIT_LIST_HEAD(&n.addfd); atomic_add(1, &match->notif->requests); - wake_up_poll(&match->wqh, EPOLLIN | EPOLLRDNORM); + if (match->notif->flags & SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP) + wake_up_poll_on_current_cpu(&match->wqh, EPOLLIN | EPOLLRDNORM); + else + wake_up_poll(&match->wqh, EPOLLIN | EPOLLRDNORM); /* * This is where we wait for a reply from userspace. @@ -1593,7 +1599,10 @@ static long seccomp_notify_send(struct seccomp_filter *filter, knotif->error = resp.error; knotif->val = resp.val; knotif->flags = resp.flags; - complete(&knotif->ready); + if (filter->notif->flags & SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP) + complete_on_current_cpu(&knotif->ready); + else + complete(&knotif->ready); out: mutex_unlock(&filter->notify_lock); return ret; @@ -1623,6 +1632,22 @@ static long seccomp_notify_id_valid(struct seccomp_filter *filter, return ret; } +static long seccomp_notify_set_flags(struct seccomp_filter *filter, + unsigned long flags) +{ + long ret; + + if (flags & ~SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP) + return -EINVAL; + + ret = mutex_lock_interruptible(&filter->notify_lock); + if (ret < 0) + return ret; + filter->notif->flags = flags; + mutex_unlock(&filter->notify_lock); + return 0; +} + static long seccomp_notify_addfd(struct seccomp_filter *filter, struct seccomp_notif_addfd __user *uaddfd, unsigned int size) @@ -1752,6 +1777,8 @@ static long seccomp_notify_ioctl(struct file *file, unsigned int cmd, case SECCOMP_IOCTL_NOTIF_ID_VALID_WRONG_DIR: case SECCOMP_IOCTL_NOTIF_ID_VALID: return seccomp_notify_id_valid(filter, buf); + case SECCOMP_IOCTL_NOTIF_SET_FLAGS: + return seccomp_notify_set_flags(filter, arg); } /* Extensible Argument ioctls */