From patchwork Tue Nov 22 20:39:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 24562
From: Mathieu Desnoyers
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner,
 "Paul E . McKenney", Boqun Feng, "H . Peter Anvin", Paul Turner,
 linux-api@vger.kernel.org, Christian Brauner, Florian Weimer,
 David.Laight@ACULAB.COM, carlos@redhat.com, Peter Oskolkov,
 Alexander Mikhalitsyn, Chris Kennelly, Mathieu Desnoyers
Subject: [PATCH 04/30] rseq: Extend struct rseq with numa node id
Date: Tue, 22 Nov 2022 15:39:06 -0500
Message-Id: <20221122203932.231377-5-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221122203932.231377-1-mathieu.desnoyers@efficios.com>
References: <20221122203932.231377-1-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Adding the NUMA node id to struct rseq is a straightforward thing to do,
and a good way to figure out if anything in the user-space ecosystem
prevents extending struct rseq.

This NUMA node id field allows memory allocators such as tcmalloc to
take advantage of fast access to the current NUMA node id to perform
NUMA-aware memory allocation.

It can also be useful for implementing fast-paths for NUMA-aware
user-space mutexes.

It also allows implementing getcpu(2) purely in user-space.

Signed-off-by: Mathieu Desnoyers
---
Changes since v4:
- Use __entry->cpu_id as argument for cpu_to_node() in the rseq_update
  tracepoint.
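
Illustration only, not part of the patch: a minimal user-space sketch of
the getcpu(2)-style read that the commit message mentions. The struct
below is a hypothetical, simplified mirror of the leading fields of
struct rseq (real code should include <linux/rseq.h>), and the names
rseq_abi_sketch, rseq_sketch_load32() and rseq_sketch_getcpu() are made
up for this example. It assumes GCC or Clang for the alignment attribute
and the __atomic builtin, and it assumes the area has already been
registered by the calling thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the first fields of struct rseq after this
 * patch. Field order follows the diff: node_id comes right after flags.
 */
struct rseq_abi_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;	/* field added by this patch */
} __attribute__((aligned(32)));

/* With this layout, node_id lands at byte offset 20. */
static_assert(offsetof(struct rseq_abi_sketch, node_id) == 20,
	      "node_id is expected right after flags");

/*
 * One aligned 32-bit relaxed load, which is how this sketch obtains the
 * single-copy atomicity the UAPI comment asks for.
 */
static inline uint32_t rseq_sketch_load32(const uint32_t *p)
{
	return __atomic_load_n(p, __ATOMIC_RELAXED);
}

/*
 * getcpu(2)-like helper: reads the CPU and NUMA node ids that the kernel
 * last stored for the calling thread. Must only be called by the thread
 * that registered the area pointed to by rs.
 */
static inline void rseq_sketch_getcpu(const struct rseq_abi_sketch *rs,
				      uint32_t *cpu, uint32_t *node)
{
	*cpu = rseq_sketch_load32(&rs->cpu_id);
	*node = rseq_sketch_load32(&rs->node_id);
}
```

The two loads are not an atomic pair, so a migration between them could
yield a cpu/node pair from different moments; like the getcpu(2) result,
the values are only a hint by the time the caller uses them. On a kernel
without this patch the same bytes are not updated, so a real caller
would need some form of feature check first, which this sketch leaves
out.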
---
 include/trace/events/rseq.h |  4 +++-
 include/uapi/linux/rseq.h   |  8 ++++++++
 kernel/rseq.c               | 19 +++++++++++++------
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/rseq.h b/include/trace/events/rseq.h
index a04a64bc1a00..dde7a359b4ef 100644
--- a/include/trace/events/rseq.h
+++ b/include/trace/events/rseq.h
@@ -16,13 +16,15 @@ TRACE_EVENT(rseq_update,
 
 	TP_STRUCT__entry(
 		__field(s32, cpu_id)
+		__field(s32, node_id)
 	),
 
 	TP_fast_assign(
 		__entry->cpu_id = raw_smp_processor_id();
+		__entry->node_id = cpu_to_node(__entry->cpu_id);
 	),
 
-	TP_printk("cpu_id=%d", __entry->cpu_id)
+	TP_printk("cpu_id=%d node_id=%d", __entry->cpu_id, __entry->node_id)
 );
 
 TRACE_EVENT(rseq_ip_fixup,
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 05d3c4cdeb40..1cb90a435c5c 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -131,6 +131,14 @@ struct rseq {
 	 */
 	__u32 flags;
 
+	/*
+	 * Restartable sequences node_id field. Updated by the kernel. Read by
+	 * user-space with single-copy atomicity semantics. This field should
+	 * only be read by the thread which registered this data structure.
+	 * Aligned on 32-bit. Contains the current NUMA node ID.
+	 */
+	__u32 node_id;
+
 	/*
 	 * Flexible array member at end of structure, after last feature field.
 	 */
diff --git a/kernel/rseq.c b/kernel/rseq.c
index c1058b3f10ac..e21ad8929958 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -85,15 +85,17 @@
  *   F1.
  */
 
-static int rseq_update_cpu_id(struct task_struct *t)
+static int rseq_update_cpu_node_id(struct task_struct *t)
 {
-	u32 cpu_id = raw_smp_processor_id();
 	struct rseq __user *rseq = t->rseq;
+	u32 cpu_id = raw_smp_processor_id();
+	u32 node_id = cpu_to_node(cpu_id);
 
 	if (!user_write_access_begin(rseq, t->rseq_len))
 		goto efault;
 	unsafe_put_user(cpu_id, &rseq->cpu_id_start, efault_end);
 	unsafe_put_user(cpu_id, &rseq->cpu_id, efault_end);
+	unsafe_put_user(node_id, &rseq->node_id, efault_end);
 	/*
 	 * Additional feature fields added after ORIG_RSEQ_SIZE
 	 * need to be conditionally updated only if
@@ -109,9 +111,9 @@ static int rseq_update_cpu_id(struct task_struct *t)
 	return -EFAULT;
 }
 
-static int rseq_reset_rseq_cpu_id(struct task_struct *t)
+static int rseq_reset_rseq_cpu_node_id(struct task_struct *t)
 {
-	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
+	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED, node_id = 0;
 
 	/*
 	 * Reset cpu_id_start to its initial state (0).
@@ -125,6 +127,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	 */
 	if (put_user(cpu_id, &t->rseq->cpu_id))
 		return -EFAULT;
+	/*
+	 * Reset node_id to its initial state (0).
+	 */
+	if (put_user(node_id, &t->rseq->node_id))
+		return -EFAULT;
 	/*
 	 * Additional feature fields added after ORIG_RSEQ_SIZE
 	 * need to be conditionally reset only if
@@ -299,7 +306,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 		if (unlikely(ret < 0))
 			goto error;
 	}
-	if (unlikely(rseq_update_cpu_id(t)))
+	if (unlikely(rseq_update_cpu_node_id(t)))
 		goto error;
 	return;
 
@@ -346,7 +353,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
-		ret = rseq_reset_rseq_cpu_id(current);
+		ret = rseq_reset_rseq_cpu_node_id(current);
 		if (ret)
 			return ret;
 		current->rseq = NULL;