From patchwork Mon May 29 19:14:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 100388
From: Mathieu Desnoyers
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, "Paul E . McKenney",
 Boqun Feng, "H . Peter Anvin", Paul Turner, linux-api@vger.kernel.org,
 Christian Brauner, Florian Weimer, David.Laight@ACULAB.COM,
 carlos@redhat.com, Peter Oskolkov, Alexander Mikhalitsyn, Chris Kennelly,
 Ingo Molnar, Darren Hart, Davidlohr Bueso, André Almeida,
 libc-alpha@sourceware.org, Steven Rostedt, Jonathan Corbet,
 Noah Goldstein, Daniel Colascione, longman@redhat.com,
 Mathieu Desnoyers, Florian Weimer
Subject: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq
Date: Mon, 29 May 2023 15:14:13 -0400
Message-Id: <20230529191416.53955-2-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
References: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Expose the "on-cpu" state for each thread through struct rseq to allow
adaptive mutexes to decide more accurately between busy-waiting and
calling sys_futex() to release the CPU, based on the on-cpu state of the
mutex owner. It is only provided as an optimization hint, because there
is no guarantee that the page containing this field is in the page cache,
and therefore the scheduler may very well fail to clear the on-cpu state
on preemption.
This is expected to be rare though, and is resolved as soon as the task returns to user-space. The goal is to improve use-cases where the duration of the critical sections for a given lock follows a multi-modal distribution, preventing statistical guesses from doing a good job at choosing between busy-wait and futex wait behavior. Signed-off-by: Mathieu Desnoyers Cc: Peter Zijlstra (Intel) Cc: Jonathan Corbet Cc: Steven Rostedt (Google) Cc: Carlos O'Donell Cc: Florian Weimer Cc: libc-alpha@sourceware.org --- include/linux/sched.h | 16 +++++++++++++++ include/uapi/linux/rseq.h | 41 +++++++++++++++++++++++++++++++++++++ kernel/rseq.c | 43 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 100 insertions(+) diff --git a/include/linux/sched.h b/include/linux/sched.h index eed5d65b8d1f..7741ff10136a 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1311,6 +1311,7 @@ struct task_struct { * with respect to preemption. */ unsigned long rseq_event_mask; + struct rseq_sched_state __user *rseq_sched_state; #endif #ifdef CONFIG_SCHED_MM_CID @@ -2351,11 +2352,20 @@ static inline void rseq_signal_deliver(struct ksignal *ksig, rseq_handle_notify_resume(ksig, regs); } +void __rseq_set_sched_state(struct task_struct *t, unsigned int state); + +static inline void rseq_set_sched_state(struct task_struct *t, unsigned int state) +{ + if (t->rseq_sched_state) + __rseq_set_sched_state(t, state); +} + /* rseq_preempt() requires preemption to be disabled. */ static inline void rseq_preempt(struct task_struct *t) { __set_bit(RSEQ_EVENT_PREEMPT_BIT, &t->rseq_event_mask); rseq_set_notify_resume(t); + rseq_set_sched_state(t, 0); } /* rseq_migrate() requires preemption to be disabled. 
*/ @@ -2376,11 +2386,13 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags) t->rseq_len = 0; t->rseq_sig = 0; t->rseq_event_mask = 0; + t->rseq_sched_state = NULL; } else { t->rseq = current->rseq; t->rseq_len = current->rseq_len; t->rseq_sig = current->rseq_sig; t->rseq_event_mask = current->rseq_event_mask; + t->rseq_sched_state = current->rseq_sched_state; } } @@ -2390,6 +2402,7 @@ static inline void rseq_execve(struct task_struct *t) t->rseq_len = 0; t->rseq_sig = 0; t->rseq_event_mask = 0; + t->rseq_sched_state = NULL; } #else @@ -2405,6 +2418,9 @@ static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs) { } +static inline void rseq_set_sched_state(struct task_struct *t, unsigned int state) +{ +} static inline void rseq_preempt(struct task_struct *t) { } diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h index c233aae5eac9..b28588225fa7 100644 --- a/include/uapi/linux/rseq.h +++ b/include/uapi/linux/rseq.h @@ -37,6 +37,13 @@ enum rseq_cs_flags { (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT), }; +enum rseq_sched_state_flags { + /* + * Task is currently running on a CPU if bit is set. + */ + RSEQ_SCHED_STATE_FLAG_ON_CPU = (1U << 0), +}; + /* * struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always * contained within a single cache-line. It is usually declared as @@ -53,6 +60,31 @@ struct rseq_cs { __u64 abort_ip; } __attribute__((aligned(4 * sizeof(__u64)))); +/* + * rseq_sched_state should be aligned on the cache line size. + */ +struct rseq_sched_state { + /* + * Version of this structure. Populated by the kernel, read by + * user-space. + */ + __u32 version; + /* + * The state is updated by the kernel. Read by user-space with + * single-copy atomicity semantics. This field can be read by any + * userspace thread. Aligned on 32-bit. Contains a bitmask of enum + * rseq_sched_state_flags. 
This field is provided as a hint by the + * scheduler, and requires that the page holding this state is + * faulted-in for the state update to be performed by the scheduler. + */ + __u32 state; + /* + * Thread ID associated with the thread registering this structure. + * Initialized by user-space before registration. + */ + __u32 tid; +}; + /* * struct rseq is aligned on 4 * 8 bytes to ensure it is always * contained within a single cache-line. @@ -148,6 +180,15 @@ struct rseq { */ __u32 mm_cid; + __u32 padding1; + + /* + * Restartable sequences sched_state_ptr field. Initialized by + * userspace to the address at which the struct rseq_sched_state is + * located. Read by the kernel on rseq registration. + */ + __u64 sched_state_ptr; + /* * Flexible array member at end of structure, after last feature field. */ diff --git a/kernel/rseq.c b/kernel/rseq.c index 9de6e35fe679..e36d6deeae77 100644 --- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -87,10 +87,12 @@ static int rseq_update_cpu_node_id(struct task_struct *t) { + struct rseq_sched_state __user *rseq_sched_state = t->rseq_sched_state; struct rseq __user *rseq = t->rseq; u32 cpu_id = raw_smp_processor_id(); u32 node_id = cpu_to_node(cpu_id); u32 mm_cid = task_mm_cid(t); + u32 sched_state = RSEQ_SCHED_STATE_FLAG_ON_CPU; WARN_ON_ONCE((int) mm_cid < 0); if (!user_write_access_begin(rseq, t->rseq_len)) @@ -99,6 +101,7 @@ static int rseq_update_cpu_node_id(struct task_struct *t) unsafe_put_user(cpu_id, &rseq->cpu_id, efault_end); unsafe_put_user(node_id, &rseq->node_id, efault_end); unsafe_put_user(mm_cid, &rseq->mm_cid, efault_end); + unsafe_put_user(sched_state, &rseq_sched_state->state, efault_end); /* * Additional feature fields added after ORIG_RSEQ_SIZE * need to be conditionally updated only if @@ -339,6 +342,18 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs) force_sigsegv(sig); } +/* + * Attempt to update rseq scheduler state. 
+ */ +void __rseq_set_sched_state(struct task_struct *t, unsigned int state) +{ + if (unlikely(t->flags & PF_EXITING)) + return; + pagefault_disable(); + (void) put_user(state, &t->rseq_sched_state->state); + pagefault_enable(); +} + #ifdef CONFIG_DEBUG_RSEQ /* @@ -359,6 +374,29 @@ void rseq_syscall(struct pt_regs *regs) #endif +static int rseq_get_sched_state_ptr(struct rseq __user *rseq, u32 rseq_len, + struct rseq_sched_state __user **_sched_state_ptr) +{ + struct rseq_sched_state __user *sched_state_ptr; + u64 sched_state_ptr_value; + u32 version = 0; + int ret; + + if (rseq_len < offsetofend(struct rseq, sched_state_ptr)) + return 0; + ret = get_user(sched_state_ptr_value, &rseq->sched_state_ptr); + if (ret) + return ret; + sched_state_ptr = (struct rseq_sched_state __user *)(unsigned long)sched_state_ptr_value; + if (!sched_state_ptr) + return 0; + ret = put_user(version, &sched_state_ptr->version); + if (ret) + return ret; + *_sched_state_ptr = sched_state_ptr; + return 0; +} + /* * sys_rseq - setup restartable sequences for caller thread. 
*/ @@ -366,6 +404,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, int, flags, u32, sig) { int ret; + struct rseq_sched_state __user *sched_state_ptr = NULL; if (flags & RSEQ_FLAG_UNREGISTER) { if (flags & ~RSEQ_FLAG_UNREGISTER) @@ -383,6 +422,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, current->rseq = NULL; current->rseq_sig = 0; current->rseq_len = 0; + current->rseq_sched_state = NULL; return 0; } @@ -420,9 +460,12 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, return -EINVAL; if (!access_ok(rseq, rseq_len)) return -EFAULT; + if (rseq_get_sched_state_ptr(rseq, rseq_len, &sched_state_ptr)) + return -EFAULT; current->rseq = rseq; current->rseq_len = rseq_len; current->rseq_sig = sig; + current->rseq_sched_state = sched_state_ptr; /* * If rseq was previously inactive, and has just been * registered, ensure the cpu_id_start and cpu_id fields

From patchwork Mon May 29 19:14:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 100389
From: Mathieu Desnoyers
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, "Paul E . McKenney",
 Boqun Feng, "H . Peter Anvin", Paul Turner, linux-api@vger.kernel.org,
 Christian Brauner, Florian Weimer, David.Laight@ACULAB.COM,
 carlos@redhat.com, Peter Oskolkov, Alexander Mikhalitsyn, Chris Kennelly,
 Ingo Molnar, Darren Hart, Davidlohr Bueso, André Almeida,
 libc-alpha@sourceware.org, Steven Rostedt, Jonathan Corbet,
 Noah Goldstein, Daniel Colascione, longman@redhat.com,
 Mathieu Desnoyers
Subject: [RFC PATCH v2 2/4] selftests/rseq: Add sched_state rseq field and getter
Date: Mon, 29 May 2023 15:14:14 -0400
Message-Id: <20230529191416.53955-3-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
References: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Extend struct rseq in the rseq selftests to include the sched_state field.
Implement a getter function for this field.
Signed-off-by: Mathieu Desnoyers --- tools/testing/selftests/rseq/rseq-abi.h | 42 +++++++++++++++++++++++++ tools/testing/selftests/rseq/rseq.c | 13 ++++++++ tools/testing/selftests/rseq/rseq.h | 5 +++ 3 files changed, 60 insertions(+) diff --git a/tools/testing/selftests/rseq/rseq-abi.h b/tools/testing/selftests/rseq/rseq-abi.h index fb4ec8a75dd4..1092d6750386 100644 --- a/tools/testing/selftests/rseq/rseq-abi.h +++ b/tools/testing/selftests/rseq/rseq-abi.h @@ -37,6 +37,13 @@ enum rseq_abi_cs_flags { (1U << RSEQ_ABI_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT), }; +enum rseq_abi_sched_state_flags { + /* + * Task is currently running on a CPU if bit is set. + */ + RSEQ_ABI_SCHED_STATE_FLAG_ON_CPU = (1U << 0), +}; + /* * struct rseq_abi_cs is aligned on 4 * 8 bytes to ensure it is always * contained within a single cache-line. It is usually declared as @@ -53,6 +60,32 @@ struct rseq_abi_cs { __u64 abort_ip; } __attribute__((aligned(4 * sizeof(__u64)))); +/* + * rseq_abi_sched_state should be aligned on the cache line size. + */ +struct rseq_abi_sched_state { + /* + * Version of this structure. Populated by the kernel, read by + * user-space. + */ + __u32 version; + /* + * The state is updated by the kernel. Read by user-space with + * single-copy atomicity semantics. This field can be read by any + * userspace thread. Aligned on 32-bit, and ideally on cache line size. + * Contains a bitmask of enum rseq_abi_sched_state_flags. This field is + * provided as a hint by the scheduler, and requires that the page + * holding this state is faulted-in for the state update to be + * performed by the scheduler. + */ + __u32 state; + /* + * Thread ID associated with the thread registering this structure. + * Initialized by user-space before registration. + */ + __u32 tid; +}; + /* * struct rseq_abi is aligned on 4 * 8 bytes to ensure it is always * contained within a single cache-line. 
@@ -164,6 +197,15 @@ struct rseq_abi { */ __u32 mm_cid; + __u32 padding1; + + /* + * Restartable sequences sched_state_ptr field. Initialized by + * userspace to the address at which the struct rseq_abi_sched_state is + * located. Read by the kernel on rseq registration. + */ + __u64 sched_state_ptr; + /* * Flexible array member at end of structure, after last feature field. */ diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c index 4e4aa006004c..76925b116054 100644 --- a/tools/testing/selftests/rseq/rseq.c +++ b/tools/testing/selftests/rseq/rseq.c @@ -62,17 +62,28 @@ static int rseq_reg_success; /* At least one rseq registration has succeded. */ /* Allocate a large area for the TLS. */ #define RSEQ_THREAD_AREA_ALLOC_SIZE 1024 +/* Approximation of cacheline size. */ +#define CACHELINE_SIZE 128 + /* Original struct rseq feature size is 20 bytes. */ #define ORIG_RSEQ_FEATURE_SIZE 20 /* Original struct rseq allocation size is 32 bytes. */ #define ORIG_RSEQ_ALLOC_SIZE 32 +static +__thread struct rseq_abi_sched_state __rseq_abi_sched_state __attribute__((tls_model("initial-exec"), aligned(CACHELINE_SIZE))); + static __thread struct rseq_abi __rseq_abi __attribute__((tls_model("initial-exec"), aligned(RSEQ_THREAD_AREA_ALLOC_SIZE))) = { .cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED, }; +static pid_t rseq_gettid(void) +{ + return syscall(__NR_gettid); +} + static int sys_rseq(struct rseq_abi *rseq_abi, uint32_t rseq_len, int flags, uint32_t sig) { @@ -109,6 +120,8 @@ int rseq_register_current_thread(void) /* Treat libc's ownership as a successful registration. 
*/ return 0; } + __rseq_abi_sched_state.tid = rseq_gettid(); + __rseq_abi.sched_state_ptr = (uint64_t)(unsigned long)&__rseq_abi_sched_state; rc = sys_rseq(&__rseq_abi, rseq_size, 0, RSEQ_SIG); if (rc) { if (RSEQ_READ_ONCE(rseq_reg_success)) { diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h index d7364ea4d201..4c14ef3f581f 100644 --- a/tools/testing/selftests/rseq/rseq.h +++ b/tools/testing/selftests/rseq/rseq.h @@ -236,6 +236,11 @@ static inline void rseq_prepare_unload(void) rseq_clear_rseq_cs(); } +static inline struct rseq_abi_sched_state *rseq_get_sched_state(struct rseq_abi *rseq) +{ + return (struct rseq_abi_sched_state *)(unsigned long)rseq->sched_state_ptr; +} + static inline __attribute__((always_inline)) int rseq_cmpeqv_storev(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode, intptr_t *v, intptr_t expect,

From patchwork Mon May 29 19:14:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 100390
From: Mathieu Desnoyers
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, "Paul E . McKenney",
 Boqun Feng, "H . Peter Anvin", Paul Turner, linux-api@vger.kernel.org,
 Christian Brauner, Florian Weimer, David.Laight@ACULAB.COM,
 carlos@redhat.com, Peter Oskolkov, Alexander Mikhalitsyn, Chris Kennelly,
 Ingo Molnar, Darren Hart, Davidlohr Bueso, André Almeida,
 libc-alpha@sourceware.org, Steven Rostedt, Jonathan Corbet,
 Noah Goldstein, Daniel Colascione, longman@redhat.com,
 Mathieu Desnoyers
Subject: [RFC PATCH v2 3/4] selftests/rseq: Implement sched state test program
Date: Mon, 29 May 2023 15:14:15 -0400
Message-Id: <20230529191416.53955-4-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
References: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This is a small test program which can be altered to show whether the
target thread is on-cpu or not, depending on whether it loops on poll()
or does a busy-loop.
Signed-off-by: Mathieu Desnoyers --- tools/testing/selftests/rseq/.gitignore | 1 + tools/testing/selftests/rseq/Makefile | 2 +- .../testing/selftests/rseq/sched_state_test.c | 72 +++++++++++++++++++ 3 files changed, 74 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/rseq/sched_state_test.c diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore index 16496de5f6ce..a8db9f7a7cec 100644 --- a/tools/testing/selftests/rseq/.gitignore +++ b/tools/testing/selftests/rseq/.gitignore @@ -9,3 +9,4 @@ param_test_compare_twice param_test_mm_cid param_test_mm_cid_benchmark param_test_mm_cid_compare_twice +sched_state_test diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile index b357ba24af06..7c8f4f2be74c 100644 --- a/tools/testing/selftests/rseq/Makefile +++ b/tools/testing/selftests/rseq/Makefile @@ -14,7 +14,7 @@ OVERRIDE_TARGETS = 1 TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \ param_test_benchmark param_test_compare_twice param_test_mm_cid \ - param_test_mm_cid_benchmark param_test_mm_cid_compare_twice + param_test_mm_cid_benchmark param_test_mm_cid_compare_twice sched_state_test TEST_GEN_PROGS_EXTENDED = librseq.so diff --git a/tools/testing/selftests/rseq/sched_state_test.c b/tools/testing/selftests/rseq/sched_state_test.c new file mode 100644 index 000000000000..5196b0dd897a --- /dev/null +++ b/tools/testing/selftests/rseq/sched_state_test.c @@ -0,0 +1,72 @@ +// SPDX-License-Identifier: LGPL-2.1 + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include + +#include "rseq.h" + +static struct rseq_abi_sched_state *target_thread_state; + +//TODO: +//Use rseq c.s. and rseq fence to protect access to remote thread's rseq_abi. 
+ +static +void show_sched_state(struct rseq_abi_sched_state *rseq_thread_state) +{ + uint32_t state; + + state = rseq_thread_state->state; + printf("Target thread: %u, ON_CPU=%d\n", + rseq_thread_state->tid, + !!(state & RSEQ_ABI_SCHED_STATE_FLAG_ON_CPU)); +} + +static +void *test_thread(void *arg) +{ + int i; + + for (i = 0; i < 1000; i++) { + show_sched_state(target_thread_state); + (void) poll(NULL, 0, 100); + } + return NULL; +} + +int main(int argc, char **argv) +{ + pthread_t test_thread_id; + int i; + + if (rseq_register_current_thread()) { + fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n", + errno, strerror(errno)); + goto init_thread_error; + } + target_thread_state = rseq_get_sched_state(rseq_get_abi()); + + pthread_create(&test_thread_id, NULL, test_thread, NULL); + + for (i = 0; i < 1000000000; i++) + rseq_barrier(); + //for (i = 0; i < 10000; i++) + // (void) poll(NULL, 0, 75); + + pthread_join(test_thread_id, NULL); + + if (rseq_unregister_current_thread()) { + fprintf(stderr, "Error: rseq_unregister_current_thread(...) 
failed(%d): %s\n", + errno, strerror(errno)); + goto init_thread_error; + } + return 0; + +init_thread_error: + return -1; +}

From patchwork Mon May 29 19:14:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 100387
From: Mathieu Desnoyers
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, "Paul E . McKenney",
 Boqun Feng, "H . Peter Anvin", Paul Turner, linux-api@vger.kernel.org,
 Christian Brauner, Florian Weimer, David.Laight@ACULAB.COM,
 carlos@redhat.com, Peter Oskolkov, Alexander Mikhalitsyn, Chris Kennelly,
 Ingo Molnar, Darren Hart, Davidlohr Bueso, André Almeida,
 libc-alpha@sourceware.org, Steven Rostedt, Jonathan Corbet,
 Noah Goldstein, Daniel Colascione, longman@redhat.com,
 Mathieu Desnoyers
Subject: [RFC PATCH v2 4/4] selftests/rseq: Implement rseq_mutex test program
Date: Mon, 29 May 2023 15:14:16 -0400
Message-Id: <20230529191416.53955-5-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
References: <20230529191416.53955-1-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Example use of the rseq sched state.
Signed-off-by: Mathieu Desnoyers
---
 tools/testing/selftests/rseq/.gitignore   |   1 +
 tools/testing/selftests/rseq/Makefile     |   3 +-
 tools/testing/selftests/rseq/rseq_mutex.c | 120 ++++++++++++++++++++++
 3 files changed, 123 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/rseq/rseq_mutex.c

diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore
index a8db9f7a7cec..38d5b2fe5905 100644
--- a/tools/testing/selftests/rseq/.gitignore
+++ b/tools/testing/selftests/rseq/.gitignore
@@ -10,3 +10,4 @@ param_test_mm_cid
 param_test_mm_cid_benchmark
 param_test_mm_cid_compare_twice
 sched_state_test
+rseq_mutex
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index 7c8f4f2be74c..a9d7ceb5b79b 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -14,7 +14,8 @@ OVERRIDE_TARGETS = 1
 
 TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \
 		param_test_benchmark param_test_compare_twice param_test_mm_cid \
-		param_test_mm_cid_benchmark param_test_mm_cid_compare_twice sched_state_test
+		param_test_mm_cid_benchmark param_test_mm_cid_compare_twice sched_state_test \
+		rseq_mutex
 
 TEST_GEN_PROGS_EXTENDED = librseq.so
 
diff --git a/tools/testing/selftests/rseq/rseq_mutex.c b/tools/testing/selftests/rseq/rseq_mutex.c
new file mode 100644
index 000000000000..01afd6a0e8bd
--- /dev/null
+++ b/tools/testing/selftests/rseq/rseq_mutex.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: LGPL-2.1
+
+#define _GNU_SOURCE
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "rseq.h"
+
+#define RSEQ_MUTEX_MAX_BUSY_LOOP	100
+
+struct rseq_mutex {
+	/*
+	 * When non-NULL, owner points to per-thread rseq_abi_sched_state of
+	 * owner thread.
+	 */
+	struct rseq_abi_sched_state *owner;
+};
+
+static struct rseq_mutex lock = { .owner = NULL };
+
+static int testvar;
+
+static void rseq_lock_slowpath(struct rseq_mutex *lock)
+{
+	int i = 0;
+
+	for (;;) {
+		struct rseq_abi_sched_state *expected = NULL, *self = rseq_get_sched_state(rseq_get_abi());
+
+		if (__atomic_compare_exchange_n(&lock->owner, &expected, self, false,
+						__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
+			break;
+		//TODO: use rseq critical section to protect dereference of owner thread's
+		//rseq_abi_sched_state, combined with rseq fence at thread reclaim.
+		if ((RSEQ_READ_ONCE(expected->state) & RSEQ_ABI_SCHED_STATE_FLAG_ON_CPU) &&
+		    i < RSEQ_MUTEX_MAX_BUSY_LOOP) {
+			rseq_barrier();			/* busy-wait, e.g. cpu_relax(). */
+			i++;
+		} else {
+			//TODO: Enqueue waiter in a wait-queue, and integrate
+			//with sys_futex rather than waiting for 10ms.
+			(void) poll(NULL, 0, 10);	/* wait 10ms */
+		}
+	}
+}
+
+static void rseq_lock(struct rseq_mutex *lock)
+{
+	struct rseq_abi_sched_state *expected = NULL, *self = rseq_get_sched_state(rseq_get_abi());
+
+	if (__atomic_compare_exchange_n(&lock->owner, &expected, self, false,
+					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
+		return;
+	rseq_lock_slowpath(lock);
+}
+
+static void rseq_unlock(struct rseq_mutex *lock)
+{
+	__atomic_store_n(&lock->owner, NULL, __ATOMIC_RELEASE);
+	//TODO: integrate with sys_futex and wakeup oldest waiter.
+}
+
+static
+void *test_thread(void *arg)
+{
+	int i;
+
+	if (rseq_register_current_thread()) {
+		fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n",
+			errno, strerror(errno));
+		abort();
+	}
+
+	for (i = 0; i < 1000; i++) {
+		int var;
+
+		rseq_lock(&lock);
+		var = RSEQ_READ_ONCE(testvar);
+		if (var) {
+			fprintf(stderr, "Unexpected value %d\n", var);
+			abort();
+		}
+		RSEQ_WRITE_ONCE(testvar, 1);
+		if (!(i % 10))
+			(void) poll(NULL, 0, 10);
+		else
+			rseq_barrier();
+		RSEQ_WRITE_ONCE(testvar, 0);
+		rseq_unlock(&lock);
+	}
+
+	if (rseq_unregister_current_thread()) {
+		fprintf(stderr, "Error: rseq_unregister_current_thread(...) failed(%d): %s\n",
+			errno, strerror(errno));
+		abort();
+	}
+	return NULL;
+}
+
+int main(int argc, char **argv)
+{
+	int nr_threads = 5;
+	pthread_t test_thread_id[nr_threads];
+	int i;
+
+	for (i = 0; i < nr_threads; i++) {
+		pthread_create(&test_thread_id[i], NULL, test_thread, NULL);
+	}
+
+	for (i = 0; i < nr_threads; i++) {
+		pthread_join(test_thread_id[i], NULL);
+	}
+
+	return 0;
+}
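
Note for readers without a kernel carrying this RFC: the spin-then-sleep policy of rseq_lock_slowpath() can be tried in plain user space with the rough sketch below. It is not the patch's code. struct demo_sched_state, DEMO_FLAG_ON_CPU, DEMO_MAX_BUSY_LOOP and the demo_* functions are hypothetical stand-ins invented for illustration, and the on-CPU flag is assumed to be maintained by the caller, whereas in the patch the kernel would update it. As in the patch, the waiter dereferences the owner's state without protection, so the state objects are assumed to outlive every use of the lock.

```c
#include <assert.h>
#include <poll.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the proposed rseq sched-state ABI. */
#define DEMO_FLAG_ON_CPU	(1U << 0)
#define DEMO_MAX_BUSY_LOOP	100

struct demo_sched_state {
	/* Assumption: the caller sets DEMO_FLAG_ON_CPU while the thread runs. */
	_Atomic unsigned int state;
};

struct demo_mutex {
	/* NULL when unlocked, otherwise the owner's scheduler state. */
	struct demo_sched_state *_Atomic owner;
};

/* Spin only while the owner is running and the spin budget is not spent. */
static bool demo_should_spin(struct demo_sched_state *owner, int spins)
{
	unsigned int state = atomic_load_explicit(&owner->state, memory_order_relaxed);

	return (state & DEMO_FLAG_ON_CPU) && spins < DEMO_MAX_BUSY_LOOP;
}

/* Returns NULL on success, otherwise the owner seen by the failed CAS. */
static struct demo_sched_state *demo_try_acquire(struct demo_mutex *m,
						 struct demo_sched_state *self)
{
	struct demo_sched_state *expected = NULL;

	if (atomic_compare_exchange_strong_explicit(&m->owner, &expected, self,
			memory_order_acquire, memory_order_relaxed))
		return NULL;
	return expected;
}

static void demo_lock(struct demo_mutex *m, struct demo_sched_state *self)
{
	struct demo_sched_state *owner;
	int spins = 0;

	while ((owner = demo_try_acquire(m, self)) != NULL) {
		if (demo_should_spin(owner, spins))
			spins++;			/* owner is running: retry at once */
		else
			(void) poll(NULL, 0, 10);	/* otherwise sleep for 10 ms */
	}
}

static void demo_unlock(struct demo_mutex *m)
{
	/* Sanity check: unlocking a mutex that nobody holds is a caller bug. */
	assert(atomic_load_explicit(&m->owner, memory_order_relaxed) != NULL);
	atomic_store_explicit(&m->owner, NULL, memory_order_release);
}
```

Each thread would own one struct demo_sched_state and pass it to demo_lock(). As in the patch, once the spin budget is spent the waiter keeps polling every 10 ms until the lock is released, and a futex-based wait queue would replace that poll() in a real implementation.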