From patchwork Thu Sep 21 10:45:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 142906
Message-Id: <20230921105247.828934099@noisy.programming.kicks-ass.net>
User-Agent: quilt/0.65
Date: Thu, 21 Sep 2023 12:45:09 +0200
From: peterz@infradead.org
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: [PATCH v3 04/15] futex: Validate futex value against futex size
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>
Content-Disposition: inline; filename=peterz-futex2-enforce-bits.patch
X-Mailing-List: linux-kernel@vger.kernel.org

Ensure the futex value fits in the given futex size. Since this adds a
constraint to an existing syscall, it might possibly change behaviour.

Currently the value would be truncated to a u32 and any high bits
would get silently lost.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
---
 kernel/futex/futex.h    |   10 ++++++++++
 kernel/futex/syscalls.c |    3 +++
 2 files changed, 13 insertions(+)

Index: linux-2.6/kernel/futex/futex.h
===================================================================
--- linux-2.6.orig/kernel/futex/futex.h
+++ linux-2.6/kernel/futex/futex.h
@@ -85,6 +85,16 @@ static inline bool futex_flags_valid(uns
 	return true;
 }
 
+static inline bool futex_validate_input(unsigned int flags, u64 val)
+{
+	int bits = 8 * futex_size(flags);
+
+	if (bits < 64 && (val >> bits))
+		return false;
+
+	return true;
+}
+
 #ifdef CONFIG_FAIL_FUTEX
 extern bool should_fail_futex(bool fshared);
 #else
Index: linux-2.6/kernel/futex/syscalls.c
===================================================================
--- linux-2.6.orig/kernel/futex/syscalls.c
+++ linux-2.6/kernel/futex/syscalls.c
@@ -209,6 +209,9 @@ static int futex_parse_waitv(struct fute
 		if (!futex_flags_valid(flags))
 			return -EINVAL;
 
+		if (!futex_validate_input(flags, aux.val))
+			return -EINVAL;
+
 		futexv[i].w.flags = flags;
 		futexv[i].w.val = aux.val;
 		futexv[i].w.uaddr = aux.uaddr;

From patchwork Thu Sep 21 10:45:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 143080
Message-Id: <20230921105247.936205525@noisy.programming.kicks-ass.net>
User-Agent: quilt/0.65
Date: Thu, 21 Sep 2023 12:45:10 +0200
From: peterz@infradead.org
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, Geert Uytterhoeven
Subject: [PATCH v3 05/15] futex: Add sys_futex_wake()
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>
Content-Disposition: inline; filename=peterz-futex2-wake.patch
X-Mailing-List: linux-kernel@vger.kernel.org

To complement sys_futex_waitv() add sys_futex_wake().
This syscall implements what was previously known as FUTEX_WAKE_BITSET
except it uses 'unsigned long' for the bitmask and takes FUTEX2 flags.

The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
Acked-by: Geert Uytterhoeven
---
 arch/alpha/kernel/syscalls/syscall.tbl      |    1
 arch/arm/tools/syscall.tbl                  |    1
 arch/arm64/include/asm/unistd.h             |    2 -
 arch/arm64/include/asm/unistd32.h           |    2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |    1
 arch/m68k/kernel/syscalls/syscall.tbl       |    1
 arch/microblaze/kernel/syscalls/syscall.tbl |    1
 arch/mips/kernel/syscalls/syscall_n32.tbl   |    1
 arch/mips/kernel/syscalls/syscall_n64.tbl   |    1
 arch/mips/kernel/syscalls/syscall_o32.tbl   |    1
 arch/parisc/kernel/syscalls/syscall.tbl     |    1
 arch/powerpc/kernel/syscalls/syscall.tbl    |    1
 arch/s390/kernel/syscalls/syscall.tbl       |    1
 arch/sh/kernel/syscalls/syscall.tbl         |    1
 arch/sparc/kernel/syscalls/syscall.tbl      |    1
 arch/x86/entry/syscalls/syscall_32.tbl      |    1
 arch/x86/entry/syscalls/syscall_64.tbl      |    1
 arch/xtensa/kernel/syscalls/syscall.tbl     |    1
 include/linux/syscalls.h                    |    3 ++
 include/uapi/asm-generic/unistd.h           |    5 ++--
 kernel/futex/syscalls.c                     |   33 ++++++++++++++++++++++++++++
 kernel/sys_ni.c                             |    1
 22 files changed, 59 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/alpha/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/alpha/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/alpha/kernel/syscalls/syscall.tbl
@@ -492,3 +492,4 @@
 560	common	set_mempolicy_home_node	sys_ni_syscall
 561	common	cachestat	sys_cachestat
 562	common	fchmodat2	sys_fchmodat2
+563	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/arm/tools/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/arm/tools/syscall.tbl
+++ linux-2.6/arch/arm/tools/syscall.tbl
@@ -466,3 +466,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/arm64/include/asm/unistd.h
===================================================================
--- linux-2.6.orig/arch/arm64/include/asm/unistd.h
+++ linux-2.6/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		453
+#define __NR_compat_syscalls		455
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
Index: linux-2.6/arch/arm64/include/asm/unistd32.h
===================================================================
--- linux-2.6.orig/arch/arm64/include/asm/unistd32.h
+++ linux-2.6/arch/arm64/include/asm/unistd32.h
@@ -911,6 +911,8 @@ __SYSCALL(__NR_set_mempolicy_home_node,
 __SYSCALL(__NR_cachestat, sys_cachestat)
 #define __NR_fchmodat2 452
 __SYSCALL(__NR_fchmodat2, sys_fchmodat2)
+#define __NR_futex_wake 454
+__SYSCALL(__NR_futex_wake, sys_futex_wake)
 
 /*
  * Please add new compat syscalls above this comment and update
Index: linux-2.6/arch/ia64/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/ia64/kernel/syscalls/syscall.tbl
@@ -373,3 +373,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/m68k/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/m68k/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/m68k/kernel/syscalls/syscall.tbl
@@ -452,3 +452,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/microblaze/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/microblaze/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -458,3 +458,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/mips/kernel/syscalls/syscall_n32.tbl
===================================================================
--- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ linux-2.6/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -391,3 +391,4 @@
 450	n32	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	n32	cachestat	sys_cachestat
 452	n32	fchmodat2	sys_fchmodat2
+454	n32	futex_wake	sys_futex_wake
Index: linux-2.6/arch/mips/kernel/syscalls/syscall_n64.tbl
===================================================================
--- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ linux-2.6/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -367,3 +367,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	n64	cachestat	sys_cachestat
 452	n64	fchmodat2	sys_fchmodat2
+454	n64	futex_wake	sys_futex_wake
Index: linux-2.6/arch/mips/kernel/syscalls/syscall_o32.tbl
===================================================================
--- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ linux-2.6/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -440,3 +440,4 @@
 450	o32	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	o32	cachestat	sys_cachestat
 452	o32	fchmodat2	sys_fchmodat2
+454	o32	futex_wake	sys_futex_wake
Index: linux-2.6/arch/parisc/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/parisc/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/parisc/kernel/syscalls/syscall.tbl
@@ -451,3 +451,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/powerpc/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -539,3 +539,4 @@
 450	nospu	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/s390/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/s390/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/s390/kernel/syscalls/syscall.tbl
@@ -455,3 +455,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake	sys_futex_wake
Index: linux-2.6/arch/sh/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/sh/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/sh/kernel/syscalls/syscall.tbl
@@ -455,3 +455,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/sparc/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/sparc/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/sparc/kernel/syscalls/syscall.tbl
@@ -498,3 +498,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/arch/x86/entry/syscalls/syscall_32.tbl
===================================================================
--- linux-2.6.orig/arch/x86/entry/syscalls/syscall_32.tbl
+++ linux-2.6/arch/x86/entry/syscalls/syscall_32.tbl
@@ -457,3 +457,4 @@
 450	i386	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	i386	cachestat	sys_cachestat
 452	i386	fchmodat2	sys_fchmodat2
+454	i386	futex_wake	sys_futex_wake
Index: linux-2.6/arch/x86/entry/syscalls/syscall_64.tbl
===================================================================
--- linux-2.6.orig/arch/x86/entry/syscalls/syscall_64.tbl
+++ linux-2.6/arch/x86/entry/syscalls/syscall_64.tbl
@@ -375,6 +375,7 @@
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
 453	64	map_shadow_stack	sys_map_shadow_stack
+454	common	futex_wake	sys_futex_wake
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
Index: linux-2.6/arch/xtensa/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/xtensa/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -423,3 +423,4 @@
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
+454	common	futex_wake	sys_futex_wake
Index: linux-2.6/include/linux/syscalls.h
===================================================================
--- linux-2.6.orig/include/linux/syscalls.h
+++ linux-2.6/include/linux/syscalls.h
@@ -549,6 +549,9 @@ asmlinkage long sys_set_robust_list(stru
 asmlinkage long sys_futex_waitv(struct futex_waitv *waiters,
 		      unsigned int nr_futexes, unsigned int flags,
 		      struct __kernel_timespec __user *timeout, clockid_t clockid);
+
+asmlinkage long sys_futex_wake(void __user *uaddr, unsigned long mask, int nr, unsigned int flags);
+
 asmlinkage long sys_nanosleep(struct __kernel_timespec __user *rqtp,
 			      struct __kernel_timespec __user *rmtp);
 asmlinkage long sys_nanosleep_time32(struct old_timespec32 __user *rqtp,
Index: linux-2.6/include/uapi/asm-generic/unistd.h
===================================================================
--- linux-2.6.orig/include/uapi/asm-generic/unistd.h
+++ linux-2.6/include/uapi/asm-generic/unistd.h
@@ -822,9 +822,11 @@ __SYSCALL(__NR_cachestat, sys_cachestat)
 
 #define __NR_fchmodat2 452
 __SYSCALL(__NR_fchmodat2, sys_fchmodat2)
+#define __NR_futex_wake 454
+__SYSCALL(__NR_futex_wake, sys_futex_wake)
 
 #undef __NR_syscalls
-#define __NR_syscalls 453
+#define __NR_syscalls 455
 
 /*
  * 32 bit systems traditionally used different
Index: linux-2.6/kernel/futex/syscalls.c
===================================================================
--- linux-2.6.orig/kernel/futex/syscalls.c
+++ linux-2.6/kernel/futex/syscalls.c
@@ -306,6 +306,36 @@ destroy_timer:
 	return ret;
 }
 
+/*
+ * sys_futex_wake - Wake a number of futexes
+ * @uaddr:	Address of the futex(es) to wake
+ * @mask:	bitmask
+ * @nr:		Number of the futexes to wake
+ * @flags:	FUTEX2 flags
+ *
+ * Identical to the traditional FUTEX_WAKE_BITSET op, except it is part of the
+ * futex2 family of calls.
+ */
+
+SYSCALL_DEFINE4(futex_wake,
+		void __user *, uaddr,
+		unsigned long, mask,
+		int, nr,
+		unsigned int, flags)
+{
+	if (flags & ~FUTEX2_VALID_MASK)
+		return -EINVAL;
+
+	flags = futex2_to_flags(flags);
+	if (!futex_flags_valid(flags))
+		return -EINVAL;
+
+	if (!futex_validate_input(flags, mask))
+		return -EINVAL;
+
+	return futex_wake(uaddr, flags, nr, mask);
+}
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE2(set_robust_list,
 		struct compat_robust_list_head __user *, head,
Index: linux-2.6/kernel/sys_ni.c
===================================================================
--- linux-2.6.orig/kernel/sys_ni.c
+++ linux-2.6/kernel/sys_ni.c
@@ -87,6 +87,7 @@ COND_SYSCALL_COMPAT(set_robust_list);
 COND_SYSCALL(get_robust_list);
 COND_SYSCALL_COMPAT(get_robust_list);
 COND_SYSCALL(futex_waitv);
+COND_SYSCALL(futex_wake);
 COND_SYSCALL(kexec_load);
 COND_SYSCALL_COMPAT(kexec_load);
 COND_SYSCALL(init_module);

From patchwork Thu Sep 21 10:45:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 143024
Message-Id: <20230921105248.164324363@noisy.programming.kicks-ass.net>
User-Agent: quilt/0.65
Date: Thu, 21 Sep 2023 12:45:12 +0200
From: peterz@infradead.org
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, Geert Uytterhoeven
Subject: [PATCH v3 07/15] futex: Add sys_futex_wait()
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>
Content-Disposition: inline; filename=peterz-futex2-wait.patch
X-Mailing-List: linux-kernel@vger.kernel.org

To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.

The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
Acked-by: Geert Uytterhoeven
---
 arch/alpha/kernel/syscalls/syscall.tbl      |    1
 arch/arm/tools/syscall.tbl                  |    1
 arch/arm64/include/asm/unistd.h             |    2
 arch/arm64/include/asm/unistd32.h           |    2
 arch/ia64/kernel/syscalls/syscall.tbl       |    1
 arch/m68k/kernel/syscalls/syscall.tbl       |    1
 arch/microblaze/kernel/syscalls/syscall.tbl |    1
 arch/mips/kernel/syscalls/syscall_n32.tbl   |    1
 arch/mips/kernel/syscalls/syscall_n64.tbl   |    1
 arch/mips/kernel/syscalls/syscall_o32.tbl   |    1
 arch/parisc/kernel/syscalls/syscall.tbl     |    1
 arch/powerpc/kernel/syscalls/syscall.tbl    |    1
 arch/s390/kernel/syscalls/syscall.tbl       |    1
 arch/sh/kernel/syscalls/syscall.tbl         |    1
 arch/sparc/kernel/syscalls/syscall.tbl      |    1
 arch/x86/entry/syscalls/syscall_32.tbl      |    1
 arch/x86/entry/syscalls/syscall_64.tbl      |    1
 arch/xtensa/kernel/syscalls/syscall.tbl     |    1
 include/linux/syscalls.h                    |    4
 include/uapi/asm-generic/unistd.h           |    4
 kernel/futex/futex.h                        |    3
 kernel/futex/syscalls.c                     |  120 +++++++++++++++++++++-------
 kernel/futex/waitwake.c                     |   67 +++++++++------
 kernel/sys_ni.c                             |    1
 24 files changed, 159 insertions(+), 60 deletions(-)

Index: linux-2.6/arch/alpha/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/alpha/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/alpha/kernel/syscalls/syscall.tbl
@@ -493,3 +493,4 @@
 561	common	cachestat	sys_cachestat
 562	common	fchmodat2	sys_fchmodat2
 563	common	futex_wake	sys_futex_wake
+564	common	futex_wait	sys_futex_wait
Index: linux-2.6/arch/arm/tools/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/arm/tools/syscall.tbl
+++ linux-2.6/arch/arm/tools/syscall.tbl
@@ -467,3 +467,4 @@
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
 454	common	futex_wake	sys_futex_wake
+455	common	futex_wait	sys_futex_wait
Index: linux-2.6/arch/arm64/include/asm/unistd.h
===================================================================
--- linux-2.6.orig/arch/arm64/include/asm/unistd.h
+++ linux-2.6/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		455
+#define __NR_compat_syscalls		456
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
Index: linux-2.6/arch/arm64/include/asm/unistd32.h
===================================================================
--- linux-2.6.orig/arch/arm64/include/asm/unistd32.h
+++ linux-2.6/arch/arm64/include/asm/unistd32.h
@@ -913,6 +913,8 @@ __SYSCALL(__NR_cachestat, sys_cachestat)
 __SYSCALL(__NR_fchmodat2, sys_fchmodat2)
 #define __NR_futex_wake 454
 __SYSCALL(__NR_futex_wake, sys_futex_wake)
+#define __NR_futex_wait 455
+__SYSCALL(__NR_futex_wait, sys_futex_wait)
 
 /*
  * Please add new compat syscalls above this comment and update
Index: linux-2.6/arch/ia64/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/ia64/kernel/syscalls/syscall.tbl
@@ -374,3 +374,4 @@
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
 454	common	futex_wake	sys_futex_wake
+455	common	futex_wait	sys_futex_wait
Index: linux-2.6/arch/m68k/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/m68k/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/m68k/kernel/syscalls/syscall.tbl
@@ -453,3 +453,4 @@
 451	common	cachestat	sys_cachestat
 452	common	fchmodat2	sys_fchmodat2
 454	common	futex_wake	sys_futex_wake
+455	common	futex_wait	sys_futex_wait
Index: linux-2.6/arch/microblaze/kernel/syscalls/syscall.tbl
===================================================================
--- linux-2.6.orig/arch/microblaze/kernel/syscalls/syscall.tbl
+++ linux-2.6/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -459,3 +459,4
@@ 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait Index: linux-2.6/arch/mips/kernel/syscalls/syscall_n32.tbl =================================================================== --- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_n32.tbl +++ linux-2.6/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -392,3 +392,4 @@ 451 n32 cachestat sys_cachestat 452 n32 fchmodat2 sys_fchmodat2 454 n32 futex_wake sys_futex_wake +455 n32 futex_wait sys_futex_wait Index: linux-2.6/arch/mips/kernel/syscalls/syscall_n64.tbl =================================================================== --- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_n64.tbl +++ linux-2.6/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -368,3 +368,4 @@ 451 n64 cachestat sys_cachestat 452 n64 fchmodat2 sys_fchmodat2 454 n64 futex_wake sys_futex_wake +455 n64 futex_wait sys_futex_wait Index: linux-2.6/arch/mips/kernel/syscalls/syscall_o32.tbl =================================================================== --- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_o32.tbl +++ linux-2.6/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -441,3 +441,4 @@ 451 o32 cachestat sys_cachestat 452 o32 fchmodat2 sys_fchmodat2 454 o32 futex_wake sys_futex_wake +455 o32 futex_wait sys_futex_wait Index: linux-2.6/arch/parisc/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/parisc/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/parisc/kernel/syscalls/syscall.tbl @@ -452,3 +452,4 @@ 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait Index: linux-2.6/arch/powerpc/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/powerpc/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/powerpc/kernel/syscalls/syscall.tbl @@ -540,3 +540,4 
@@ 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait Index: linux-2.6/arch/s390/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/s390/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/s390/kernel/syscalls/syscall.tbl @@ -456,3 +456,4 @@ 451 common cachestat sys_cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait sys_futex_wait Index: linux-2.6/arch/sh/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/sh/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/sh/kernel/syscalls/syscall.tbl @@ -456,3 +456,4 @@ 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait Index: linux-2.6/arch/sparc/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/sparc/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/sparc/kernel/syscalls/syscall.tbl @@ -499,3 +499,4 @@ 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait Index: linux-2.6/arch/x86/entry/syscalls/syscall_32.tbl =================================================================== --- linux-2.6.orig/arch/x86/entry/syscalls/syscall_32.tbl +++ linux-2.6/arch/x86/entry/syscalls/syscall_32.tbl @@ -458,3 +458,4 @@ 451 i386 cachestat sys_cachestat 452 i386 fchmodat2 sys_fchmodat2 454 i386 futex_wake sys_futex_wake +455 i386 futex_wait sys_futex_wait Index: linux-2.6/arch/x86/entry/syscalls/syscall_64.tbl =================================================================== --- linux-2.6.orig/arch/x86/entry/syscalls/syscall_64.tbl +++ 
linux-2.6/arch/x86/entry/syscalls/syscall_64.tbl @@ -376,6 +376,7 @@ 452 common fchmodat2 sys_fchmodat2 453 64 map_shadow_stack sys_map_shadow_stack 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait # # Due to a historical design error, certain syscalls are numbered differently Index: linux-2.6/arch/xtensa/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/xtensa/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/xtensa/kernel/syscalls/syscall.tbl @@ -424,3 +424,4 @@ 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake +455 common futex_wait sys_futex_wait Index: linux-2.6/include/linux/syscalls.h =================================================================== --- linux-2.6.orig/include/linux/syscalls.h +++ linux-2.6/include/linux/syscalls.h @@ -552,6 +552,10 @@ asmlinkage long sys_futex_waitv(struct f asmlinkage long sys_futex_wake(void __user *uaddr, unsigned long mask, int nr, unsigned int flags); +asmlinkage long sys_futex_wait(void __user *uaddr, unsigned long val, unsigned long mask, + unsigned int flags, struct __kernel_timespec __user *timespec, + clockid_t clockid); + asmlinkage long sys_nanosleep(struct __kernel_timespec __user *rqtp, struct __kernel_timespec __user *rmtp); asmlinkage long sys_nanosleep_time32(struct old_timespec32 __user *rqtp, Index: linux-2.6/include/uapi/asm-generic/unistd.h =================================================================== --- linux-2.6.orig/include/uapi/asm-generic/unistd.h +++ linux-2.6/include/uapi/asm-generic/unistd.h @@ -824,9 +824,11 @@ __SYSCALL(__NR_cachestat, sys_cachestat) __SYSCALL(__NR_fchmodat2, sys_fchmodat2) #define __NR_futex_wake 454 __SYSCALL(__NR_futex_wake, sys_futex_wake) +#define __NR_futex_wait 455 +__SYSCALL(__NR_futex_wait, sys_futex_wait) #undef __NR_syscalls -#define __NR_syscalls 455 +#define __NR_syscalls 456 /* * 32 bit systems traditionally 
used different Index: linux-2.6/kernel/futex/futex.h =================================================================== --- linux-2.6.orig/kernel/futex/futex.h +++ linux-2.6/kernel/futex/futex.h @@ -332,6 +332,9 @@ extern int futex_requeue(u32 __user *uad u32 __user *uaddr2, int nr_wake, int nr_requeue, u32 *cmpval, int requeue_pi); +extern int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, + struct hrtimer_sleeper *to, u32 bitset); + extern int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset); Index: linux-2.6/kernel/futex/syscalls.c =================================================================== --- linux-2.6.orig/kernel/futex/syscalls.c +++ linux-2.6/kernel/futex/syscalls.c @@ -221,6 +221,46 @@ static int futex_parse_waitv(struct fute return 0; } +static int futex2_setup_timeout(struct __kernel_timespec __user *timeout, + clockid_t clockid, struct hrtimer_sleeper *to) +{ + int flag_clkid = 0, flag_init = 0; + struct timespec64 ts; + ktime_t time; + int ret; + + if (!timeout) + return 0; + + if (clockid == CLOCK_REALTIME) { + flag_clkid = FLAGS_CLOCKRT; + flag_init = FUTEX_CLOCK_REALTIME; + } + + if (clockid != CLOCK_REALTIME && clockid != CLOCK_MONOTONIC) + return -EINVAL; + + if (get_timespec64(&ts, timeout)) + return -EFAULT; + + /* + * Since there's no opcode for futex_waitv, use + * FUTEX_WAIT_BITSET that uses absolute timeout as well + */ + ret = futex_init_timeout(FUTEX_WAIT_BITSET, flag_init, &ts, &time); + if (ret) + return ret; + + futex_setup_timer(&time, to, flag_clkid, 0); + return 0; +} + +static inline void futex2_destroy_timeout(struct hrtimer_sleeper *to) +{ + hrtimer_cancel(&to->timer); + destroy_hrtimer_on_stack(&to->timer); +} + /** * sys_futex_waitv - Wait on a list of futexes * @waiters: List of futexes to wait on @@ -250,8 +290,6 @@ SYSCALL_DEFINE5(futex_waitv, struct fute { struct hrtimer_sleeper to; struct futex_vector *futexv; - struct timespec64 ts; - ktime_t time; int ret; /* 
This syscall supports no flags for now */ @@ -261,30 +299,8 @@ SYSCALL_DEFINE5(futex_waitv, struct fute if (!nr_futexes || nr_futexes > FUTEX_WAITV_MAX || !waiters) return -EINVAL; - if (timeout) { - int flag_clkid = 0, flag_init = 0; - - if (clockid == CLOCK_REALTIME) { - flag_clkid = FLAGS_CLOCKRT; - flag_init = FUTEX_CLOCK_REALTIME; - } - - if (clockid != CLOCK_REALTIME && clockid != CLOCK_MONOTONIC) - return -EINVAL; - - if (get_timespec64(&ts, timeout)) - return -EFAULT; - - /* - * Since there's no opcode for futex_waitv, use - * FUTEX_WAIT_BITSET that uses absolute timeout as well - */ - ret = futex_init_timeout(FUTEX_WAIT_BITSET, flag_init, &ts, &time); - if (ret) - return ret; - - futex_setup_timer(&time, &to, flag_clkid, 0); - } + if (timeout && (ret = futex2_setup_timeout(timeout, clockid, &to))) + return ret; futexv = kcalloc(nr_futexes, sizeof(*futexv), GFP_KERNEL); if (!futexv) { @@ -299,10 +315,8 @@ SYSCALL_DEFINE5(futex_waitv, struct fute kfree(futexv); destroy_timer: - if (timeout) { - hrtimer_cancel(&to.timer); - destroy_hrtimer_on_stack(&to.timer); - } + if (timeout) + futex2_destroy_timeout(&to); return ret; } @@ -336,6 +350,52 @@ SYSCALL_DEFINE4(futex_wake, return futex_wake(uaddr, FLAGS_STRICT | flags, nr, mask); } +/* + * sys_futex_wait - Wait on a futex + * @uaddr: Address of the futex to wait on + * @val: Value of @uaddr + * @mask: bitmask + * @flags: FUTEX2 flags + * @timeout: Optional absolute timeout + * @clockid: Clock to be used for the timeout, realtime or monotonic + * + * Identical to the traditional FUTEX_WAIT_BITSET op, except it is part of the + * futex2 family of calls.
+ */ + +SYSCALL_DEFINE6(futex_wait, + void __user *, uaddr, + unsigned long, val, + unsigned long, mask, + unsigned int, flags, + struct __kernel_timespec __user *, timeout, + clockid_t, clockid) +{ + struct hrtimer_sleeper to; + int ret; + + if (flags & ~FUTEX2_VALID_MASK) + return -EINVAL; + + flags = futex2_to_flags(flags); + if (!futex_flags_valid(flags)) + return -EINVAL; + + if (!futex_validate_input(flags, val) || + !futex_validate_input(flags, mask)) + return -EINVAL; + + if (timeout && (ret = futex2_setup_timeout(timeout, clockid, &to))) + return ret; + + ret = __futex_wait(uaddr, flags, val, timeout ? &to : NULL, mask); + + if (timeout) + futex2_destroy_timeout(&to); + + return ret; +} + #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(set_robust_list, struct compat_robust_list_head __user *, head, Index: linux-2.6/kernel/futex/waitwake.c =================================================================== --- linux-2.6.orig/kernel/futex/waitwake.c +++ linux-2.6/kernel/futex/waitwake.c @@ -632,20 +632,18 @@ retry_private: return ret; } -int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset) +int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, + struct hrtimer_sleeper *to, u32 bitset) { - struct hrtimer_sleeper timeout, *to; - struct restart_block *restart; - struct futex_hash_bucket *hb; struct futex_q q = futex_q_init; + struct futex_hash_bucket *hb; int ret; if (!bitset) return -EINVAL; + q.bitset = bitset; - to = futex_setup_timer(abs_time, &timeout, flags, - current->timer_slack_ns); retry: /* * Prepare to wait on uaddr. On success, it holds hb->lock and q @@ -653,18 +651,17 @@ retry: */ ret = futex_wait_setup(uaddr, val, flags, &q, &hb); if (ret) - goto out; + return ret; /* futex_queue and wait for wakeup, timeout, or a signal. */ futex_wait_queue(hb, &q, to); /* If we were woken (and unqueued), we succeeded, whatever. 
*/ - ret = 0; if (!futex_unqueue(&q)) - goto out; - ret = -ETIMEDOUT; + return 0; + if (to && !to->task) - goto out; + return -ETIMEDOUT; /* * We expect signal_pending(current), but we might be the @@ -673,24 +670,38 @@ retry: if (!signal_pending(current)) goto retry; - ret = -ERESTARTSYS; - if (!abs_time) - goto out; - - restart = &current->restart_block; - restart->futex.uaddr = uaddr; - restart->futex.val = val; - restart->futex.time = *abs_time; - restart->futex.bitset = bitset; - restart->futex.flags = flags | FLAGS_HAS_TIMEOUT; - - ret = set_restart_fn(restart, futex_wait_restart); - -out: - if (to) { - hrtimer_cancel(&to->timer); - destroy_hrtimer_on_stack(&to->timer); + return -ERESTARTSYS; +} + +int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset) +{ + struct hrtimer_sleeper timeout, *to; + struct restart_block *restart; + int ret; + + to = futex_setup_timer(abs_time, &timeout, flags, + current->timer_slack_ns); + + ret = __futex_wait(uaddr, flags, val, to, bitset); + + /* No timeout, nothing to clean up.
*/ + if (!to) + return ret; + + hrtimer_cancel(&to->timer); + destroy_hrtimer_on_stack(&to->timer); + + if (ret == -ERESTARTSYS) { + restart = &current->restart_block; + restart->futex.uaddr = uaddr; + restart->futex.val = val; + restart->futex.time = *abs_time; + restart->futex.bitset = bitset; + restart->futex.flags = flags | FLAGS_HAS_TIMEOUT; + + return set_restart_fn(restart, futex_wait_restart); } + return ret; } Index: linux-2.6/kernel/sys_ni.c =================================================================== --- linux-2.6.orig/kernel/sys_ni.c +++ linux-2.6/kernel/sys_ni.c @@ -88,6 +88,7 @@ COND_SYSCALL(get_robust_list); COND_SYSCALL_COMPAT(get_robust_list); COND_SYSCALL(futex_waitv); COND_SYSCALL(futex_wake); +COND_SYSCALL(futex_wait); COND_SYSCALL(kexec_load); COND_SYSCALL_COMPAT(kexec_load); COND_SYSCALL(init_module); From patchwork Thu Sep 21 10:45:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 143148 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp5299959vqi; Thu, 21 Sep 2023 20:42:12 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGpBLm+/z9I2/toDAqGMeOt4Ofa8ZgvPS4NBauzKeF/KOr3/IZ9PpsBMV7MQQueEbYwKPp+ X-Received: by 2002:a05:6870:5247:b0:1c8:d72a:d6ba with SMTP id o7-20020a056870524700b001c8d72ad6bamr8195707oai.45.1695354132563; Thu, 21 Sep 2023 20:42:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695354132; cv=none; d=google.com; s=arc-20160816; b=HRzvFIQ+8eJH4pbk9hnRTrH9g3PNOQhDFY8WOXpbVZm8RnctXz2DikP6o0H38zMsDP eppMstnmNlFDy6MM8wYFs8zA1UinPgBsb43xG903ppe2vPWgUiyRZuqm/RoJe1aER1uj l90uu/toTNViQ/DxsY1ZwgRF4nD2P4KpdTBi74DhKHiNY/ZsgpajI/wvVBwDanYlev0x SkHCSOJ6okqv/D9Fw9Ne79LECch2rjmHM+swlm5HOTso8/kT2Cma7kKLqy6oxPz1CpX6 bmgys/oIB+1jC4BLENR0sK6W1LG4bgOSOJKAc26ZLbke/SSJ2PqRQDWsJ/a6KSwr6+r1 F+Dg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
lindbergh.monkeyblade.net ([23.128.96.19]:40054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229934AbjIUURO (ORCPT ); Thu, 21 Sep 2023 16:17:14 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 835F1580A7; Thu, 21 Sep 2023 10:19:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=fjIRU4T5oPILBU1zZ/wxS7qhPtCXJYxpFVPOgrLAAA0=; b=eBZezOFZ4ppQ3c2LZZFjuO/bD8 21whfvNIaYL8MbYoA09p6cZMY0xa2zNuISVyyt31TCZQMlOL2jutHZQw5DV0UbkivdgvXS3N35Ecn vIbEoK/h0Be5VogafdT0PutUojkgclQdYFKDradWPI5BjIUJG7gRPS3JbBmbctKHbDK2ojQ9dxGgC 1iteJJ8zhfr+VbGovdVx3usUWCyGC//ajRFV+vQXW/HxStnQ1v8g3S1XR5LaOZrhqhx5Oa0CtmlSl qCksmgG5lRBLRCaymIvIwWUXwbSOxPlUU0xV5NxOIlLp8T4G/ARIoWsYIypPd/BkX2MbZ33POQ8YU PDpUBtGg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1qjHQN-00FJvw-1Z; Thu, 21 Sep 2023 11:00:49 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 0) id 0ABA13005F4; Thu, 21 Sep 2023 13:00:43 +0200 (CEST) Message-Id: <20230921105248.396780136@noisy.programming.kicks-ass.net> User-Agent: quilt/0.65 Date: Thu, 21 Sep 2023 12:45:14 +0200 From: peterz@infradead.org To: tglx@linutronix.de, axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, Andrew Morton , urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann , linux-api@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, malteskarupke@web.de Subject: [PATCH v3 09/15] futex: Add flags2 argument to 
futex_requeue() References: <20230921104505.717750284@noisy.programming.kicks-ass.net> MIME-Version: 1.0 Content-Disposition: inline; filename=peterz-futex2-requeue-flags.patch X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Thu, 21 Sep 2023 13:17:30 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777707655121573214 X-GMAIL-MSGID: 1777707655121573214 In order to support mixed size requeue, add a second flags argument to the internal futex_requeue() function. No functional change intended. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Thomas Gleixner --- kernel/futex/futex.h | 5 +++-- kernel/futex/requeue.c | 12 +++++++----- kernel/futex/syscalls.c | 6 +++--- 3 files changed, 13 insertions(+), 10 deletions(-) Index: linux-2.6/kernel/futex/futex.h =================================================================== --- linux-2.6.orig/kernel/futex/futex.h +++ linux-2.6/kernel/futex/futex.h @@ -328,8 +328,9 @@ extern int futex_wait_requeue_pi(u32 __u val, ktime_t *abs_time, u32 bitset, u32 __user *uaddr2); -extern int futex_requeue(u32 __user *uaddr1, unsigned int flags, - u32 __user *uaddr2, int nr_wake, int nr_requeue, +extern int futex_requeue(u32 __user *uaddr1, unsigned int flags1, + u32 __user *uaddr2, unsigned int flags2, + int nr_wake, int nr_requeue, u32 *cmpval, int requeue_pi); extern int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, Index: linux-2.6/kernel/futex/requeue.c =================================================================== --- linux-2.6.orig/kernel/futex/requeue.c +++ 
linux-2.6/kernel/futex/requeue.c @@ -346,8 +346,9 @@ futex_proxy_trylock_atomic(u32 __user *p /** * futex_requeue() - Requeue waiters from uaddr1 to uaddr2 * @uaddr1: source futex user address - * @flags: futex flags (FLAGS_SHARED, etc.) + * @flags1: futex flags (FLAGS_SHARED, etc.) * @uaddr2: target futex user address + * @flags2: futex flags (FLAGS_SHARED, etc.) * @nr_wake: number of waiters to wake (must be 1 for requeue_pi) * @nr_requeue: number of waiters to requeue (0-INT_MAX) * @cmpval: @uaddr1 expected value (or %NULL) @@ -361,7 +362,8 @@ futex_proxy_trylock_atomic(u32 __user *p * - >=0 - on success, the number of tasks requeued or woken; * - <0 - on error */ -int futex_requeue(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2, +int futex_requeue(u32 __user *uaddr1, unsigned int flags1, + u32 __user *uaddr2, unsigned int flags2, int nr_wake, int nr_requeue, u32 *cmpval, int requeue_pi) { union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT; @@ -424,10 +426,10 @@ int futex_requeue(u32 __user *uaddr1, un } retry: - ret = get_futex_key(uaddr1, flags, &key1, FUTEX_READ); + ret = get_futex_key(uaddr1, flags1, &key1, FUTEX_READ); if (unlikely(ret != 0)) return ret; - ret = get_futex_key(uaddr2, flags, &key2, + ret = get_futex_key(uaddr2, flags2, &key2, requeue_pi ? 
FUTEX_WRITE : FUTEX_READ); if (unlikely(ret != 0)) return ret; @@ -459,7 +461,7 @@ retry_private: if (ret) return ret; - if (!(flags & FLAGS_SHARED)) + if (!(flags1 & FLAGS_SHARED)) goto retry_private; goto retry; Index: linux-2.6/kernel/futex/syscalls.c =================================================================== --- linux-2.6.orig/kernel/futex/syscalls.c +++ linux-2.6/kernel/futex/syscalls.c @@ -106,9 +106,9 @@ long do_futex(u32 __user *uaddr, int op, case FUTEX_WAKE_BITSET: return futex_wake(uaddr, flags, val, val3); case FUTEX_REQUEUE: - return futex_requeue(uaddr, flags, uaddr2, val, val2, NULL, 0); + return futex_requeue(uaddr, flags, uaddr2, flags, val, val2, NULL, 0); case FUTEX_CMP_REQUEUE: - return futex_requeue(uaddr, flags, uaddr2, val, val2, &val3, 0); + return futex_requeue(uaddr, flags, uaddr2, flags, val, val2, &val3, 0); case FUTEX_WAKE_OP: return futex_wake_op(uaddr, flags, uaddr2, val, val2, val3); case FUTEX_LOCK_PI: @@ -125,7 +125,7 @@ long do_futex(u32 __user *uaddr, int op, return futex_wait_requeue_pi(uaddr, flags, val, timeout, val3, uaddr2); case FUTEX_CMP_REQUEUE_PI: - return futex_requeue(uaddr, flags, uaddr2, val, val2, &val3, 1); + return futex_requeue(uaddr, flags, uaddr2, flags, val, val2, &val3, 1); } return -ENOSYS; } From patchwork Thu Sep 21 10:45:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 142910 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp5137623vqi; Thu, 21 Sep 2023 14:04:38 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE482qON8tTGD4Ac6GFvUBIq7VSKnVFpJ1AI/fjegxM71ISZATM68Je0b0c/Zwz7VmvopZB X-Received: by 2002:a05:6a20:d90b:b0:15c:c99d:ba73 with SMTP id jd11-20020a056a20d90b00b0015cc99dba73mr5615895pzb.18.1695330277684; Thu, 21 Sep 2023 14:04:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695330277; cv=none; d=google.com; s=arc-20160816; 
[2620:137:e000::3:3]) by mx.google.com with ESMTPS id g7-20020a17090a4b0700b002639fe8ff6bsi4401449pjh.44.2023.09.21.14.04.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 Sep 2023 14:04:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:3 as permitted sender) client-ip=2620:137:e000::3:3; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=HEloi5ch; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:3 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by lipwig.vger.email (Postfix) with ESMTP id 80ECA829EA30; Thu, 21 Sep 2023 12:31:32 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at lipwig.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230228AbjIUTbH (ORCPT + 28 others); Thu, 21 Sep 2023 15:31:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43726 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229674AbjIUTaP (ORCPT ); Thu, 21 Sep 2023 15:30:15 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D326D580BB; Thu, 21 Sep 2023 10:19:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=f+fo5qtlWKdma6yR33ByQC30fFdCan4IPhCDTk6tKSM=; b=HEloi5chIkoAzJ1qmH9doIMMLS ZkNSD4N2RkkbqgCewVxvAHggRdWZnFRlWj4IhTeRhItbbFVErm62/CTGDqCmyjsnOp4ZuteeOoFdF yIEIu8bnwVXPVpe6KPzotIKluKJ0e+mnwy062T62ceHSKleg0fzg6/b1LE8m/A4OlmlXuCwJfqVcV 
xHMQu5W5PZqXS6brx0IevaeO27f1kU1m0gjva+SDK5AWHjcO2nWlE/J9+3FVfPKfua0pNdfZVpV3Y v4a1PegmFDKYJ5+tOvwtgzugAgCj5s7ib09/wtVoAKIabLs7TIX8znLpx9OXnGZG8oI1jFkcJns8j 3bUiJ+NA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1qjHQN-00FJvu-1M; Thu, 21 Sep 2023 11:00:57 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 0) id 10D1E300642; Thu, 21 Sep 2023 13:00:43 +0200 (CEST) Message-Id: <20230921105248.511860556@noisy.programming.kicks-ass.net> User-Agent: quilt/0.65 Date: Thu, 21 Sep 2023 12:45:15 +0200 From: peterz@infradead.org To: tglx@linutronix.de, axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, Andrew Morton , urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann , linux-api@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, malteskarupke@web.de, Geert Uytterhoeven Subject: [PATCH v3 10/15] futex: Add sys_futex_requeue() References: <20230921104505.717750284@noisy.programming.kicks-ass.net> MIME-Version: 1.0 Content-Disposition: inline; filename=peterz-futex2-requeue.patch X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Thu, 21 Sep 2023 12:31:32 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777682641305119090 X-GMAIL-MSGID: 1777682641305119090 Finish off the 'simple' futex2 syscall group by adding sys_futex_requeue(). 
Unlike sys_futex_{wait,wake}() its arguments are too numerous to fit into a regular syscall. As such, use struct futex_waitv to pass the 'source' and 'destination' futexes to the syscall. This syscall implements what was previously known as FUTEX_CMP_REQUEUE and uses {val, uaddr, flags} for source and {uaddr, flags} for destination. This design explicitly allows requeueing between different types of futex by having a different flags word per uaddr. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Thomas Gleixner Acked-by: Geert Uytterhoeven --- arch/alpha/kernel/syscalls/syscall.tbl | 1 arch/arm/tools/syscall.tbl | 1 arch/arm64/include/asm/unistd.h | 2 - arch/arm64/include/asm/unistd32.h | 2 + arch/ia64/kernel/syscalls/syscall.tbl | 1 arch/m68k/kernel/syscalls/syscall.tbl | 1 arch/microblaze/kernel/syscalls/syscall.tbl | 1 arch/mips/kernel/syscalls/syscall_n32.tbl | 1 arch/mips/kernel/syscalls/syscall_n64.tbl | 1 arch/mips/kernel/syscalls/syscall_o32.tbl | 1 arch/parisc/kernel/syscalls/syscall.tbl | 1 arch/powerpc/kernel/syscalls/syscall.tbl | 1 arch/s390/kernel/syscalls/syscall.tbl | 1 arch/sh/kernel/syscalls/syscall.tbl | 1 arch/sparc/kernel/syscalls/syscall.tbl | 1 arch/x86/entry/syscalls/syscall_32.tbl | 1 arch/x86/entry/syscalls/syscall_64.tbl | 1 arch/xtensa/kernel/syscalls/syscall.tbl | 1 include/linux/syscalls.h | 3 ++ include/uapi/asm-generic/unistd.h | 4 ++ kernel/futex/syscalls.c | 38 ++++++++++++++++++++++++++++ kernel/sys_ni.c | 1 22 files changed, 64 insertions(+), 2 deletions(-) Index: linux-2.6/arch/alpha/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/alpha/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/alpha/kernel/syscalls/syscall.tbl @@ -494,3 +494,4 @@ 562 common fchmodat2 sys_fchmodat2 563 common futex_wake sys_futex_wake 564 common futex_wait sys_futex_wait +565 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/arm/tools/syscall.tbl 
=================================================================== --- linux-2.6.orig/arch/arm/tools/syscall.tbl +++ linux-2.6/arch/arm/tools/syscall.tbl @@ -468,3 +468,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/arm64/include/asm/unistd.h =================================================================== --- linux-2.6.orig/arch/arm64/include/asm/unistd.h +++ linux-2.6/arch/arm64/include/asm/unistd.h @@ -39,7 +39,7 @@ #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 456 +#define __NR_compat_syscalls 457 #endif #define __ARCH_WANT_SYS_CLONE Index: linux-2.6/arch/arm64/include/asm/unistd32.h =================================================================== --- linux-2.6.orig/arch/arm64/include/asm/unistd32.h +++ linux-2.6/arch/arm64/include/asm/unistd32.h @@ -915,6 +915,8 @@ __SYSCALL(__NR_fchmodat2, sys_fchmodat2) __SYSCALL(__NR_futex_wake, sys_futex_wake) #define __NR_futex_wait 455 __SYSCALL(__NR_futex_wait, sys_futex_wait) +#define __NR_futex_requeue 456 +__SYSCALL(__NR_futex_requeue, sys_futex_requeue) /* * Please add new compat syscalls above this comment and update Index: linux-2.6/arch/ia64/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/ia64/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/ia64/kernel/syscalls/syscall.tbl @@ -375,3 +375,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/m68k/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/m68k/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/m68k/kernel/syscalls/syscall.tbl @@ -454,3 +454,4 @@ 452 common 
fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/microblaze/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/microblaze/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/microblaze/kernel/syscalls/syscall.tbl @@ -460,3 +460,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/mips/kernel/syscalls/syscall_n32.tbl =================================================================== --- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_n32.tbl +++ linux-2.6/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -393,3 +393,4 @@ 452 n32 fchmodat2 sys_fchmodat2 454 n32 futex_wake sys_futex_wake 455 n32 futex_wait sys_futex_wait +456 n32 futex_requeue sys_futex_requeue Index: linux-2.6/arch/mips/kernel/syscalls/syscall_n64.tbl =================================================================== --- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_n64.tbl +++ linux-2.6/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -369,3 +369,4 @@ 452 n64 fchmodat2 sys_fchmodat2 454 n64 futex_wake sys_futex_wake 455 n64 futex_wait sys_futex_wait +456 n64 futex_requeue sys_futex_requeue Index: linux-2.6/arch/mips/kernel/syscalls/syscall_o32.tbl =================================================================== --- linux-2.6.orig/arch/mips/kernel/syscalls/syscall_o32.tbl +++ linux-2.6/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -442,3 +442,4 @@ 452 o32 fchmodat2 sys_fchmodat2 454 o32 futex_wake sys_futex_wake 455 o32 futex_wait sys_futex_wait +456 o32 futex_requeue sys_futex_requeue Index: linux-2.6/arch/parisc/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/parisc/kernel/syscalls/syscall.tbl +++ 
linux-2.6/arch/parisc/kernel/syscalls/syscall.tbl @@ -453,3 +453,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/powerpc/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/powerpc/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/powerpc/kernel/syscalls/syscall.tbl @@ -541,3 +541,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/s390/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/s390/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/s390/kernel/syscalls/syscall.tbl @@ -457,3 +457,4 @@ 452 common fchmodat2 sys_fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue sys_futex_requeue Index: linux-2.6/arch/sh/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/sh/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/sh/kernel/syscalls/syscall.tbl @@ -457,3 +457,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/sparc/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/sparc/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/sparc/kernel/syscalls/syscall.tbl @@ -500,3 +500,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/arch/x86/entry/syscalls/syscall_32.tbl 
=================================================================== --- linux-2.6.orig/arch/x86/entry/syscalls/syscall_32.tbl +++ linux-2.6/arch/x86/entry/syscalls/syscall_32.tbl @@ -459,3 +459,4 @@ 452 i386 fchmodat2 sys_fchmodat2 454 i386 futex_wake sys_futex_wake 455 i386 futex_wait sys_futex_wait +456 i386 futex_requeue sys_futex_requeue Index: linux-2.6/arch/x86/entry/syscalls/syscall_64.tbl =================================================================== --- linux-2.6.orig/arch/x86/entry/syscalls/syscall_64.tbl +++ linux-2.6/arch/x86/entry/syscalls/syscall_64.tbl @@ -377,6 +377,7 @@ 453 64 map_shadow_stack sys_map_shadow_stack 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue # # Due to a historical design error, certain syscalls are numbered differently Index: linux-2.6/arch/xtensa/kernel/syscalls/syscall.tbl =================================================================== --- linux-2.6.orig/arch/xtensa/kernel/syscalls/syscall.tbl +++ linux-2.6/arch/xtensa/kernel/syscalls/syscall.tbl @@ -425,3 +425,4 @@ 452 common fchmodat2 sys_fchmodat2 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait +456 common futex_requeue sys_futex_requeue Index: linux-2.6/include/linux/syscalls.h =================================================================== --- linux-2.6.orig/include/linux/syscalls.h +++ linux-2.6/include/linux/syscalls.h @@ -556,6 +556,9 @@ asmlinkage long sys_futex_wait(void __us unsigned int flags, struct __kernel_timespec __user *timespec, clockid_t clockid); +asmlinkage long sys_futex_requeue(struct futex_waitv __user *waiters, + unsigned int flags, int nr_wake, int nr_requeue); + asmlinkage long sys_nanosleep(struct __kernel_timespec __user *rqtp, struct __kernel_timespec __user *rmtp); asmlinkage long sys_nanosleep_time32(struct old_timespec32 __user *rqtp, Index: linux-2.6/include/uapi/asm-generic/unistd.h 
=================================================================== --- linux-2.6.orig/include/uapi/asm-generic/unistd.h +++ linux-2.6/include/uapi/asm-generic/unistd.h @@ -826,9 +826,11 @@ __SYSCALL(__NR_fchmodat2, sys_fchmodat2) __SYSCALL(__NR_futex_wake, sys_futex_wake) #define __NR_futex_wait 455 __SYSCALL(__NR_futex_wait, sys_futex_wait) +#define __NR_futex_requeue 456 +__SYSCALL(__NR_futex_requeue, sys_futex_requeue) #undef __NR_syscalls -#define __NR_syscalls 456 +#define __NR_syscalls 457 /* * 32 bit systems traditionally used different Index: linux-2.6/kernel/futex/syscalls.c =================================================================== --- linux-2.6.orig/kernel/futex/syscalls.c +++ linux-2.6/kernel/futex/syscalls.c @@ -396,6 +396,44 @@ SYSCALL_DEFINE6(futex_wait, return ret; } +/* + * sys_futex_requeue - Requeue a waiter from one futex to another + * @waiters: array describing the source and destination futex + * @flags: unused + * @nr_wake: number of futexes to wake + * @nr_requeue: number of futexes to requeue + * + * Identical to the traditional FUTEX_CMP_REQUEUE op, except it is part of the + * futex2 family of calls. 
+ */ + +SYSCALL_DEFINE4(futex_requeue, + struct futex_waitv __user *, waiters, + unsigned int, flags, + int, nr_wake, + int, nr_requeue) +{ + struct futex_vector futexes[2]; + u32 cmpval; + int ret; + + if (flags) + return -EINVAL; + + if (!waiters) + return -EINVAL; + + ret = futex_parse_waitv(futexes, waiters, 2); + if (ret) + return ret; + + cmpval = futexes[0].w.val; + + return futex_requeue(u64_to_user_ptr(futexes[0].w.uaddr), futexes[0].w.flags, + u64_to_user_ptr(futexes[1].w.uaddr), futexes[1].w.flags, + nr_wake, nr_requeue, &cmpval, 0); +} + #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(set_robust_list, struct compat_robust_list_head __user *, head, Index: linux-2.6/kernel/sys_ni.c =================================================================== --- linux-2.6.orig/kernel/sys_ni.c +++ linux-2.6/kernel/sys_ni.c @@ -89,6 +89,7 @@ COND_SYSCALL_COMPAT(get_robust_list); COND_SYSCALL(futex_waitv); COND_SYSCALL(futex_wake); COND_SYSCALL(futex_wait); +COND_SYSCALL(futex_requeue); COND_SYSCALL(kexec_load); COND_SYSCALL_COMPAT(kexec_load); COND_SYSCALL(init_module); From patchwork Thu Sep 21 10:45:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 142931 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp5154779vqi; Thu, 21 Sep 2023 14:39:56 -0700 (PDT) X-Google-Smtp-Source: AGHT+IH6DAF1z+UrjWGr2ku3teNGB5f2WTNkKKy3wti/ioYLoG9SuDEbV0RLIu0jigzeG6AC/c3e X-Received: by 2002:a17:902:e743:b0:1c1:f0b4:f68f with SMTP id p3-20020a170902e74300b001c1f0b4f68fmr1440215plf.10.1695332395948; Thu, 21 Sep 2023 14:39:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695332395; cv=none; d=google.com; s=arc-20160816; b=K4zRUQmCOURHFoelUXiBll1/fr6sgg61vUiXCGMToUc107A46E05vEqk0aFyZ+Zdlj p4L8FBnKXKxtQmceTNOBh7GhmD43wbxUe06/WeKj4Y+9NpqvPgrik6phTrqh7sRdjItP +FLmgQOFKeHp6a4NqafEKPoXm2lVIJloof9Rft7yYLqaGdhhQLhp48oUaYGxnZJRcWJ/ 
7n4lN1tedUZ6XwaskkZw69xmVjgo82ywU454u1oK2Vnljmsze09ZBJLv+3j636yCIt9x iFfrDkDBBTxoHzb/xk+BQA1TNX2cnBs6FBvaozLVh8gaXhbJhAbGwvuU2H3ShavsJz1a MdEA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-disposition:mime-version:references :subject:cc:to:from:date:user-agent:message-id:dkim-signature; bh=+WIdohk1HL2nzrHveE43px5IkTjGhV/7JBkH2s5C174=; fh=HgsJ7WFcUFn0TdD0zoEp/rYZgetpe2J5VKUV82PkH34=; b=PyzOUuZd8OdkNIWLqgKYcPqM+xPOZEjyWfEZnvWhP46rptbnyhN1C/s3CUJhK6n/Fg aHbgOZluYgiZKgWL3kflLFoqz87Y/+mlZGOo/139UyuLvkQ3EyjTGfIXiK/SeLPpCXEE 3MizAfPBfJoa0jyIWyAg0m4ggE5WKSATakr1Ma2ZfESkNLEElUfwQgH8y0ScgNCuuuPV dsxSOr4WJpfupFNFPbiy3TdgyxK8SiVAaCGeOXHYYYET4o5bYgkJWFQeCBQTHPdrbTuL OB72GzrVdI+1oDgf21geMwPzNKaYrASTBEjz4bO5R4tugnMcHYClzIg+CRWXlMJs4HDS 2SVQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=ViAEDbVG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from snail.vger.email (snail.vger.email. 
[23.128.96.37]) by mx.google.com with ESMTPS id jx9-20020a170903138900b001c3aed2db5dsi2184834plb.409.2023.09.21.14.39.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 Sep 2023 14:39:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=ViAEDbVG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id E3D0082A38F0; Thu, 21 Sep 2023 11:12:26 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230238AbjIUSLs (ORCPT + 29 others); Thu, 21 Sep 2023 14:11:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48284 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229734AbjIUSLP (ORCPT ); Thu, 21 Sep 2023 14:11:15 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1D64580B1; Thu, 21 Sep 2023 10:19:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=+WIdohk1HL2nzrHveE43px5IkTjGhV/7JBkH2s5C174=; b=ViAEDbVGOlyxSg3ebF8KxwAziD ZJXxCIy42X12v4Mep2vVxljFUC4bpie4jrjZc5b7y0aR/AFBwBx8QQf7IyrWb6in0/ZjL0xzIPnSx pb6UQGfcsxlZIGeiFtjBChv2JNIQwRXxcHOt6/36MAnZlaHZe1+0fz5A4w/a/h9G+i1YiZoOjLNE4 
OwpOyR+wNSWxm0UFxB37sVSQEBBvFtN6ftl6Pn5yJIMtEP7rAmx4bdY6aw54FXq2amDzF8WCOnnaP MIOPlPb7NIN0RqyDlWXr9vkr9PkW1Y4MN8g0z/WF3CPS10TmZ2L6+1wJRYHnIB47xz4TyxFPPPHSd HzrBon7A==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1qjHQN-00FJvv-1Z; Thu, 21 Sep 2023 11:00:59 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 0) id 15CCD30067A; Thu, 21 Sep 2023 13:00:43 +0200 (CEST) Message-Id: <20230921105248.683656626@noisy.programming.kicks-ass.net> User-Agent: quilt/0.65 Date: Thu, 21 Sep 2023 12:45:16 +0200 From: peterz@infradead.org To: tglx@linutronix.de, axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, Andrew Morton , urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann , linux-api@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, malteskarupke@web.de, Christoph Hellwig Subject: [PATCH v3 11/15] mm: Add vmalloc_huge_node() References: <20230921104505.717750284@noisy.programming.kicks-ass.net> MIME-Version: 1.0 Content-Disposition: inline; filename=peterz-vmalloc_huge_node.patch X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 21 Sep 2023 11:12:26 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777684862439051607 X-GMAIL-MSGID: 1777684862439051607 To enable node specific hash-tables. 
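Illustration only, not part of this patch: a userspace sketch of what "node specific hash-tables" could look like, assuming one table per node and dense node ids. calloc() stands in for a node-local allocation such as vmalloc_huge_node(); struct bucket, struct node_tables, tables_init() and bucket_for() are hypothetical names. The sizing and node-selection arithmetic follows the FUTEX2_NUMA patch later in this series, minus its handling of sparse node masks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash bucket; not the kernel's struct. */
struct bucket {
	unsigned int waiters;
};

/* Hypothetical container: one bucket array per node. */
struct node_tables {
	unsigned int nr_nodes;
	unsigned long hashsize;		/* buckets per node, a power of two */
	unsigned int hashshift;		/* log2(hashsize) */
	struct bucket **queues;		/* queues[node][bucket] */
};

static unsigned long roundup_pow2(unsigned long v)
{
	unsigned long r = 1;

	while (r < v)
		r <<= 1;
	return r;
}

static unsigned int log2_floor(unsigned long v)
{
	unsigned int shift = 0;

	while (v >>= 1)
		shift++;
	return shift;
}

/*
 * Size each table as 256 buckets per CPU, split over the nodes and
 * rounded up to a power of two. Returns 0 on success, -1 on failure
 * (a failed sketch run simply leaks what it allocated so far).
 */
static int tables_init(struct node_tables *t, unsigned int nr_cpus,
		       unsigned int nr_nodes)
{
	unsigned int n;

	if (!nr_cpus || !nr_nodes)
		return -1;

	t->nr_nodes = nr_nodes;
	t->hashsize = roundup_pow2(256UL * nr_cpus / nr_nodes);
	t->hashshift = log2_floor(t->hashsize);
	t->queues = calloc(nr_nodes, sizeof(*t->queues));
	if (!t->queues)
		return -1;

	for (n = 0; n < nr_nodes; n++) {
		/* Assumption: calloc() replaces the node-local allocator. */
		t->queues[n] = calloc(t->hashsize, sizeof(struct bucket));
		if (!t->queues[n])
			return -1;
	}
	return 0;
}

/*
 * Pick a bucket. A negative node means "no node given": the hash bits
 * above the bucket index then choose the node, so such entries are
 * spread over all tables.
 */
static struct bucket *bucket_for(struct node_tables *t, uint32_t hash, int node)
{
	if (node < 0)
		node = (int)((hash >> t->hashshift) % t->nr_nodes);
	assert((unsigned int)node < t->nr_nodes);
	return &t->queues[node][hash & (t->hashsize - 1)];
}
```

With 8 CPUs and 2 nodes this gives 1024 buckets per node; a hash of 0xc05 without a node lands in node 1, bucket 5, while the same hash with node 0 requested lands in node 0, bucket 5.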
Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Christoph Hellwig --- include/linux/vmalloc.h | 1 + mm/vmalloc.c | 7 +++++++ 2 files changed, 8 insertions(+) Index: linux-2.6/include/linux/vmalloc.h =================================================================== --- linux-2.6.orig/include/linux/vmalloc.h +++ linux-2.6/include/linux/vmalloc.h @@ -152,6 +152,7 @@ extern void *__vmalloc_node_range(unsign void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask, int node, const void *caller) __alloc_size(1); void *vmalloc_huge(unsigned long size, gfp_t gfp_mask) __alloc_size(1); +void *vmalloc_huge_node(unsigned long size, gfp_t gfp_mask, int node) __alloc_size(1); extern void *__vmalloc_array(size_t n, size_t size, gfp_t flags) __alloc_size(1, 2); extern void *vmalloc_array(size_t n, size_t size) __alloc_size(1, 2); Index: linux-2.6/mm/vmalloc.c =================================================================== --- linux-2.6.orig/mm/vmalloc.c +++ linux-2.6/mm/vmalloc.c @@ -3420,6 +3420,13 @@ void *vmalloc(unsigned long size) } EXPORT_SYMBOL(vmalloc); +void *vmalloc_huge_node(unsigned long size, gfp_t gfp_mask, int node) +{ + return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, + gfp_mask, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP, + node, __builtin_return_address(0)); +} + /** * vmalloc_huge - allocate virtually contiguous memory, allow huge pages * @size: allocation size From patchwork Thu Sep 21 10:45:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 143164 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp5314222vqi; Thu, 21 Sep 2023 21:21:19 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHqadOaPrlgBiSOSQVnbsZmnQXb9mFFB+YRBXaIHYF7XfljaDK7HUYJqUg7tC0Nc4850JIk X-Received: by 2002:a17:902:d70f:b0:1be:384:7b29 with SMTP id 
w15-20020a170902d70f00b001be03847b29mr7294496ply.34.1695356478688; Thu, 21 Sep 2023 21:21:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695356478; cv=none; d=google.com; s=arc-20160816; b=ciV6U/Z0yJUkYyFXogp9JEHfpQbTOrA890XEsX4Lsnr3MY/9yl7Cv0rpAX+nLXEdkL lVdfpZxOsq/2Zpyrv8PSeEJV3vjamoZ7MbSNwL3rteifTObxljy06BCEsGbdOfzt6ITP AzXdr5ews1Me5DVrbd47v9edIqpLHcymqB8iR9teltB2xaIhjj1Mt/H6/LFmqYfgHMkC nF9XEXaJj0CVhR5cf4BvA28pyYxq75ORTT3wp3HNloe4QKqAwNW7q5nThpXAWfgqCwF8 voKpSmC8b6AYPthaK4R+MjBlHe9hWruXUhiOQZp7/7bmXcYdDSb7IRPsr5NBWMafjaes Z2Bg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-disposition:mime-version:references :subject:cc:to:from:date:user-agent:message-id:dkim-signature; bh=Q23LA4fNtxuPKBmxlCap1S+ePSWCcBWqnwP5S12UL2A=; fh=gKrHpzx5aMrklt5E8Fmce1sSUBKxYyMxbBSs7w4zkwQ=; b=x1UFiBcPj5CpbogC4Jv4PcCJnY/Wu5rBFx2x6UvLDxclu8Q5hpqnqxmPhStfRo1swc jkeccnunnLGpz6MHYOyQ09DMbeiH5VLRx9uL3zVn7rNhjqJYqV24WvN2xY/huR5LMGG0 Ne4PilgOZj1Yaq0FWfva4xfViglqKvJoab62i4Dwc2Mf/vO0RDwI1mjgppKF/T4XFXGD AYqOejGQJmjzGCWpoJuE+v4noZsdykty2/JXHZb7sixkVwDrvvpmFj/lhVRQ6zJdQ4wn 7rwnkVlBIPIDfv2kjLil5KyptbEiNj00Iv9AwpuiFPS7Wc5UbeR/zXmSsn8SauWQ1/A8 ZDvw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=fC10oRh8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from howler.vger.email (howler.vger.email. 
[23.128.96.34]) by mx.google.com with ESMTPS id kg3-20020a170903060300b0019c354055d0si2738475plb.304.2023.09.21.21.21.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 Sep 2023 21:21:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) client-ip=23.128.96.34; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=fC10oRh8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id DDC96848B338; Thu, 21 Sep 2023 14:34:44 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232163AbjIUVep (ORCPT + 29 others); Thu, 21 Sep 2023 17:34:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229978AbjIUVeC (ORCPT ); Thu, 21 Sep 2023 17:34:02 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5CB14585E7; Thu, 21 Sep 2023 10:19:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=Q23LA4fNtxuPKBmxlCap1S+ePSWCcBWqnwP5S12UL2A=; b=fC10oRh8rfYFZppt9c16iBpKl6 tkQE8L0dalhfkmy47nxsgnYzuJc2FTXNyLp6Rd0ZVmI/KuXMWQpYLjRD+U+fbnmEeJ9gaFyMm1bnb aT0bT1FWUhEoTBablLBTbDrqaF7h+KpyUsKbRO9bRqOlj/qsOceCkEiLP7zh0EbXUzHT4FQy45hya 
fmEHq61ylgZ0WVZqE3slZONq67sTfxiJPZsPO+eqcx/GQXgQn37LOKWcwtWwjtqB7ENRwaZMaLKGM IYs4YpR7md3q/nWDWIwlAUgK/TXufUhO82cfMxRUll9dQ0evfrrvgxDvbYnkIPPZ657PVpOhfuk4O Soz1laqg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1qjHQN-00FJvx-1Y; Thu, 21 Sep 2023 11:00:49 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 0) id 1BDC230067B; Thu, 21 Sep 2023 13:00:43 +0200 (CEST) Message-Id: <20230921105248.852663217@noisy.programming.kicks-ass.net> User-Agent: quilt/0.65 Date: Thu, 21 Sep 2023 12:45:17 +0200 From: peterz@infradead.org To: tglx@linutronix.de, axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, Andrew Morton , urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann , linux-api@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, malteskarupke@web.de Subject: [PATCH v3 12/15] futex: Implement FUTEX2_NUMA References: <20230921104505.717750284@noisy.programming.kicks-ass.net> MIME-Version: 1.0 Content-Disposition: inline; filename=peterz-futex2-numa.patch X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Thu, 21 Sep 2023 14:34:44 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777710114813737256 X-GMAIL-MSGID: 1777710114813737256 Extend the futex2 interface to be numa aware. 
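Illustration only, not part of this patch: a userspace sketch of the two-word layout and the checks that the rest of this changelog and the diff describe. struct futex_numa_32 is taken from the changelog; SKETCH_NO_NODE, numa_futex_aligned(), numa_node_addr(), node_ids_fit() and resolve_node() are hypothetical names, and the real code operates on user pointers with __get_user()/__put_user() rather than on plain memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the "node is set to ~0" marker. */
#define SKETCH_NO_NODE ((uint32_t)~0u)

/* Layout from the changelog: value word followed by node word. */
struct futex_numa_32 {
	uint32_t val;
	uint32_t node;
};

/* A NUMA futex must be naturally aligned to twice the value size. */
static bool numa_futex_aligned(uintptr_t addr, unsigned int size)
{
	return (addr % (2u * size)) == 0;
}

/* The node word sits directly after the value word. */
static uintptr_t numa_node_addr(uintptr_t addr, unsigned int size)
{
	return addr + size;
}

/*
 * A futex word of 'size' bytes (1, 2 or 4 here) must be able to hold
 * every valid node id and the no-node marker.
 */
static bool node_ids_fit(unsigned int size, uint64_t nr_node_ids)
{
	uint64_t max = ~0ULL >> (64 - 8 * size);

	return nr_node_ids < max;
}

/*
 * WAIT side: if no node was given, record the current node in the
 * futex so that a later WAKE looks in the same place.
 */
static uint32_t resolve_node(struct futex_numa_32 *f, uint32_t current_node)
{
	if (f->node == SKETCH_NO_NODE)
		f->node = current_node;
	return f->node;
}
```

Once resolve_node() has written a node id, later calls return that id regardless of the caller's current node, which is why a corrupted node word makes the waiter unfindable.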
When FUTEX2_NUMA is specified for a futex, the user value is extended to two words (of the same size). The first is the user value we all know, the second one will be the node to place this futex on. struct futex_numa_32 { u32 val; u32 node; }; When node is set to ~0, WAIT will set it to the current node_id such that WAKE knows where to find it. If userspace corrupts the node value between WAIT and WAKE, the futex will not be found and no wakeup will happen. When FUTEX2_NUMA is not set, the node is simply an extension of the hash, such that traditional futexes are still interleaved over the nodes. This is done to avoid having to have a separate !numa hash-table. Signed-off-by: Peter Zijlstra (Intel) --- include/linux/futex.h | 3 + kernel/futex/core.c | 129 +++++++++++++++++++++++++++++++++++++++--------- kernel/futex/futex.h | 25 +++++++-- kernel/futex/syscalls.c | 2 4 files changed, 128 insertions(+), 31 deletions(-) Index: linux-2.6/include/linux/futex.h =================================================================== --- linux-2.6.orig/include/linux/futex.h +++ linux-2.6/include/linux/futex.h @@ -34,6 +34,7 @@ union futex_key { u64 i_seq; unsigned long pgoff; unsigned int offset; + /* unsigned int node; */ } shared; struct { union { @@ -42,11 +43,13 @@ union futex_key { }; unsigned long address; unsigned int offset; + /* unsigned int node; */ } private; struct { u64 ptr; unsigned long word; unsigned int offset; + unsigned int node; /* NOT hashed! */ } both; }; Index: linux-2.6/kernel/futex/core.c =================================================================== --- linux-2.6.orig/kernel/futex/core.c +++ linux-2.6/kernel/futex/core.c @@ -34,7 +34,8 @@ #include #include #include -#include +#include +#include #include #include @@ -47,12 +48,14 @@ * reside in the same cacheline.
*/ static struct { - struct futex_hash_bucket *queues; unsigned long hashsize; + unsigned int hashshift; + struct futex_hash_bucket *queues[MAX_NUMNODES]; } __futex_data __read_mostly __aligned(2*sizeof(long)); -#define futex_queues (__futex_data.queues) -#define futex_hashsize (__futex_data.hashsize) +#define futex_hashsize (__futex_data.hashsize) +#define futex_hashshift (__futex_data.hashshift) +#define futex_queues (__futex_data.queues) /* * Fault injections for futexes. @@ -105,6 +108,26 @@ late_initcall(fail_futex_debugfs); #endif /* CONFIG_FAIL_FUTEX */ +static int futex_get_value(u32 *val, u32 __user *from, unsigned int flags) +{ + switch (futex_size(flags)) { + case 1: return __get_user(*val, (u8 __user *)from); + case 2: return __get_user(*val, (u16 __user *)from); + case 4: return __get_user(*val, (u32 __user *)from); + default: BUG(); + } +} + +static int futex_put_value(u32 val, u32 __user *to, unsigned int flags) +{ + switch (futex_size(flags)) { + case 1: return __put_user(val, (u8 __user *)to); + case 2: return __put_user(val, (u16 __user *)to); + case 4: return __put_user(val, (u32 __user *)to); + default: BUG(); + } +} + /** * futex_hash - Return the hash bucket in the global hash * @key: Pointer to the futex key for which the hash is calculated @@ -114,10 +137,29 @@ late_initcall(fail_futex_debugfs); */ struct futex_hash_bucket *futex_hash(union futex_key *key) { - u32 hash = jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4, + u32 hash = jhash2((u32 *)key, + offsetof(typeof(*key), both.offset) / sizeof(u32), key->both.offset); + int node = key->both.node; + + if (node == FUTEX_NO_NODE) { + /* + * In case of !FLAGS_NUMA, use some unused hash bits to pick a + * node -- this ensures regular futexes are interleaved across + * the nodes and avoids having to allocate multiple + * hash-tables. + * + * NOTE: this isn't perfectly uniform, but it is fast and + * handles sparse node masks. 
+ */ + node = (hash >> futex_hashshift) % nr_node_ids; + if (!node_possible(node)) { + node = find_next_bit_wrap(node_possible_map.bits, + nr_node_ids, node); + } + } - return &futex_queues[hash & (futex_hashsize - 1)]; + return &futex_queues[node][hash & (futex_hashsize - 1)]; } @@ -217,7 +259,7 @@ static u64 get_inode_sequence_number(str * * lock_page() might sleep, the caller should not hold a spinlock. */ -int get_futex_key(u32 __user *uaddr, unsigned int flags, union futex_key *key, +int get_futex_key(void __user *uaddr, unsigned int flags, union futex_key *key, enum futex_access rw) { unsigned long address = (unsigned long)uaddr; @@ -225,25 +267,49 @@ int get_futex_key(u32 __user *uaddr, uns struct page *page; struct folio *folio; struct address_space *mapping; - int err, ro = 0; + int node, err, size, ro = 0; bool fshared; fshared = flags & FLAGS_SHARED; + size = futex_size(flags); + if (flags & FLAGS_NUMA) + size *= 2; /* * The futex address must be "naturally" aligned. */ key->both.offset = address % PAGE_SIZE; - if (unlikely((address % sizeof(u32)) != 0)) + if (unlikely((address % size) != 0)) return -EINVAL; address -= key->both.offset; - if (unlikely(!access_ok(uaddr, sizeof(u32)))) + if (unlikely(!access_ok(uaddr, size))) return -EFAULT; if (unlikely(should_fail_futex(fshared))) return -EFAULT; + if (flags & FLAGS_NUMA) { + void __user *naddr = uaddr + size / 2; + + if (futex_get_value(&node, naddr, flags)) + return -EFAULT; + + if (node == FUTEX_NO_NODE) { + node = numa_node_id(); + if (futex_put_value(node, naddr, flags)) + return -EFAULT; + + } else if (node >= MAX_NUMNODES || !node_possible(node)) { + return -EINVAL; + } + + key->both.node = node; + + } else { + key->both.node = FUTEX_NO_NODE; + } + /* * PROCESS_PRIVATE futexes are fast. 
* As the mm cannot disappear under us and the 'key' only needs @@ -1124,26 +1190,42 @@ void futex_exit_release(struct task_stru static int __init futex_init(void) { - unsigned int futex_shift; - unsigned long i; + unsigned int order, n; + unsigned long size, i; #if CONFIG_BASE_SMALL futex_hashsize = 16; #else - futex_hashsize = roundup_pow_of_two(256 * num_possible_cpus()); + futex_hashsize = 256 * num_possible_cpus(); + futex_hashsize /= num_possible_nodes(); + futex_hashsize = roundup_pow_of_two(futex_hashsize); #endif + futex_hashshift = ilog2(futex_hashsize); + size = sizeof(struct futex_hash_bucket) * futex_hashsize; + order = get_order(size); + + for_each_node(n) { + struct futex_hash_bucket *table; + + if (order > MAX_ORDER) + table = vmalloc_huge_node(size, GFP_KERNEL, n); + else + table = alloc_pages_exact_nid(n, size, GFP_KERNEL); + + BUG_ON(!table); + + for (i = 0; i < futex_hashsize; i++) { + atomic_set(&table[i].waiters, 0); + spin_lock_init(&table[i].lock); + plist_head_init(&table[i].chain); + } - futex_queues = alloc_large_system_hash("futex", sizeof(*futex_queues), - futex_hashsize, 0, 0, - &futex_shift, NULL, - futex_hashsize, futex_hashsize); - futex_hashsize = 1UL << futex_shift; - - for (i = 0; i < futex_hashsize; i++) { - atomic_set(&futex_queues[i].waiters, 0); - plist_head_init(&futex_queues[i].chain); - spin_lock_init(&futex_queues[i].lock); + futex_queues[n] = table; } + pr_info("futex hash table, %d nodes, %ld entries (order: %d, %lu bytes)\n", + num_possible_nodes(), + futex_hashsize, order, + sizeof(struct futex_hash_bucket) * futex_hashsize); return 0; } Index: linux-2.6/kernel/futex/futex.h =================================================================== --- linux-2.6.orig/kernel/futex/futex.h +++ linux-2.6/kernel/futex/futex.h @@ -83,6 +83,19 @@ static inline bool futex_flags_valid(uns if ((flags & FLAGS_SIZE_MASK) != FLAGS_SIZE_32) return false; + /* + * Must be able to represent both FUTEX_NO_NODE and every valid nodeid + * in a 
futex word. + */ + if (flags & FLAGS_NUMA) { + int bits = 8 * futex_size(flags); + u64 max = ~0ULL; + + max >>= 64 - bits; + if (nr_node_ids >= max) + return false; + } + return true; } @@ -184,7 +197,7 @@ enum futex_access { FUTEX_WRITE }; -extern int get_futex_key(u32 __user *uaddr, unsigned int flags, union futex_key *key, +extern int get_futex_key(void __user *uaddr, unsigned int flags, union futex_key *key, enum futex_access rw); extern struct hrtimer_sleeper * Index: linux-2.6/kernel/futex/syscalls.c =================================================================== --- linux-2.6.orig/kernel/futex/syscalls.c +++ linux-2.6/kernel/futex/syscalls.c @@ -179,7 +179,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uad return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3); } -#define FUTEX2_VALID_MASK (FUTEX2_SIZE_MASK | FUTEX2_PRIVATE) +#define FUTEX2_VALID_MASK (FUTEX2_SIZE_MASK | FUTEX2_NUMA | FUTEX2_PRIVATE) /** * futex_parse_waitv - Parse a waitv array from userspace Index: linux-2.6/include/uapi/linux/futex.h =================================================================== --- linux-2.6.orig/include/uapi/linux/futex.h +++ linux-2.6/include/uapi/linux/futex.h @@ -74,6 +74,14 @@ /* do not use */ #define FUTEX_32 FUTEX2_SIZE_U32 /* historical accident :-( */ + +/* + * When FUTEX2_NUMA doubles the futex word, the second word is a node value. + * The special value -1 indicates no-node. This is the same value as + * NUMA_NO_NODE, except that value is not ABI, this is. 
+ */
+#define FUTEX_NO_NODE	(-1)
+
 /*
  * Max numbers of elements in a futex_waitv array
  */

From patchwork Thu Sep 21 10:45:18 2023
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 143099
Message-Id: <20230921105249.002168440@noisy.programming.kicks-ass.net>
Date: Thu, 21 Sep 2023 12:45:18 +0200
From: peterz@infradead.org
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: [PATCH v3 13/15] futex: Propagate flags into futex_get_value_locked()
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>

In order to facilitate variable sized futexes propagate the flags into
futex_get_value_locked().

No functional change intended.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
---
 kernel/futex/core.c     |    4 ++--
 kernel/futex/futex.h    |    2 +-
 kernel/futex/pi.c       |    8 ++++----
 kernel/futex/requeue.c  |    4 ++--
 kernel/futex/waitwake.c |    4 ++--
 5 files changed, 11 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/futex/core.c
===================================================================
--- linux-2.6.orig/kernel/futex/core.c
+++ linux-2.6/kernel/futex/core.c
@@ -516,12 +516,12 @@ int futex_cmpxchg_value_locked(u32 *curv
 	return ret;
 }
 
-int futex_get_value_locked(u32 *dest, u32 __user *from)
+int futex_get_value_locked(u32 *dest, u32 __user *from, unsigned int flags)
 {
 	int ret;
 
 	pagefault_disable();
-	ret = __get_user(*dest, from);
+	ret = futex_get_value(dest, from, flags);
 	pagefault_enable();
 
 	return ret ? -EFAULT : 0;
Index: linux-2.6/kernel/futex/futex.h
===================================================================
--- linux-2.6.orig/kernel/futex/futex.h
+++ linux-2.6/kernel/futex/futex.h
@@ -229,7 +229,7 @@ extern void futex_wake_mark(struct wake_
 
 extern int fault_in_user_writeable(u32 __user *uaddr);
 extern int futex_cmpxchg_value_locked(u32 *curval, u32 __user *uaddr, u32 uval, u32 newval);
-extern int futex_get_value_locked(u32 *dest, u32 __user *from);
+extern int futex_get_value_locked(u32 *dest, u32 __user *from, unsigned int flags);
 extern struct futex_q *futex_top_waiter(struct futex_hash_bucket *hb, union futex_key *key);
 
 extern void __futex_unqueue(struct futex_q *q);
Index: linux-2.6/kernel/futex/pi.c
===================================================================
--- linux-2.6.orig/kernel/futex/pi.c
+++ linux-2.6/kernel/futex/pi.c
@@ -240,7 +240,7 @@ static int attach_to_pi_state(u32 __user
 	 * still is what we expect it to be, otherwise retry the entire
 	 * operation.
 	 */
-	if (futex_get_value_locked(&uval2, uaddr))
+	if (futex_get_value_locked(&uval2, uaddr, FLAGS_SIZE_32))
 		goto out_efault;
 
 	if (uval != uval2)
@@ -359,7 +359,7 @@ static int handle_exit_race(u32 __user *
 	 * The same logic applies to the case where the exiting task is
 	 * already gone.
 	 */
-	if (futex_get_value_locked(&uval2, uaddr))
+	if (futex_get_value_locked(&uval2, uaddr, FLAGS_SIZE_32))
 		return -EFAULT;
 
 	/* If the user space value has changed, try again. */
@@ -527,7 +527,7 @@ int futex_lock_pi_atomic(u32 __user *uad
 	 * Read the user space value first so we can validate a few
 	 * things before proceeding further.
 	 */
-	if (futex_get_value_locked(&uval, uaddr))
+	if (futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32))
 		return -EFAULT;
 
 	if (unlikely(should_fail_futex(true)))
@@ -750,7 +750,7 @@ retry:
 	if (!pi_state->owner)
 		newtid |= FUTEX_OWNER_DIED;
 
-	err = futex_get_value_locked(&uval, uaddr);
+	err = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
 	if (err)
 		goto handle_err;
 
Index: linux-2.6/kernel/futex/requeue.c
===================================================================
--- linux-2.6.orig/kernel/futex/requeue.c
+++ linux-2.6/kernel/futex/requeue.c
@@ -273,7 +273,7 @@ futex_proxy_trylock_atomic(u32 __user *p
 	u32 curval;
 	int ret;
 
-	if (futex_get_value_locked(&curval, pifutex))
+	if (futex_get_value_locked(&curval, pifutex, FLAGS_SIZE_32))
 		return -EFAULT;
 
 	if (unlikely(should_fail_futex(true)))
@@ -451,7 +451,7 @@ retry_private:
 	if (likely(cmpval != NULL)) {
 		u32 curval;
 
-		ret = futex_get_value_locked(&curval, uaddr1);
+		ret = futex_get_value_locked(&curval, uaddr1, FLAGS_SIZE_32);
 
 		if (unlikely(ret)) {
 			double_unlock_hb(hb1, hb2);
Index: linux-2.6/kernel/futex/waitwake.c
===================================================================
--- linux-2.6.orig/kernel/futex/waitwake.c
+++ linux-2.6/kernel/futex/waitwake.c
@@ -441,7 +441,7 @@ retry:
 		u32 val = vs[i].w.val;
 
 		hb = futex_q_lock(q);
-		ret = futex_get_value_locked(&uval, uaddr);
+		ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
 
 		if (!ret && uval == val) {
 			/*
@@ -609,7 +609,7 @@ retry:
 retry_private:
 	*hb = futex_q_lock(q);
 
-	ret = futex_get_value_locked(&uval, uaddr);
+	ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
 
 	if (ret) {
 		futex_q_unlock(*hb);

From patchwork Thu Sep 21 10:45:19 2023
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 143003
Message-Id: <20230921105249.108410391@noisy.programming.kicks-ass.net>
Date: Thu, 21 Sep 2023 12:45:19 +0200
From: peterz@infradead.org
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: [PATCH v3 14/15] futex: Enable FUTEX2_{8,16}
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>

When futexes are no longer u32 aligned, the lower offset bits are no longer available to put type info in.
However, since offset is the offset within a page, there are plenty of
bits available on the top end.

After that, pass flags into futex_get_value_locked() for WAIT and
disallow FUTEX2_SIZE_U64 instead of mandating FUTEX2_SIZE_U32.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
---
 include/linux/futex.h   |   11 ++++++-----
 kernel/futex/core.c     |    9 +++++++++
 kernel/futex/futex.h    |    4 ++--
 kernel/futex/waitwake.c |    5 +++--
 4 files changed, 20 insertions(+), 9 deletions(-)

Index: linux-2.6/include/linux/futex.h
===================================================================
--- linux-2.6.orig/include/linux/futex.h
+++ linux-2.6/include/linux/futex.h
@@ -16,18 +16,19 @@ struct task_struct;
  * The key type depends on whether it's a shared or private mapping.
  * Don't rearrange members without looking at hash_futex().
  *
- * offset is aligned to a multiple of sizeof(u32) (== 4) by definition.
- * We use the two low order bits of offset to tell what is the kind of key :
+ * offset is the position within a page and is in the range [0, PAGE_SIZE).
+ * The high bits of the offset indicate what kind of key this is:
  *  00 : Private process futex (PTHREAD_PROCESS_PRIVATE)
  *       (no reference on an inode or mm)
  *  01 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       mapped on a file (reference on the underlying inode)
  *  10 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       (but private mapping on an mm, and reference taken on it)
-*/
+ */
 
-#define FUT_OFF_INODE    1 /* We set bit 0 if key has a reference on inode */
-#define FUT_OFF_MMSHARED 2 /* We set bit 1 if key has a reference on mm */
+#define FUT_OFF_INODE    (PAGE_SIZE << 0)
+#define FUT_OFF_MMSHARED (PAGE_SIZE << 1)
+#define FUT_OFF_SIZE     (PAGE_SIZE << 2)
 
 union futex_key {
 	struct {
Index: linux-2.6/kernel/futex/core.c
===================================================================
--- linux-2.6.orig/kernel/futex/core.c
+++ linux-2.6/kernel/futex/core.c
@@ -311,6 +311,15 @@ int get_futex_key(void __user *uaddr, un
 	}
 
 	/*
+	 * Encode the futex size in the offset. This makes cross-size
+	 * wake-wait fail -- see futex_match().
+	 *
+	 * NOTE that cross-size wake-wait is fundamentally broken wrt
+	 * FLAGS_NUMA.
+	 */
+	key->both.offset |= FUT_OFF_SIZE * (flags & FLAGS_SIZE_MASK);
+
+	/*
 	 * PROCESS_PRIVATE futexes are fast.
 	 * As the mm cannot disappear under us and the 'key' only needs
 	 * virtual address, we dont even have to find the underlying vma.
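As an illustration of the hunk above, here is a minimal userspace sketch of how OR-ing the size into the bits above the in-page offset keeps keys of different sizes from comparing equal. It is not kernel code: the EX_-prefixed names, the 4 KiB page size and the numeric size values (0..3 for u8..u64, two bits wide) are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define EX_PAGE_SIZE     4096UL

/* Same layout as the FUT_OFF_* macros in the patch, with an EX_ prefix. */
#define EX_OFF_INODE     (EX_PAGE_SIZE << 0)
#define EX_OFF_MMSHARED  (EX_PAGE_SIZE << 1)
#define EX_OFF_SIZE      (EX_PAGE_SIZE << 2)

/* Assumption: the size field is two bits wide, 0=u8, 1=u16, 2=u32, 3=u64. */
#define EX_SIZE_MASK     0x3U

/* Build the offset part of a key: in-page offset plus size in the high bits. */
static unsigned long ex_key_offset(unsigned long uaddr, unsigned int flags)
{
	unsigned long offset = uaddr % EX_PAGE_SIZE;

	/* The in-page offset never reaches the type/size bits. */
	assert(offset < EX_OFF_INODE);

	offset |= EX_OFF_SIZE * (flags & EX_SIZE_MASK);
	return offset;
}

/* Two offsets for the same address only match when the sizes agree. */
static bool ex_offsets_match(unsigned long uaddr, unsigned int wait_flags,
			     unsigned int wake_flags)
{
	return ex_key_offset(uaddr, wait_flags) ==
	       ex_key_offset(uaddr, wake_flags);
}
```

Under these assumptions a 16-bit waiter and a 32-bit waker on the same address end up with different offsets, which is the cross-size failure the comment refers to. The size bits also stay clear of the two key-type bits.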
Index: linux-2.6/kernel/futex/futex.h
===================================================================
--- linux-2.6.orig/kernel/futex/futex.h
+++ linux-2.6/kernel/futex/futex.h
@@ -79,8 +79,8 @@ static inline bool futex_flags_valid(uns
 			return false;
 	}
 
-	/* Only 32bit futexes are implemented -- for now */
-	if ((flags & FLAGS_SIZE_MASK) != FLAGS_SIZE_32)
+	/* 64bit futexes aren't implemented -- yet */
+	if ((flags & FLAGS_SIZE_MASK) == FLAGS_SIZE_64)
 		return false;
 
 	/*
Index: linux-2.6/kernel/futex/waitwake.c
===================================================================
--- linux-2.6.orig/kernel/futex/waitwake.c
+++ linux-2.6/kernel/futex/waitwake.c
@@ -437,11 +437,12 @@ retry:
 
 	for (i = 0; i < count; i++) {
 		u32 __user *uaddr = (u32 __user *)(unsigned long)vs[i].w.uaddr;
+		unsigned int flags = vs[i].w.flags;
 		struct futex_q *q = &vs[i].q;
 		u32 val = vs[i].w.val;
 
 		hb = futex_q_lock(q);
-		ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
+		ret = futex_get_value_locked(&uval, uaddr, flags);
 
 		if (!ret && uval == val) {
 			/*
@@ -609,7 +610,7 @@ retry:
 retry_private:
 	*hb = futex_q_lock(q);
 
-	ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
+	ret = futex_get_value_locked(&uval, uaddr, flags);
 
 	if (ret) {
 		futex_q_unlock(*hb);

From patchwork Thu Sep 21 10:45:20 2023
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 143119
Message-Id: <20230921105249.214313438@noisy.programming.kicks-ass.net>
Date: Thu, 21 Sep 2023 12:45:20 +0200
From: peterz@infradead.org
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: [PATCH v3 15/15] futex,selftests: Extend the futex selftests
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>

Extend the wait/requeue selftests to also cover the futex2 syscalls.
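
The futex2 paths in these tests pass an absolute timeout: the waiters read CLOCK_MONOTONIC, add timeout_ns, and carry any overflow of tv_nsec into tv_sec. A minimal sketch of that step follows. The helper name ex_timespec_add_ns is hypothetical and not part of the selftests, and it assumes the added value is below one second, as timeout_ns (30000000) is here.

```c
#include <assert.h>
#include <time.h>

#define EX_NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: advance an absolute timespec by ns nanoseconds.
 * Assumes 0 <= ns < EX_NSEC_PER_SEC and a normalised input, so a single
 * carry into tv_sec is enough.
 */
static void ex_timespec_add_ns(struct timespec *ts, long ns)
{
	assert(ns >= 0 && ns < EX_NSEC_PER_SEC);

	ts->tv_nsec += ns;
	if (ts->tv_nsec >= EX_NSEC_PER_SEC) {
		ts->tv_sec++;
		ts->tv_nsec -= EX_NSEC_PER_SEC;
	}
}
```

A caller would fill the timespec with clock_gettime(CLOCK_MONOTONIC, ...) first and hand the result to the wait call together with the same clock id.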
Signed-off-by: Peter Zijlstra (Intel) --- tools/testing/selftests/futex/functional/futex_requeue.c | 100 +++++++++- tools/testing/selftests/futex/functional/futex_wait.c | 56 ++++- tools/testing/selftests/futex/functional/futex_wait_timeout.c | 14 + tools/testing/selftests/futex/functional/futex_wait_wouldblock.c | 28 ++ tools/testing/selftests/futex/functional/futex_waitv.c | 15 - tools/testing/selftests/futex/functional/run.sh | 6 tools/testing/selftests/futex/include/futex2test.h | 39 +++ 7 files changed, 229 insertions(+), 29 deletions(-) Index: linux-2.6/tools/testing/selftests/futex/functional/futex_requeue.c =================================================================== --- linux-2.6.orig/tools/testing/selftests/futex/functional/futex_requeue.c +++ linux-2.6/tools/testing/selftests/futex/functional/futex_requeue.c @@ -7,8 +7,10 @@ #include #include +#include #include "logging.h" #include "futextest.h" +#include "futex2test.h" #define TEST_NAME "futex-requeue" #define timeout_ns 30000000 @@ -16,24 +18,58 @@ volatile futex_t *f1; +bool futex2 = 0; +bool mixed = 0; + void usage(char *prog) { printf("Usage: %s\n", prog); printf(" -c Use color\n"); + printf(" -n Use futex2 interface\n"); + printf(" -x Use mixed size futex\n"); printf(" -h Display this help message\n"); printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n", VQUIET, VCRITICAL, VINFO); } -void *waiterfn(void *arg) +static void *waiterfn(void *arg) { + unsigned int flags = 0; struct timespec to; - to.tv_sec = 0; - to.tv_nsec = timeout_ns; + if (futex2) { + unsigned long mask; + + if (clock_gettime(CLOCK_MONOTONIC, &to)) { + printf("clock_gettime() failed errno %d", errno); + return NULL; + } + + to.tv_nsec += timeout_ns; + if (to.tv_nsec >= 1000000000) { + to.tv_sec++; + to.tv_nsec -= 1000000000; + } + + if (mixed) { + flags |= FUTEX2_SIZE_U16; + mask = (unsigned short)(~0U); + } else { + flags |= FUTEX2_SIZE_U32; + mask = (unsigned int)(~0U); + } + + if (futex2_wait(f1, *f1, mask, 
flags, + &to, CLOCK_MONOTONIC)) + printf("waiter failed errno %d\n", errno); + } else { + + to.tv_sec = 0; + to.tv_nsec = timeout_ns; - if (futex_wait(f1, *f1, &to, 0)) - printf("waiter failed errno %d\n", errno); + if (futex_wait(f1, *f1, &to, flags)) + printf("waiter failed errno %d\n", errno); + } return NULL; } @@ -48,7 +84,7 @@ int main(int argc, char *argv[]) f1 = &_f1; - while ((c = getopt(argc, argv, "cht:v:")) != -1) { + while ((c = getopt(argc, argv, "xncht:v:")) != -1) { switch (c) { case 'c': log_color(1); @@ -59,6 +95,12 @@ int main(int argc, char *argv[]) case 'v': log_verbosity(atoi(optarg)); break; + case 'x': + mixed=1; + /* fallthrough */ + case 'n': + futex2=1; + break; default: usage(basename(argv[0])); exit(1); @@ -79,7 +121,22 @@ int main(int argc, char *argv[]) usleep(WAKE_WAIT_US); info("Requeuing 1 futex from f1 to f2\n"); - res = futex_cmp_requeue(f1, 0, &f2, 0, 1, 0); + if (futex2) { + struct futex_waitv futexes[2] = { + { + .val = 0, + .uaddr = (unsigned long)f1, + .flags = mixed ? FUTEX2_SIZE_U16 : FUTEX2_SIZE_U32, + }, + { + .uaddr = (unsigned long)&f2, + .flags = FUTEX2_SIZE_U32, + }, + }; + res = futex2_requeue(futexes, 0, 0, 1); + } else { + res = futex_cmp_requeue(f1, 0, &f2, 0, 1, 0); + } if (res != 1) { ksft_test_result_fail("futex_requeue simple returned: %d %s\n", res ? errno : res, @@ -89,7 +146,11 @@ int main(int argc, char *argv[]) info("Waking 1 futex at f2\n"); - res = futex_wake(&f2, 1, 0); + if (futex2) { + res = futex2_wake(&f2, ~0U, 1, FUTEX2_SIZE_U32); + } else { + res = futex_wake(&f2, 1, 0); + } if (res != 1) { ksft_test_result_fail("futex_requeue simple returned: %d %s\n", res ? errno : res, @@ -112,7 +173,22 @@ int main(int argc, char *argv[]) usleep(WAKE_WAIT_US); info("Waking 3 futexes at f1 and requeuing 7 futexes from f1 to f2\n"); - res = futex_cmp_requeue(f1, 0, &f2, 3, 7, 0); + if (futex2) { + struct futex_waitv futexes[2] = { + { + .val = 0, + .uaddr = (unsigned long)f1, + .flags = mixed ? 
FUTEX2_SIZE_U16 : FUTEX2_SIZE_U32,
+			},
+			{
+				.uaddr = (unsigned long)&f2,
+				.flags = FUTEX2_SIZE_U32,
+			},
+		};
+		res = futex2_requeue(futexes, 0, 3, 7);
+	} else {
+		res = futex_cmp_requeue(f1, 0, &f2, 3, 7, 0);
+	}
 	if (res != 10) {
 		ksft_test_result_fail("futex_requeue many returned: %d %s\n",
 				      res ? errno : res,
@@ -121,7 +197,11 @@ int main(int argc, char *argv[])
 	}
 
 	info("Waking INT_MAX futexes at f2\n");
-	res = futex_wake(&f2, INT_MAX, 0);
+	if (futex2) {
+		res = futex2_wake(&f2, ~0U, INT_MAX, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(&f2, INT_MAX, 0);
+	}
 	if (res != 7) {
 		ksft_test_result_fail("futex_requeue many returned: %d %s\n",
 				      res ? errno : res,
Index: linux-2.6/tools/testing/selftests/futex/functional/futex_wait.c
===================================================================
--- linux-2.6.orig/tools/testing/selftests/futex/functional/futex_wait.c
+++ linux-2.6/tools/testing/selftests/futex/functional/futex_wait.c
@@ -9,8 +9,10 @@
 #include
 #include
 #include
+#include
 #include "logging.h"
 #include "futextest.h"
+#include "futex2test.h"
 
 #define TEST_NAME "futex-wait"
 #define timeout_ns 30000000
@@ -19,10 +21,13 @@
 
 void *futex;
 
+bool futex2 = 0;
+
 void usage(char *prog)
 {
 	printf("Usage: %s\n", prog);
 	printf(" -c Use color\n");
+	printf(" -n Use futex2 interface\n");
 	printf(" -h Display this help message\n");
 	printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
 	       VQUIET, VCRITICAL, VINFO);
@@ -30,17 +35,35 @@ void usage(char *prog)
 
 static void *waiterfn(void *arg)
 {
-	struct timespec to;
 	unsigned int flags = 0;
+	struct timespec to;
 
 	if (arg)
 		flags = *((unsigned int *) arg);
 
-	to.tv_sec = 0;
-	to.tv_nsec = timeout_ns;
+	if (futex2) {
+		if (clock_gettime(CLOCK_MONOTONIC, &to)) {
+			printf("clock_gettime() failed errno %d\n", errno);
+			return NULL;
+		}
 
-	if (futex_wait(futex, 0, &to, flags))
-		printf("waiter failed errno %d\n", errno);
+		to.tv_nsec += timeout_ns;
+		if (to.tv_nsec >= 1000000000) {
+			to.tv_sec++;
+			to.tv_nsec -= 1000000000;
+		}
+
+		if (futex2_wait(futex, 0, ~0U, flags | FUTEX2_SIZE_U32,
+				&to, CLOCK_MONOTONIC))
+			printf("waiter failed errno %d\n", errno);
+	} else {
+
+		to.tv_sec = 0;
+		to.tv_nsec = timeout_ns;
+
+		if (futex_wait(futex, 0, &to, flags))
+			printf("waiter failed errno %d\n", errno);
+	}
 
 	return NULL;
 }
@@ -55,7 +78,7 @@ int main(int argc, char *argv[])
 
 	futex = &f_private;
 
-	while ((c = getopt(argc, argv, "cht:v:")) != -1) {
+	while ((c = getopt(argc, argv, "ncht:v:")) != -1) {
 		switch (c) {
 		case 'c':
 			log_color(1);
@@ -66,6 +89,9 @@ int main(int argc, char *argv[])
 		case 'v':
 			log_verbosity(atoi(optarg));
 			break;
+		case 'n':
+			futex2 = 1;
+			break;
 		default:
 			usage(basename(argv[0]));
 			exit(1);
@@ -84,7 +110,11 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Calling private futex_wake on futex: %p\n", futex);
-	res = futex_wake(futex, 1, FUTEX_PRIVATE_FLAG);
+	if (futex2) {
+		res = futex2_wake(futex, ~0U, 1, FUTEX2_SIZE_U32 | FUTEX2_PRIVATE);
+	} else {
+		res = futex_wake(futex, 1, FUTEX_PRIVATE_FLAG);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake private returned: %d %s\n",
 				      errno, strerror(errno));
@@ -112,7 +142,11 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Calling shared (page anon) futex_wake on futex: %p\n", futex);
-	res = futex_wake(futex, 1, 0);
+	if (futex2) {
+		res = futex2_wake(futex, ~0U, 1, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(futex, 1, 0);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake shared (page anon) returned: %d %s\n",
 				      errno, strerror(errno));
@@ -151,7 +185,11 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Calling shared (file backed) futex_wake on futex: %p\n", futex);
-	res = futex_wake(shm, 1, 0);
+	if (futex2) {
+		res = futex2_wake(shm, ~0U, 1, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(shm, 1, 0);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake shared (file backed) returned: %d %s\n",
 				      errno, strerror(errno));
Index: linux-2.6/tools/testing/selftests/futex/functional/futex_wait_timeout.c
===================================================================
--- linux-2.6.orig/tools/testing/selftests/futex/functional/futex_wait_timeout.c
+++ linux-2.6/tools/testing/selftests/futex/functional/futex_wait_timeout.c
@@ -128,7 +128,7 @@ int main(int argc, char *argv[])
 	}
 
 	ksft_print_header();
-	ksft_set_plan(9);
+	ksft_set_plan(11);
 	ksft_print_msg("%s: Block on a futex and wait for timeout\n",
 		       basename(argv[0]));
 	ksft_print_msg("\tArguments: timeout=%ldns\n", timeout_ns);
@@ -201,6 +201,18 @@ int main(int argc, char *argv[])
 	res = futex_waitv(&waitv, 1, 0, &to, CLOCK_REALTIME);
 	test_timeout(res, &ret, "futex_waitv realtime", ETIMEDOUT);
 
+	/* futex2_wait with CLOCK_MONOTONIC */
+	if (futex_get_abs_timeout(CLOCK_MONOTONIC, &to, timeout_ns))
+		return RET_FAIL;
+	res = futex2_wait(&f1, f1, 1, FUTEX2_SIZE_U32, &to, CLOCK_MONOTONIC);
+	test_timeout(res, &ret, "futex2_wait monotonic", ETIMEDOUT);
+
+	/* futex2_wait with CLOCK_REALTIME */
+	if (futex_get_abs_timeout(CLOCK_REALTIME, &to, timeout_ns))
+		return RET_FAIL;
+	res = futex2_wait(&f1, f1, 1, FUTEX2_SIZE_U32, &to, CLOCK_REALTIME);
+	test_timeout(res, &ret, "futex2_wait realtime", ETIMEDOUT);
+
 	ksft_print_cnts();
 	return ret;
 }
Index: linux-2.6/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
===================================================================
--- linux-2.6.orig/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
+++ linux-2.6/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
@@ -46,7 +46,7 @@ int main(int argc, char *argv[])
 	struct futex_waitv waitv = {
 		.uaddr = (uintptr_t)&f1,
 		.val = f1+1,
-		.flags = FUTEX_32,
+		.flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE,
 		.__reserved = 0
 	};
 
@@ -68,7 +68,7 @@ int main(int argc, char *argv[])
 	}
 
 	ksft_print_header();
-	ksft_set_plan(2);
+	ksft_set_plan(3);
 	ksft_print_msg("%s: Test the unexpected futex value in FUTEX_WAIT\n",
 		       basename(argv[0]));
 
@@ -106,6 +106,30 @@ int main(int argc, char *argv[])
 		ksft_test_result_pass("futex_waitv\n");
 	}
 
+	if (clock_gettime(CLOCK_MONOTONIC, &to)) {
+		error("clock_gettime failed\n", errno);
+		return errno;
+	}
+
+	to.tv_nsec += timeout_ns;
+
+	if (to.tv_nsec >= 1000000000) {
+		to.tv_sec++;
+		to.tv_nsec -= 1000000000;
+	}
+
+	info("Calling futex2_wait on f1: %u @ %p with val=%u\n", f1, &f1, f1+1);
+	res = futex2_wait(&f1, f1+1, ~0U, FUTEX2_SIZE_U32 | FUTEX2_PRIVATE,
+			  &to, CLOCK_MONOTONIC);
+	if (!res || errno != EWOULDBLOCK) {
+		ksft_test_result_fail("futex2_wait returned: %d %s\n",
+				      res ? errno : res,
+				      res ? strerror(errno) : "");
+		ret = RET_FAIL;
+	} else {
+		ksft_test_result_pass("futex2_wait\n");
+	}
+
 	ksft_print_cnts();
 	return ret;
 }
Index: linux-2.6/tools/testing/selftests/futex/functional/futex_waitv.c
===================================================================
--- linux-2.6.orig/tools/testing/selftests/futex/functional/futex_waitv.c
+++ linux-2.6/tools/testing/selftests/futex/functional/futex_waitv.c
@@ -88,7 +88,7 @@ int main(int argc, char *argv[])
 
 	for (i = 0; i < NR_FUTEXES; i++) {
 		waitv[i].uaddr = (uintptr_t)&futexes[i];
-		waitv[i].flags = FUTEX_32 | FUTEX_PRIVATE_FLAG;
+		waitv[i].flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE;
 		waitv[i].val = 0;
 		waitv[i].__reserved = 0;
 	}
@@ -99,7 +99,8 @@ int main(int argc, char *argv[])
 
 	usleep(WAKE_WAIT_US);
 
-	res = futex_wake(u64_to_ptr(waitv[NR_FUTEXES - 1].uaddr), 1, FUTEX_PRIVATE_FLAG);
+	res = futex2_wake(u64_to_ptr(waitv[NR_FUTEXES - 1].uaddr), ~0U, 1,
+			  FUTEX2_PRIVATE | FUTEX2_SIZE_U32);
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake private returned: %d %s\n",
 				      res ? errno : res,
@@ -122,7 +123,7 @@ int main(int argc, char *argv[])
 		*shared_data = 0;
 
 		waitv[i].uaddr = (uintptr_t)shared_data;
-		waitv[i].flags = FUTEX_32;
+		waitv[i].flags = FUTEX2_SIZE_U32;
 		waitv[i].val = 0;
 		waitv[i].__reserved = 0;
 	}
@@ -145,8 +146,8 @@ int main(int argc, char *argv[])
 	for (i = 0; i < NR_FUTEXES; i++)
 		shmdt(u64_to_ptr(waitv[i].uaddr));
 
-	/* Testing a waiter without FUTEX_32 flag */
-	waitv[0].flags = FUTEX_PRIVATE_FLAG;
+	/* Testing a waiter without FUTEX2_SIZE_U32 flag */
+	waitv[0].flags = FUTEX2_PRIVATE;
 
 	if (clock_gettime(CLOCK_MONOTONIC, &to))
 		error("gettime64 failed\n", errno);
@@ -160,11 +161,11 @@ int main(int argc, char *argv[])
 				      res ? strerror(errno) : "");
 		ret = RET_FAIL;
 	} else {
-		ksft_test_result_pass("futex_waitv without FUTEX_32\n");
+		ksft_test_result_pass("futex_waitv without FUTEX2_SIZE_U32\n");
 	}
 
 	/* Testing a waiter with an unaligned address */
-	waitv[0].flags = FUTEX_PRIVATE_FLAG | FUTEX_32;
+	waitv[0].flags = FUTEX2_PRIVATE | FUTEX2_SIZE_U32;
 	waitv[0].uaddr = 1;
 
 	if (clock_gettime(CLOCK_MONOTONIC, &to))
Index: linux-2.6/tools/testing/selftests/futex/functional/run.sh
===================================================================
--- linux-2.6.orig/tools/testing/selftests/futex/functional/run.sh
+++ linux-2.6/tools/testing/selftests/futex/functional/run.sh
@@ -76,9 +76,15 @@ echo
 
 echo
 ./futex_wait $COLOR
+echo
+./futex_wait -n $COLOR
 
 echo
 ./futex_requeue $COLOR
+echo
+./futex_requeue -n $COLOR
+echo
+./futex_requeue -x $COLOR
 
 echo
 ./futex_waitv $COLOR
Index: linux-2.6/tools/testing/selftests/futex/include/futex2test.h
===================================================================
--- linux-2.6.orig/tools/testing/selftests/futex/include/futex2test.h
+++ linux-2.6/tools/testing/selftests/futex/include/futex2test.h
@@ -8,6 +8,28 @@
 
 #define u64_to_ptr(x) ((void *)(uintptr_t)(x))
 
+#ifndef __NR_futex_wake
+#define __NR_futex_wake 452
+#define __NR_futex_wait 453
+#define __NR_futex_requeue 454
+#endif
+
+#ifndef FUTEX2_SIZE_U8
+/*
+ * Flags for futex2 syscalls.
+ */
+#define FUTEX2_SIZE_U8		0x00
+#define FUTEX2_SIZE_U16		0x01
+#define FUTEX2_SIZE_U32		0x02
+#define FUTEX2_SIZE_U64		0x03
+#define FUTEX2_NUMA		0x04
+			/*	0x08 */
+			/*	0x10 */
+			/*	0x20 */
+			/*	0x40 */
+#define FUTEX2_PRIVATE		FUTEX_PRIVATE_FLAG
+#endif
+
 /**
  * futex_waitv - Wait at multiple futexes, wake on any
  * @waiters: Array of waiters
@@ -20,3 +42,20 @@ static inline int futex_waitv(volatile s
 {
 	return syscall(__NR_futex_waitv, waiters, nr_waiters, flags, timo, clockid);
 }
+
+static inline int futex2_wake(volatile void *uaddr, unsigned long mask, int nr, unsigned int flags)
+{
+	return syscall(__NR_futex_wake, uaddr, mask, nr, flags);
+}
+
+static inline int futex2_wait(volatile void *uaddr, unsigned long val, unsigned long mask,
+			      unsigned int flags, struct timespec *timo, clockid_t clockid)
+{
+	return syscall(__NR_futex_wait, uaddr, val, mask, flags, timo, clockid);
+}
+
+static inline int futex2_requeue(struct futex_waitv *futexes, unsigned int flags,
+				 int nr_wake, int nr_requeue)
+{
+	return syscall(__NR_futex_requeue, futexes, flags, nr_wake, nr_requeue);
+}

From patchwork Fri Sep 22 20:01:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 143660 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp5863299vqi; Fri, 22 Sep 2023 14:02:38 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHWckUzLRcMFLSff/iUJU1kmZIldJRJZCNIOSm8vX78uFbQn9KwvA7xxBDHyYkP2qtTBEtw X-Received: by 2002:a05:6a20:8e08:b0:13e:90aa:8c8b with SMTP id y8-20020a056a208e0800b0013e90aa8c8bmr1093052pzj.4.1695416557779; Fri, 22 Sep 2023 14:02:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695416557; cv=none; d=google.com; s=arc-20160816; b=DObW4gX64Lyft7u1ojAXMav2Ktq2zdNxrNVBEsWP8JcD5jYQKSlOSOxrBmhx3gA6PF +IMDpCbEIIDtoR+W1b6WuzgM5yGVkzE8KnekEV1+9axccvGGfFxZ/nhv+E8ItOznC0iw
jikeSoyT43yTl4jbK82rG1uN5Bt8h9K/tSSb+bXLH9jW8rm39fI0tg1vt+OnwAOp2pOq NCA4cJaO/KR1GzaWSo/S3UjcPRAJggESDHdtfX8Q/wxP0x70a+I5U8ZNIbeWTg7Dff7A wV+W/pMKcwFH4+XLPNzXffXYdvHxh//P+k6YqIWj0I9bw7WnKmiwJ77GGANG8alUKOqM obpQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=m5eUhTj4OMdhGrZCCmAgS0fprpsPOZ8qVsmY+pLiWtU=; fh=nTCmWlxk/Baj2Uc7uGPyXZEanAXF/EsyGib0RIHhcIQ=; b=XtYDlLMuoV5FPA/CTAJ2O6g/AWmclEr35aL2ZNWWCtEkaN9IaSX6LyzhflH3QA4OLh yHO7JbzB9EzgU6/o3BPO2tdKXIpUZxYOBD7LWc7ogO4QsUpk8NG6mGa7D7BwIHsqzCZC ct/5FCCYptWWYlJctA+gASlENJ1iWXo/t45Ku/3wMJH/SkmT/cxyi19EC0yqXZloSvXL 2yP16F1ilpYohl8GWAJ4//E5C36tlTU9wTqZ84ofDZtsixUYWVmtY2uqkgO6oZpMoBUA FQF9AfejUlfIdD+Lct9v7M45f3xyxhBjkkQM7RvRnL1u/QlnIsrEQTH2+Lf4OaLKspMz e45A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=DCD6izEC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from agentk.vger.email (agentk.vger.email. 
[23.128.96.32]) by mx.google.com with ESMTPS id t22-20020a056a0021d600b00690bf904bb6si4668925pfj.307.2023.09.22.14.02.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 14:02:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) client-ip=23.128.96.32; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=DCD6izEC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 3BF3C80722C5; Fri, 22 Sep 2023 14:00:13 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229804AbjIVU76 (ORCPT + 28 others); Fri, 22 Sep 2023 16:59:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229745AbjIVU7x (ORCPT ); Fri, 22 Sep 2023 16:59:53 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8B151A1; Fri, 22 Sep 2023 13:59:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=m5eUhTj4OMdhGrZCCmAgS0fprpsPOZ8qVsmY+pLiWtU=; b=DCD6izECXSbSHvAInV6KQTFrIm AM42jx2fn31pHdWtIwgBg7UYPun61VPp7UZmrCy0O+wtnJgcLfiXyzMLfRpoWnTWzpJVGQJ8jo7J5 bFtxuDcm4+py1iQQAF3vZol/m5JSW+wtkIYwBQUnx+DP6uHi6VjF8Qs8bA0HREM7IAbga7BkRcJ8z 
l7kAiGcSdKDyHIkEYK9JgiGqtRuvl75DEd2RCLmo9ym5K2osMT3RX8p5NMe8f3cwH2RuRKq4O4KF/ uVS4O05DZT1B9tQivLWaNkuXz1fPyb4mCZcQheDnfTNDElGcU3IAvwCoLKAF9ShDt8Wk2e9MO5frF 4ixJusiA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1qjnEs-00GXzB-2v; Fri, 22 Sep 2023 20:59:05 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 0) id EC2F03005AA; Fri, 22 Sep 2023 22:59:03 +0200 (CEST) Message-Id: <20230922205449.923636292@infradead.org> User-Agent: quilt/0.65 Date: Fri, 22 Sep 2023 22:01:22 +0200 From: Peter Zijlstra To: tglx@linutronix.de, axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, Andrew Morton , urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann , linux-api@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, malteskarupke@web.de, steve.shaw@intel.com, marko.makela@mariadb.com, andrei.artemev@intel.com Subject: [PATCH 16/15] futex,selftests: Extend the futex selftests for NUMA References: <20230921104505.717750284@noisy.programming.kicks-ass.net> <20230921104505.717750284@noisy.programming.kicks-ass.net> <20230922200120.011184118@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=2.7 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Fri, 22 Sep 2023 14:00:13 -0700 (PDT) X-Spam-Level: ** X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777773112410232671 X-GMAIL-MSGID: 1777773112410232671 
XXX

Signed-off-by: Peter Zijlstra (Intel)
---
 tools/testing/selftests/futex/functional/Makefile     |    3
 tools/testing/selftests/futex/functional/futex_numa.c |  262 ++++++++++++++++++
 2 files changed, 264 insertions(+), 1 deletion(-)

--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -17,7 +17,8 @@ TEST_GEN_PROGS := \
 	futex_wait_private_mapped_file \
 	futex_wait \
 	futex_requeue \
-	futex_waitv
+	futex_waitv \
+	futex_numa
 
 TEST_PROGS := run.sh
 
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/futex_numa.c
@@ -0,0 +1,262 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "logging.h"
+#include "futextest.h"
+#include "futex2test.h"
+
+typedef u_int32_t u32;
+typedef int32_t s32;
+typedef u_int64_t u64;
+
+static int fflags = (FUTEX2_SIZE_U32 | FUTEX2_PRIVATE);
+static int fnode = FUTEX_NO_NODE;
+
+/* fairly stupid test-and-set lock with a waiter flag */
+
+#define N_LOCK		0x0000001
+#define N_WAITERS	0x0001000
+
+struct futex_numa_32 {
+	union {
+		u64 full;
+		struct {
+			u32 val;
+			u32 node;
+		};
+	};
+};
+
+void futex_numa_32_lock(struct futex_numa_32 *lock)
+{
+	for (;;) {
+		struct futex_numa_32 new, old = {
+			.full = __atomic_load_n(&lock->full, __ATOMIC_RELAXED),
+		};
+
+		for (;;) {
+			new = old;
+			if (old.val == 0) {
+				/* no waiter, no lock -> first lock, set no-node */
+				new.node = fnode;
+			}
+			if (old.val & N_LOCK) {
+				/* contention, set waiter */
+				new.val |= N_WAITERS;
+			}
+			new.val |= N_LOCK;
+
+			/* nothing changed, ready to block */
+			if (old.full == new.full)
+				break;
+
+			/*
+			 * Use u64 cmpxchg to set the futex value and node in a
+			 * consistent manner.
+			 */
+			if (__atomic_compare_exchange_n(&lock->full,
+							&old.full, new.full,
+							/* .weak */ false,
+							__ATOMIC_ACQUIRE,
+							__ATOMIC_RELAXED)) {
+
+				/* if we just set N_LOCK, we own it */
+				if (!(old.val & N_LOCK))
+					return;
+
+				/* go block */
+				break;
+			}
+		}
+
+		futex2_wait(lock, new.val, ~0U, fflags, NULL, 0);
+	}
+}
+
+void futex_numa_32_unlock(struct futex_numa_32 *lock)
+{
+	u32 val = __atomic_sub_fetch(&lock->val, N_LOCK, __ATOMIC_RELEASE);
+	assert((s32)val >= 0);
+	if (val & N_WAITERS) {
+		int woken = futex2_wake(lock, ~0U, 1, fflags);
+		assert(val == N_WAITERS);
+		if (!woken) {
+			__atomic_compare_exchange_n(&lock->val, &val, 0U,
+						    false, __ATOMIC_RELAXED,
+						    __ATOMIC_RELAXED);
+		}
+	}
+}
+
+static long nanos = 50000;
+
+struct thread_args {
+	pthread_t tid;
+	volatile int * done;
+	struct futex_numa_32 *lock;
+	int val;
+	int *val1, *val2;
+	int node;
+};
+
+static void *threadfn(void *_arg)
+{
+	struct thread_args *args = _arg;
+	struct timespec ts = {
+		.tv_nsec = nanos,
+	};
+	int node;
+
+	while (!*args->done) {
+
+		futex_numa_32_lock(args->lock);
+		args->val++;
+
+		assert(*args->val1 == *args->val2);
+		(*args->val1)++;
+		nanosleep(&ts, NULL);
+		(*args->val2)++;
+
+		node = args->lock->node;
+		futex_numa_32_unlock(args->lock);
+
+		if (node != args->node) {
+			args->node = node;
+			printf("node: %d\n", node);
+		}
+
+		nanosleep(&ts, NULL);
+	}
+
+	return NULL;
+}
+
+static void *contendfn(void *_arg)
+{
+	struct thread_args *args = _arg;
+
+	while (!*args->done) {
+		/*
+		 * futex2_wait() will take hb-lock, verify *var == val and
+		 * queue/abort. By knowingly setting val 'wrong' this will
+		 * abort and thereby generate hb-lock contention.
+		 */
+		futex2_wait(&args->lock->val, ~0U, ~0U, fflags, NULL, 0);
+		args->val++;
+	}
+
+	return NULL;
+}
+
+static volatile int done = 0;
+static struct futex_numa_32 lock = { .val = 0, };
+static int val1, val2;
+
+int main(int argc, char *argv[])
+{
+	struct thread_args *tas[512], *cas[512];
+	int c, t, threads = 2, contenders = 0;
+	int sleeps = 10;
+	int total = 0;
+
+	while ((c = getopt(argc, argv, "c:t:s:n:N::")) != -1) {
+		switch (c) {
+		case 'c':
+			contenders = atoi(optarg);
+			break;
+		case 't':
+			threads = atoi(optarg);
+			break;
+		case 's':
+			sleeps = atoi(optarg);
+			break;
+		case 'n':
+			nanos = atoi(optarg);
+			break;
+		case 'N':
+			fflags |= FUTEX2_NUMA;
+			if (optarg)
+				fnode = atoi(optarg);
+			break;
+		default:
+			exit(1);
+			break;
+		}
+	}
+
+	for (t = 0; t < contenders; t++) {
+		struct thread_args *args = calloc(1, sizeof(*args));
+		if (!args) {
+			perror("thread_args");
+			exit(-1);
+		}
+
+		args->done = &done;
+		args->lock = &lock;
+		args->val1 = &val1;
+		args->val2 = &val2;
+		args->node = -1;
+
+		if (pthread_create(&args->tid, NULL, contendfn, args)) {
+			perror("pthread_create");
+			exit(-1);
+		}
+
+		cas[t] = args;
+	}
+
+	for (t = 0; t < threads; t++) {
+		struct thread_args *args = calloc(1, sizeof(*args));
+		if (!args) {
+			perror("thread_args");
+			exit(-1);
+		}
+
+		args->done = &done;
+		args->lock = &lock;
+		args->val1 = &val1;
+		args->val2 = &val2;
+		args->node = -1;
+
+		if (pthread_create(&args->tid, NULL, threadfn, args)) {
+			perror("pthread_create");
+			exit(-1);
+		}
+
+		tas[t] = args;
+	}
+
+	sleep(sleeps);
+
+	done = true;
+
+	for (t = 0; t < threads; t++) {
+		struct thread_args *args = tas[t];
+
+		pthread_join(args->tid, NULL);
+		total += args->val;
+//		printf("tval: %d\n", args->val);
+	}
+	printf("total: %d\n", total);
+
+	if (contenders) {
+		total = 0;
+		for (t = 0; t < contenders; t++) {
+			struct thread_args *args = cas[t];
+
+			pthread_join(args->tid, NULL);
+			total += args->val;
+//			printf("tval: %d\n", args->val);
+		}
+		printf("contenders:
%d\n", total); + } + + return 0; +} + From patchwork Fri Sep 22 20:01:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 143691 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp5892897vqi; Fri, 22 Sep 2023 15:08:02 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFL9SCLln7B9wL/hpobNZbLlJK1F4mryzEdGF+sKepSHJergh1xipYbsTa7b+RnpMqlZAi5 X-Received: by 2002:a17:90a:f40e:b0:276:6b9d:7503 with SMTP id ch14-20020a17090af40e00b002766b9d7503mr983984pjb.28.1695420482127; Fri, 22 Sep 2023 15:08:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695420482; cv=none; d=google.com; s=arc-20160816; b=jPrO/CnLlEvlGaSH8zcw9r/dJX1qLtUBAtyEiU5vilHV6RcgrC1kzoQ+kPUqSz/af5 hbV1BmAnyCZfX7QoP5GGe0LSztsYWcASz1uDGKW/a08OvQS0rqsSaHnqfuf67Z9mPEHl ygy9roBJfG8Md2O1uQmuFGRs1tH7qKoNXQH9BOhdaMeLaVAg4p92m4tj6fDWlhR2hMBl TGOtcWKll2drCzPzaNg0rG1YKiVXK1gcqIvGvv8tT5KIYmoG+sHRaDLy5I9wsil2V5zb z9K02JD/lGwVcybYaKNdK++KIM+0XbJp7WY29/kJaL4hhk2eY6oi/IkPjYyH88Ngig0Y CAXw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=zWIQ6Yi556U65myk2PzRkq/DHdsIiBTxeyqBKDx6r2g=; fh=nTCmWlxk/Baj2Uc7uGPyXZEanAXF/EsyGib0RIHhcIQ=; b=0gL6NUygWKSZhkdjM3ShQd5kjvosMXJ86n3axrtUT5xdy2MdqiWr5u3/dx4WySSAfK VcVVANqIRyZbQzr03KGqgg/h6hCTks0+Gx3XXSwObH7sLK81q3YxDvi4ViApr1kUVpas XAWY/jmEZostCv3KRcca7839M3yDT+4le55tC1z6xNUAzrUsVzEPOCNM49rPUr5H5EAz lQ8RxoCtfI8l0NOf5YallG6TPU2QXNuOwET08IS5hJGcSno85k3oMcUIk2az6uN08MkD e84ajcEw5j8RlklHbkZ9Vp3qzeWLw5ubw7XpWEaH63WJaZdoKTn5Qv1MbMhsmCIHdXOV uLPw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=hktQlK3I; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) 
smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from morse.vger.email (morse.vger.email. [2620:137:e000::3:1]) by mx.google.com with ESMTPS id n8-20020a17090a4e0800b00273e2978b8fsi4743942pjh.32.2023.09.22.15.08.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 15:08:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) client-ip=2620:137:e000::3:1; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=hktQlK3I; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 1E94683B5AC4; Fri, 22 Sep 2023 14:00:05 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229757AbjIVU7y (ORCPT + 28 others); Fri, 22 Sep 2023 16:59:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52766 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229723AbjIVU7w (ORCPT ); Fri, 22 Sep 2023 16:59:52 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C46471A2; Fri, 22 Sep 2023 13:59:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=zWIQ6Yi556U65myk2PzRkq/DHdsIiBTxeyqBKDx6r2g=; b=hktQlK3IZ5z27ejm1FLGbbL1mo 
mzMcjkhwk3cOL9uWlaupMqvTCXnVZAoT6CJRQ0M29Q5yLP4ldsnxOmGjX4q6rjSEI64QiOYmhDBl1 P34q0qimycD9b/kQhEEKct+8KTKDGbkWMwSTmS278soTtIfi9R+SqnOIrE5sNsyZz3ma+fTYPc4zF 1bnVko5qUEoPXj7iOJw+xXKE+NPyQD41xo32nxvuoljsm3NlHTZLfVHISoB7SITHfEz+0mYyjkOq8 DkRxp0jw7F5ICVHtOBTQtELeVEyWis3tl7y1KZ2CkP3ihqlF98Q0ECbL3/aloKy0AEDNlzdBY/hIg FSIpHA6A==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1qjnEs-00GXzC-2v; Fri, 22 Sep 2023 20:59:05 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 0) id F0DDE3008D6; Fri, 22 Sep 2023 22:59:03 +0200 (CEST) Message-Id: <20230922205450.033535181@infradead.org> User-Agent: quilt/0.65 Date: Fri, 22 Sep 2023 22:01:23 +0200 From: Peter Zijlstra To: tglx@linutronix.de, axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, Andrew Morton , urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com, Arnd Bergmann , linux-api@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, malteskarupke@web.de, steve.shaw@intel.com, marko.makela@mariadb.com, andrei.artemev@intel.com Subject: [PATCH 17/15] [HACK] futex: Force futex hash collision References: <20230921104505.717750284@noisy.programming.kicks-ass.net> <20230921104505.717750284@noisy.programming.kicks-ass.net> <20230922200120.011184118@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=2.7 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Fri, 22 Sep 2023 
14:00:06 -0700 (PDT) X-Spam-Level: ** X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777777227429486812 X-GMAIL-MSGID: 1777777227429486812

If you hate performance -- use this.

Signed-off-by: Peter Zijlstra (Intel)
---
 kernel/futex/core.c     |    6 ++++++
 kernel/sched/features.h |    2 ++
 2 files changed, 8 insertions(+)

--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -128,6 +128,9 @@ static int futex_put_value(u32 val, u32
 	}
 }
 
+#include
+#include "../sched/sched.h"
+
 /**
  * futex_hash - Return the hash bucket in the global hash
  * @key: Pointer to the futex key for which the hash is calculated
@@ -159,6 +162,9 @@ struct futex_hash_bucket *futex_hash(uni
 		}
 	}
 
+	if (sched_feat(FUTEX_SQUASH))
+		hash = 0;
+
 	return &futex_queues[node][hash & (futex_hashsize - 1)];
 }
 
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -89,3 +89,5 @@ SCHED_FEAT(UTIL_EST_FASTUP, true)
 SCHED_FEAT(LATENCY_WARN, false)
 
 SCHED_FEAT(HZ_BW, true)
+
+SCHED_FEAT(FUTEX_SQUASH, false)