From patchwork Mon Dec 19 15:35:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34625
Message-ID: <20221219154118.889543494@infradead.org>
User-Agent: quilt/0.66
Date: Mon, 19 Dec 2022 16:35:26 +0100
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org,
 boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com,
 gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
 svens@linux.ibm.com, Herbert Xu, davem@davemloft.net, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, joro@8bytes.org,
 suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org,
 baolu.lu@linux.intel.com, Arnd Bergmann, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-s390@vger.kernel.org,
 linux-crypto@vger.kernel.org, iommu@lists.linux.dev,
 linux-arch@vger.kernel.org
Subject: [RFC][PATCH 01/12] crypto: Remove u128 usage
References: <20221219153525.632521981@infradead.org>

As seems to be the common (majority) usage in crypto, use __uint128_t
instead of u128. This frees up u128 for definition in linux/types.h.

Signed-off-by: Peter Zijlstra (Intel)
---
 lib/crypto/curve25519-hacl64.c | 142 ++++++++++++++++++++---------------------
 lib/crypto/poly1305-donna64.c  |  22 ++----
 2 files changed, 80 insertions(+), 84 deletions(-)

--- a/lib/crypto/curve25519-hacl64.c
+++ b/lib/crypto/curve25519-hacl64.c
@@ -14,8 +14,6 @@
 #include
 #include
 
-typedef __uint128_t u128;
-
 static __always_inline u64 u64_eq_mask(u64 a, u64 b)
 {
 	u64 x = a ^ b;
@@ -50,77 +48,77 @@ static __always_inline void modulo_carry
 	b[0] = b0_;
 }
 
-static __always_inline void fproduct_copy_from_wide_(u64 *output, u128 *input)
+static __always_inline void fproduct_copy_from_wide_(u64 *output, __uint128_t *input)
 {
 	{
-		u128 xi = input[0];
+		__uint128_t xi = input[0];
 		output[0] = ((u64)(xi));
 	}
 	{
-		u128 xi = input[1];
+		__uint128_t xi = input[1];
 		output[1] = ((u64)(xi));
 	}
 	{
-		u128 xi = input[2];
+		__uint128_t xi = input[2];
 		output[2] = ((u64)(xi));
 	}
 	{
-		u128 xi = input[3];
+		__uint128_t xi = input[3];
 		output[3] = ((u64)(xi));
 	}
 	{
-		u128 xi = input[4];
+		__uint128_t xi = input[4];
 		output[4] = ((u64)(xi));
 	}
 }
 
 static __always_inline void
-fproduct_sum_scalar_multiplication_(u128 *output, u64 *input, u64 s)
+fproduct_sum_scalar_multiplication_(__uint128_t *output, u64 *input, u64 s)
 {
-	output[0] += (u128)input[0] * s;
-	output[1] += (u128)input[1] * s;
-	output[2] += (u128)input[2] * s;
-	output[3] += (u128)input[3] * s;
-	output[4] += (u128)input[4] * s;
+	output[0] += (__uint128_t)input[0] * s;
+	output[1] += (__uint128_t)input[1] * s;
+	output[2] += (__uint128_t)input[2] * s;
+	output[3] += (__uint128_t)input[3] * s;
+	output[4] += (__uint128_t)input[4] * s;
 }
 
-static __always_inline void fproduct_carry_wide_(u128 *tmp)
+static __always_inline void fproduct_carry_wide_(__uint128_t *tmp)
 {
 	{
 		u32 ctr = 0;
-		u128 tctr = tmp[ctr];
-		u128 tctrp1 = tmp[ctr + 1];
+		__uint128_t tctr = tmp[ctr];
+		__uint128_t tctrp1 = tmp[ctr + 1];
 		u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
-		u128 c = ((tctr) >> (51));
-		tmp[ctr] = ((u128)(r0));
+		__uint128_t c = ((tctr) >> (51));
+		tmp[ctr] = ((__uint128_t)(r0));
 		tmp[ctr + 1] = ((tctrp1) + (c));
 	}
 	{
 		u32 ctr = 1;
-		u128 tctr = tmp[ctr];
-		u128 tctrp1 = tmp[ctr + 1];
+		__uint128_t tctr = tmp[ctr];
+		__uint128_t tctrp1 = tmp[ctr + 1];
 		u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
-		u128 c = ((tctr) >> (51));
-		tmp[ctr] = ((u128)(r0));
+		__uint128_t c = ((tctr) >> (51));
+		tmp[ctr] = ((__uint128_t)(r0));
 		tmp[ctr + 1] = ((tctrp1) + (c));
 	}
 	{
 		u32 ctr = 2;
-		u128 tctr = tmp[ctr];
-		u128 tctrp1 = tmp[ctr + 1];
+		__uint128_t tctr = tmp[ctr];
+		__uint128_t tctrp1 = tmp[ctr + 1];
 		u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
-		u128 c = ((tctr) >> (51));
-		tmp[ctr] = ((u128)(r0));
+		__uint128_t c = ((tctr) >> (51));
+		tmp[ctr] = ((__uint128_t)(r0));
 		tmp[ctr + 1] = ((tctrp1) + (c));
 	}
 	{
 		u32 ctr = 3;
-		u128 tctr = tmp[ctr];
-		u128 tctrp1 = tmp[ctr + 1];
+		__uint128_t tctr = tmp[ctr];
+		__uint128_t tctrp1 = tmp[ctr + 1];
 		u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU;
-		u128 c = ((tctr) >> (51));
-		tmp[ctr] = ((u128)(r0));
+		__uint128_t c = ((tctr) >> (51));
+		tmp[ctr] = ((__uint128_t)(r0));
 		tmp[ctr + 1] = ((tctrp1) + (c));
 	}
 }
@@ -154,7 +152,7 @@ static __always_inline void fmul_shift_r
 	output[0] = 19 * b0;
 }
 
-static __always_inline void fmul_mul_shift_reduce_(u128 *output, u64 *input,
+static __always_inline void fmul_mul_shift_reduce_(__uint128_t *output, u64 *input,
 						   u64 *input21)
 {
 	u32 i;
@@ -188,21 +186,21 @@ static __always_inline void fmul_fmul(u6
 {
 	u64 tmp[5] = { input[0], input[1], input[2], input[3], input[4] };
 	{
-		u128 b4;
-		u128 b0;
-		u128 b4_;
-		u128 b0_;
+		__uint128_t b4;
+		__uint128_t b0;
+		__uint128_t b4_;
+		__uint128_t b0_;
 		u64 i0;
 		u64 i1;
 		u64 i0_;
 		u64 i1_;
-		u128 t[5] = { 0 };
+		__uint128_t t[5] = { 0 };
 		fmul_mul_shift_reduce_(t, tmp, input21);
 		fproduct_carry_wide_(t);
 		b4 = t[4];
 		b0 = t[0];
-		b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU))));
-		b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51))))))));
+		b4_ = ((b4) & (((__uint128_t)(0x7ffffffffffffLLU))));
+		b0_ = ((b0) + (((__uint128_t)(19) * (((u64)(((b4) >> (51))))))));
 		t[4] = b4_;
 		t[0] = b0_;
 		fproduct_copy_from_wide_(output, t);
@@ -215,7 +213,7 @@ static __always_inline void fmul_fmul(u6
 	}
 }
 
-static __always_inline void fsquare_fsquare__(u128 *tmp, u64 *output)
+static __always_inline void fsquare_fsquare__(__uint128_t *tmp, u64 *output)
 {
 	u64 r0 = output[0];
 	u64 r1 = output[1];
@@ -227,16 +225,16 @@ static __always_inline void fsquare_fsqu
 	u64 d2 = r2 * 2 * 19;
 	u64 d419 = r4 * 19;
 	u64 d4 = d419 * 2;
-	u128 s0 = ((((((u128)(r0) * (r0))) + (((u128)(d4) * (r1))))) +
-		   (((u128)(d2) * (r3))));
-	u128 s1 = ((((((u128)(d0) * (r1))) + (((u128)(d4) * (r2))))) +
-		   (((u128)(r3 * 19) * (r3))));
-	u128 s2 = ((((((u128)(d0) * (r2))) + (((u128)(r1) * (r1))))) +
-		   (((u128)(d4) * (r3))));
-	u128 s3 = ((((((u128)(d0) * (r3))) + (((u128)(d1) * (r2))))) +
-		   (((u128)(r4) * (d419))));
-	u128 s4 = ((((((u128)(d0) * (r4))) + (((u128)(d1) * (r3))))) +
-		   (((u128)(r2) * (r2))));
+	__uint128_t s0 = ((((((__uint128_t)(r0) * (r0))) + (((__uint128_t)(d4) * (r1))))) +
+		   (((__uint128_t)(d2) * (r3))));
+	__uint128_t s1 = ((((((__uint128_t)(d0) * (r1))) + (((__uint128_t)(d4) * (r2))))) +
+		   (((__uint128_t)(r3 * 19) * (r3))));
+	__uint128_t s2 = ((((((__uint128_t)(d0) * (r2))) + (((__uint128_t)(r1) * (r1))))) +
+		   (((__uint128_t)(d4) * (r3))));
+	__uint128_t s3 = ((((((__uint128_t)(d0) * (r3))) + (((__uint128_t)(d1) * (r2))))) +
+		   (((__uint128_t)(r4) * (d419))));
+	__uint128_t s4 = ((((((__uint128_t)(d0) * (r4))) + (((__uint128_t)(d1) * (r3))))) +
+		   (((__uint128_t)(r2) * (r2))));
 	tmp[0] = s0;
 	tmp[1] = s1;
 	tmp[2] = s2;
@@ -244,12 +242,12 @@ static __always_inline void fsquare_fsqu
 	tmp[4] = s4;
 }
 
-static __always_inline void fsquare_fsquare_(u128 *tmp, u64 *output)
+static __always_inline void fsquare_fsquare_(__uint128_t *tmp, u64 *output)
 {
-	u128 b4;
-	u128 b0;
-	u128 b4_;
-	u128 b0_;
+	__uint128_t b4;
+	__uint128_t b0;
+	__uint128_t b4_;
+	__uint128_t b0_;
 	u64 i0;
 	u64 i1;
 	u64 i0_;
@@ -258,8 +256,8 @@ static __always_inline void fsquare_fsqu
 	fproduct_carry_wide_(tmp);
 	b4 = tmp[4];
 	b0 = tmp[0];
-	b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU))));
-	b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51))))))));
+	b4_ = ((b4) & (((__uint128_t)(0x7ffffffffffffLLU))));
+	b0_ = ((b0) + (((__uint128_t)(19) * (((u64)(((b4) >> (51))))))));
 	tmp[4] = b4_;
 	tmp[0] = b0_;
 	fproduct_copy_from_wide_(output, tmp);
@@ -271,7 +269,7 @@ static __always_inline void fsquare_fsqu
 	output[1] = i1_;
 }
 
-static __always_inline void fsquare_fsquare_times_(u64 *output, u128 *tmp,
+static __always_inline void fsquare_fsquare_times_(u64 *output, __uint128_t *tmp,
 						   u32 count1)
 {
 	u32 i;
@@ -283,7 +281,7 @@ static __always_inline void fsquare_fsqu
 static __always_inline void fsquare_fsquare_times(u64 *output, u64 *input,
 						  u32 count1)
 {
-	u128 t[5];
+	__uint128_t t[5];
 	memcpy(output, input, 5 * sizeof(*input));
 	fsquare_fsquare_times_(output, t, count1);
 }
@@ -291,7 +289,7 @@ static __always_inline void fsquare_fsqu
 static __always_inline void fsquare_fsquare_times_inplace(u64 *output,
 							  u32 count1)
 {
-	u128 t[5];
+	__uint128_t t[5];
 	fsquare_fsquare_times_(output, t, count1);
 }
 
@@ -396,36 +394,36 @@ static __always_inline void fdifference(
 
 static __always_inline void fscalar(u64 *output, u64 *b, u64 s)
 {
-	u128 tmp[5];
-	u128 b4;
-	u128 b0;
-	u128 b4_;
-	u128 b0_;
+	__uint128_t tmp[5];
+	__uint128_t b4;
+	__uint128_t b0;
+	__uint128_t b4_;
+	__uint128_t b0_;
 	{
 		u64 xi = b[0];
-		tmp[0] = ((u128)(xi) * (s));
+		tmp[0] = ((__uint128_t)(xi) * (s));
 	}
 	{
 		u64 xi = b[1];
-		tmp[1] = ((u128)(xi) * (s));
+		tmp[1] = ((__uint128_t)(xi) * (s));
 	}
 	{
 		u64 xi = b[2];
-		tmp[2] = ((u128)(xi) * (s));
+		tmp[2] = ((__uint128_t)(xi) * (s));
 	}
 	{
 		u64 xi = b[3];
-		tmp[3] = ((u128)(xi) * (s));
+		tmp[3] = ((__uint128_t)(xi) * (s));
 	}
 	{
 		u64 xi = b[4];
-		tmp[4] = ((u128)(xi) * (s));
+		tmp[4] = ((__uint128_t)(xi) * (s));
 	}
 	fproduct_carry_wide_(tmp);
 	b4 = tmp[4];
 	b0 = tmp[0];
-	b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU))));
-	b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51))))))));
+	b4_ = ((b4) & (((__uint128_t)(0x7ffffffffffffLLU))));
+	b0_ = ((b0) + (((__uint128_t)(19) * (((u64)(((b4) >> (51))))))));
 	tmp[4] = b4_;
 	tmp[0] = b0_;
 	fproduct_copy_from_wide_(output, tmp);
--- a/lib/crypto/poly1305-donna64.c
+++ b/lib/crypto/poly1305-donna64.c
@@ -10,8 +10,6 @@
 #include
 #include
 
-typedef __uint128_t u128;
-
 void poly1305_core_setkey(struct poly1305_core_key *key,
 			  const u8 raw_key[POLY1305_BLOCK_SIZE])
 {
@@ -41,7 +39,7 @@ void poly1305_core_blocks(struct poly130
 	u64 s1, s2;
 	u64 h0, h1, h2;
 	u64 c;
-	u128 d0, d1, d2, d;
+	__uint128_t d0, d1, d2, d;
 
 	if (!nblocks)
 		return;
@@ -71,20 +69,20 @@ void poly1305_core_blocks(struct poly130
 		h2 += (((t1 >> 24)) & 0x3ffffffffffULL) | hibit64;
 
 		/* h *= r */
-		d0 = (u128)h0 * r0;
-		d = (u128)h1 * s2;
+		d0 = (__uint128_t)h0 * r0;
+		d = (__uint128_t)h1 * s2;
 		d0 += d;
-		d = (u128)h2 * s1;
+		d = (__uint128_t)h2 * s1;
 		d0 += d;
-		d1 = (u128)h0 * r1;
-		d = (u128)h1 * r0;
+		d1 = (__uint128_t)h0 * r1;
+		d = (__uint128_t)h1 * r0;
 		d1 += d;
-		d = (u128)h2 * s2;
+		d = (__uint128_t)h2 * s2;
 		d1 += d;
-		d2 = (u128)h0 * r2;
-		d = (u128)h1 * r1;
+		d2 = (__uint128_t)h0 * r2;
+		d = (__uint128_t)h1 * r1;
 		d2 += d;
-		d = (u128)h2 * r0;
+		d = (__uint128_t)h2 * r0;
 		d2 += d;
 
 		/* (partial) h %= p */

From patchwork Mon Dec 19 15:35:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34619
Message-ID: <20221219154118.955831880@infradead.org>
User-Agent: quilt/0.66
Date: Mon, 19 Dec 2022 16:35:27 +0100
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org,
 boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com,
 gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
 svens@linux.ibm.com, Herbert Xu, davem@davemloft.net, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, joro@8bytes.org,
 suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org,
 baolu.lu@linux.intel.com, Arnd Bergmann, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-s390@vger.kernel.org,
 linux-crypto@vger.kernel.org, iommu@lists.linux.dev,
 linux-arch@vger.kernel.org
Subject: [RFC][PATCH 02/12] crypto/ghash-clmulni: Use (struct) be128
References: <20221219153525.632521981@infradead.org>

Even though x86 is firmly little endian, use be128 because le128 is in
fact the wrong way around :/

The actual code is already using be128 in ghash_setkey() so this
shouldn't be more confusing.

This frees up the u128 name for a real u128 type.

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/crypto/ghash-clmulni-intel_asm.S  | 4 ++--
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -88,7 +88,7 @@ SYM_FUNC_START_LOCAL(__clmul_gf128mul_bl
 	RET
 SYM_FUNC_END(__clmul_gf128mul_ble)
 
-/* void clmul_ghash_mul(char *dst, const u128 *shash) */
+/* void clmul_ghash_mul(char *dst, const le128 *shash) */
 SYM_FUNC_START(clmul_ghash_mul)
 	FRAME_BEGIN
 	movups (%rdi), DATA
@@ -104,7 +104,7 @@ SYM_FUNC_END(clmul_ghash_mul)
 
 /*
  * void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
- *			   const u128 *shash);
+ *			   const le128 *shash);
  */
 SYM_FUNC_START(clmul_ghash_update)
 	FRAME_BEGIN
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -23,17 +23,17 @@
 #define GHASH_BLOCK_SIZE	16
 #define GHASH_DIGEST_SIZE	16
 
-void clmul_ghash_mul(char *dst, const u128 *shash);
+void clmul_ghash_mul(char *dst, const be128 *shash);
 
 void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
-			const u128 *shash);
+			const be128 *shash);
 
 struct ghash_async_ctx {
 	struct cryptd_ahash *cryptd_tfm;
 };
 
 struct ghash_ctx {
-	u128 shash;
+	be128 shash;
 };
 
 struct ghash_desc_ctx {

From patchwork Mon Dec 19 15:35:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34624
Message-ID: <20221219154119.022180444@infradead.org>
User-Agent: quilt/0.66
Date: Mon, 19 Dec 2022 16:35:28 +0100
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org,
 boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com,
 gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
 svens@linux.ibm.com, Herbert Xu, davem@davemloft.net, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, joro@8bytes.org,
 suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org,
 baolu.lu@linux.intel.com, Arnd Bergmann, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-s390@vger.kernel.org,
 linux-crypto@vger.kernel.org, iommu@lists.linux.dev,
 linux-arch@vger.kernel.org
Subject: [RFC][PATCH 03/12] cyrpto/b128ops: Remove struct u128
References: <20221219153525.632521981@infradead.org>

Per git-grep u128_xor() and its related struct u128 are unused except
to implement {be,le}128_xor(). Remove them to free up the namespace.

Signed-off-by: Peter Zijlstra (Intel)
---
 include/crypto/b128ops.h | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

--- a/include/crypto/b128ops.h
+++ b/include/crypto/b128ops.h
@@ -50,10 +50,6 @@
 #include
 
 typedef struct {
-	u64 a, b;
-} u128;
-
-typedef struct {
 	__be64 a, b;
 } be128;
 
@@ -61,20 +57,16 @@ typedef struct {
 	__le64 b, a;
 } le128;
 
-static inline void u128_xor(u128 *r, const u128 *p, const u128 *q)
+static inline void be128_xor(be128 *r, const be128 *p, const be128 *q)
 {
 	r->a = p->a ^ q->a;
 	r->b = p->b ^ q->b;
 }
 
-static inline void be128_xor(be128 *r, const be128 *p, const be128 *q)
-{
-	u128_xor((u128 *)r, (u128 *)p, (u128 *)q);
-}
-
 static inline void le128_xor(le128 *r, const le128 *p, const le128 *q)
 {
-	u128_xor((u128 *)r, (u128 *)p, (u128 *)q);
+	r->a = p->a ^ q->a;
+	r->b = p->b ^ q->b;
 }
 
 #endif /* _CRYPTO_B128OPS_H */

From patchwork Mon Dec 19 15:35:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34617
Message-ID: <20221219154119.087799661@infradead.org>
User-Agent: quilt/0.66
Date: Mon, 19 Dec 2022 16:35:29 +0100
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org,
 boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com,
 gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
 svens@linux.ibm.com, Herbert Xu, davem@davemloft.net, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, joro@8bytes.org,
 suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org,
 baolu.lu@linux.intel.com, Arnd Bergmann, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-s390@vger.kernel.org,
 linux-crypto@vger.kernel.org, iommu@lists.linux.dev,
 linux-arch@vger.kernel.org
Subject: [RFC][PATCH 04/12] types: Introduce [us]128
References: <20221219153525.632521981@infradead.org>
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752657863525570135?= X-GMAIL-MSGID: =?utf-8?q?1752657863525570135?= Introduce [us]128 (when available). Unlike [us]64, ensure they are always naturally aligned. This also enables 128bit wide atomics (which require natural alignment) such as cmpxchg128(). Signed-off-by: Peter Zijlstra (Intel) --- include/linux/types.h | 5 +++++ include/uapi/linux/types.h | 4 ++++ 2 files changed, 9 insertions(+) --- a/include/linux/types.h +++ b/include/linux/types.h @@ -10,6 +10,11 @@ #define DECLARE_BITMAP(name,bits) \ unsigned long name[BITS_TO_LONGS(bits)] +#ifdef __SIZEOF_INT128__ +typedef __s128 s128; +typedef __u128 u128; +#endif + typedef u32 __kernel_dev_t; typedef __kernel_fd_set fd_set; --- a/include/uapi/linux/types.h +++ b/include/uapi/linux/types.h @@ -13,6 +13,10 @@ #include +#ifdef __SIZEOF_INT128__ +typedef __signed__ __int128 __s128 __attribute__((aligned(16))); +typedef unsigned __int128 __u128 __attribute__((aligned(16))); +#endif /* * Below are truly Linux-specific types that should never collide with From patchwork Mon Dec 19 15:35:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 34626 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2464041wrn; Mon, 19 Dec 2022 07:48:49 -0800 (PST) X-Google-Smtp-Source: AA0mqf6SUyAKltyXlkkLZj97lGgVRP0HXQ1ICFv/qVxHYCGp8N+hnipdzcwHaeoF26NA4zjN4iQY X-Received: by 2002:a05:6a20:a001:b0:9d:efbf:48e3 with SMTP id p1-20020a056a20a00100b0009defbf48e3mr59354674pzj.39.1671464928803; Mon, 19 Dec 2022 07:48:48 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671464928; cv=none; d=google.com; s=arc-20160816; b=0y0I/4o1OznxVqeJWbUSUssEdV7RMBvh/sTE5eAXTLsp0KRR1oHfKQAXkH9IBCJxAY 
Pv24QqGe6Fzcq01Nqdsqo54UHMGVvguIur5p55tNXzkEvdk2ukhiIhzVkAOnlFivUvGw HRAxC9lhwLz75d2asu30XwiWBNOhP6SvqGor6g3UzTLYrId0842msf04quYnPUXrRY3r OGxZtXA8yW/mNDPhmdfoamU3KnvdRI7BEZ9hFmnSvIcRtXXlWrfMwsZAXZ+ZDlQHW7ai Yc7e6cbmgwfD9yhsvSHuNLB3XzTliqZFwsAlaNNvmfm4c8n5BINv18neRD7MSx5PO8lt jR9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=QJ923ExDj0XcuShmfE67Q3JbXhYYvsBZdme4aHvNZqo=; b=Ll/P6h+Tbq54GffuOsOZplgTlAa1bL5+3gTQPHQ5y5ICOyHupURK8DCd3MBj6wpcJY wJgwzxIz7hHKzokGAHed51anKP87GqGt7AVf2Ks6bkQ/c5eSenka9FVz2Go32XVoA8Ue 6TIWD4MTvbg34mQatQlMK0R7m9QEmhEVpOHl+XOjgZLdIFvvQ5GcaMW5w2tt3xYSB2GS eMLQq8YiPzVl38vZvoe55F4aSsN4De3gjjyftsNi5oN9yDAtIyCjyDRFw48OtExamFfp uF3PqMDzAlxTSXHcqhAAM2kgiTJauG+avdp7HiZVKeuErZZySRPmWmuG12f1TOe26mob wKNA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=L1IW8vYe; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id 145-20020a630597000000b0047751f6c725si11680376pgf.159.2022.12.19.07.48.36; Mon, 19 Dec 2022 07:48:48 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=L1IW8vYe; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232599AbiLSPqp (ORCPT + 99 others); Mon, 19 Dec 2022 10:46:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59672 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232249AbiLSPpd (ORCPT ); Mon, 19 Dec 2022 10:45:33 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6BABF12AC3; Mon, 19 Dec 2022 07:44:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=QJ923ExDj0XcuShmfE67Q3JbXhYYvsBZdme4aHvNZqo=; b=L1IW8vYeBrUO/INAKwHRRlhJVX i/vG9qTUNs+oa6cLvr2/Xu8n10441UAqstaULlYbvtYtvBB8LKp5je0sNowRdtYNbuOsLa4LKS3CY ehyeCPRWtvV4hGe0ZB3g6kR8E71kEvMsxxMtQrGDHXor+ViJ0P/iVOtCKziKJj7Si1SEXl6vlANmB yYWIS8cHYYC4BX1p6xQ5JmaJ/1KWfoI/zkCuJdzbYjW06InrNLxk+2aKKyhk5GG61r78NF3XKBgq1 pHVhdDp4lqC8zzJrPRgm38RR7St7UYIkuWJmN6/KKJLCp+ZckgxoC6jHXr9M2Ub80pLF+e7+CCCy0 TLS8Y1ow==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat 
Linux)) id 1p7IIJ-00CeDr-0Z; Mon, 19 Dec 2022 15:43:11 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0DCF930325C; Mon, 19 Dec 2022 16:43:10 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 7679B20B0F89C; Mon, 19 Dec 2022 16:43:06 +0100 (CET) Message-ID: <20221219154119.154045458@infradead.org> User-Agent: quilt/0.66 Date: Mon, 19 Dec 2022 16:35:30 +0100 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, Herbert Xu , davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org Subject: [RFC][PATCH 05/12] arch: Introduce arch_{,try_}_cmpxchg128{,_local}() References: <20221219153525.632521981@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: 
SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752658009340131475?= X-GMAIL-MSGID: =?utf-8?q?1752658009340131475?= For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a saner interface. Signed-off-by: Peter Zijlstra (Intel) --- arch/arm64/include/asm/atomic_ll_sc.h | 38 +++++++++++++++++++++++ arch/arm64/include/asm/atomic_lse.h | 33 +++++++++++++++++++- arch/arm64/include/asm/cmpxchg.h | 26 ++++++++++++++++ arch/s390/include/asm/cmpxchg.h | 33 ++++++++++++++++++++ arch/x86/include/asm/cmpxchg_32.h | 3 + arch/x86/include/asm/cmpxchg_64.h | 55 +++++++++++++++++++++++++++++++++- 6 files changed, 185 insertions(+), 3 deletions(-) --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -326,6 +326,44 @@ __CMPXCHG_DBL( , , , ) __CMPXCHG_DBL(_mb, dmb ish, l, "memory") #undef __CMPXCHG_DBL + +union __u128_halves { + u128 full; + struct { + u64 low, high; + }; +}; + +#define __CMPXCHG128(name, mb, rel, cl) \ +static __always_inline u128 \ +__ll_sc__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ +{ \ + union __u128_halves r, o = { .full = (old) }, \ + n = { .full = (new) }; \ + \ + asm volatile("// __cmpxchg128" #name "\n" \ + " prfm pstl1strm, %2\n" \ + "1: ldxp %0, %1, %2\n" \ + " eor %3, %0, %3\n" \ + " eor %4, %1, %4\n" \ + " orr %3, %4, %3\n" \ + " cbnz %3, 2f\n" \ + " st" #rel "xp %w3, %5, %6, %2\n" \ + " cbnz %w3, 1b\n" \ + " " #mb "\n" \ + "2:" \ + : "=&r" (r.low), "=&r" (r.high), "+Q" (*(unsigned long *)ptr) \ + : "r" (o.low), "r" (o.high), "r" (n.low), "r" (n.high) \ + : cl); \ + \ + return r.full; \ +} + +__CMPXCHG128( , , , ) +__CMPXCHG128(_mb, dmb ish, l, "memory") + +#undef __CMPXCHG128 + #undef K #endif /* __ASM_ATOMIC_LL_SC_H */ --- 
a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -151,7 +151,7 @@ __lse_atomic64_fetch_##op##name(s64 i, a " " #asm_op #mb " %[i], %[old], %[v]" \ : [v] "+Q" (v->counter), \ [old] "=r" (old) \ - : [i] "r" (i) \ + : [i] "r" (i) \ : cl); \ \ return old; \ @@ -324,4 +324,35 @@ __CMPXCHG_DBL(_mb, al, "memory") #undef __CMPXCHG_DBL +#define __CMPXCHG128(name, mb, cl...) \ +static __always_inline u128 \ +__lse__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ +{ \ + union __u128_halves r, o = { .full = (old) }, \ + n = { .full = (new) }; \ + register unsigned long x0 asm ("x0") = o.low; \ + register unsigned long x1 asm ("x1") = o.high; \ + register unsigned long x2 asm ("x2") = n.low; \ + register unsigned long x3 asm ("x3") = n.high; \ + register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ + \ + asm volatile( \ + __LSE_PREAMBLE \ + " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ + : [old1] "+&r" (x0), [old2] "+&r" (x1), \ + [v] "+Q" (*(unsigned long *)ptr) \ + : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \ + [oldval1] "r" (o.low), [oldval2] "r" (o.high) \ + : cl); \ + \ + r.low = x0; r.high = x1; \ + \ + return r.full; \ +} + +__CMPXCHG128( , ) +__CMPXCHG128(_mb, al, "memory") + +#undef __CMPXCHG128 + #endif /* __ASM_ATOMIC_LSE_H */ --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -147,6 +147,19 @@ __CMPXCHG_DBL(_mb) #undef __CMPXCHG_DBL +#define __CMPXCHG128(name) \ +static inline u128 __cmpxchg128##name(volatile u128 *ptr, \ + u128 old, u128 new) \ +{ \ + return __lse_ll_sc_body(_cmpxchg128##name, \ + ptr, old, new); \ +} + +__CMPXCHG128( ) +__CMPXCHG128(_mb) + +#undef __CMPXCHG128 + #define __CMPXCHG_GEN(sfx) \ static __always_inline unsigned long __cmpxchg##sfx(volatile void *ptr, \ unsigned long old, \ @@ -229,6 +242,19 @@ __CMPXCHG_GEN(_mb) __ret; \ }) +/* cmpxchg128 */ +#define system_has_cmpxchg128() 1 + +#define arch_cmpxchg128(ptr, o, n) \ +({ \ +
__cmpxchg128_mb((ptr), (o), (n)); \ +}) + +#define arch_cmpxchg128_local(ptr, o, n) \ +({ \ + __cmpxchg128((ptr), (o), (n)); \ +}) + #define __CMPWAIT_CASE(w, sfx, sz) \ static inline void __cmpwait_case_##sz(volatile void *ptr, \ unsigned long val) \ --- a/arch/s390/include/asm/cmpxchg.h +++ b/arch/s390/include/asm/cmpxchg.h @@ -201,4 +201,37 @@ static __always_inline int __cmpxchg_dou (unsigned long)(n1), (unsigned long)(n2)); \ }) +#define system_has_cmpxchg128() 1 + +static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) +{ + asm volatile( + " cdsg %[old],%[new],%[ptr]\n" + : [old] "+&d" (old) + : [new] "d" (new), + [ptr] "QS" (*(unsigned long *)ptr) + : "memory", "cc"); + return old; +} + +static __always_inline bool arch_try_cmpxchg128(volatile u128 *ptr, u128 *oldp, u128 new) +{ + u128 old = *oldp; + int cc; + + asm volatile( + " cdsg %[old],%[new],%[ptr]\n" + " ipm %[cc]\n" + " srl %[cc],28\n" + : [cc] "=&d" (cc), [old] "+&d" (old) + : [new] "d" (new), + [ptr] "QS" (*(unsigned long *)ptr) + : "memory", "cc"); + + if (unlikely(cc)) + *oldp = old; + + return likely(!cc); +} + #endif /* __ASM_CMPXCHG_H */ --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -103,6 +103,7 @@ static inline bool __try_cmpxchg64(volat #endif -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) +#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) +#define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) #endif /* _ASM_X86_CMPXCHG_32_H */ --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -20,6 +20,59 @@ arch_try_cmpxchg((ptr), (po), (n)); \ }) -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) +union __u128_halves { + u128 full; + struct { + u64 low, high; + }; +}; + +static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) +{ + union __u128_halves o = { .full = old, }, n = { .full = new, }; + + asm
volatile(LOCK_PREFIX "cmpxchg16b %[ptr]" + : [ptr] "+m" (*ptr), + "+a" (o.low), "+d" (o.high) + : "b" (n.low), "c" (n.high) + : "memory"); + + return o.full; +} + +static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old, u128 new) +{ + union __u128_halves o = { .full = old, }, n = { .full = new, }; + + asm volatile("cmpxchg16b %[ptr]" + : [ptr] "+m" (*ptr), + "+a" (o.low), "+d" (o.high) + : "b" (n.low), "c" (n.high) + : "memory"); + + return o.full; +} + +static __always_inline bool arch_try_cmpxchg128(volatile u128 *ptr, u128 *old, u128 new) +{ + union __u128_halves o = { .full = *old, }, n = { .full = new, }; + bool ret; + + asm volatile(LOCK_PREFIX "cmpxchg16b %[ptr]" + CC_SET(e) + : CC_OUT(e) (ret), + [ptr] "+m" (*ptr), + "+a" (o.low), "+d" (o.high) + : "b" (n.low), "c" (n.high) + : "memory"); + + if (unlikely(!ret)) + *old = o.full; + + return likely(ret); +} + +#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) +#define system_has_cmpxchg128() boot_cpu_has(X86_FEATURE_CX16) #endif /* _ASM_X86_CMPXCHG_64_H */ From patchwork Mon Dec 19 15:35:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 34620 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2463276wrn; Mon, 19 Dec 2022 07:47:14 -0800 (PST) X-Google-Smtp-Source: AA0mqf5uEpSyyOR34fwYUVKidlXh2qDtY3Rd2JpAxFMmPszbdzB2fimAsvUKF4s29mGXVMzaMzh0 X-Received: by 2002:a05:6a21:168d:b0:ad:79bb:7869 with SMTP id np13-20020a056a21168d00b000ad79bb7869mr39370478pzb.56.1671464833870; Mon, 19 Dec 2022 07:47:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671464833; cv=none; d=google.com; s=arc-20160816; b=rM71SDlzajx0cUmxJG50uwAxYtpVGY4Z3OZ+LprZeQNyHWaUb36CNAm/coQQ2Sb7Ph Wj2ii4rbFAuu6ssRZD97OOdZ1AYvLxXv6vgmavAXvUr2gbpNrKkZnPqdDmpq8rqj6v+q noACROYn0UlSJeXBWCc7Mm9c0oxLc4X90nGDx895jfvdrfazSbXIAriEdVOMTd44EeiK 
9B7R2YfxVdQAFLn0B6RBCX3XSaKzhU3EZLbEcyNyZTWDyChY4PuqWBhyBnqsJ+pPpOlE zuTLLHDEusCJ4ObndcTMIrK6SHcIkdQXGVtjU5q4k5/OSQ4RsgDf9eZ7+4z/r9u5YvAO hz9Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=VyG6+KQHCAeEAEZCgwKflnVjYNYIFWnZVnfAizA2/bE=; b=LXrTKFwLBvAZaWxiDDJnBSdAADB+EHQHBNKmyE4BvarPKvRYLc2Dit4par+K0RJbAM +tb8bticcc6CIfn4f4oSJidrFAzZK5ZBf9GGnUv/muaCDeIro3VIqW8I2PVLvNClz4JV S7dIfq9nHWd0KAOiwVp29xKSca1Xi1MjK+2FEUjBzjovSmLFUHr5AD5KNlZtyGdRlEEp H/qgmA3SS2gM11PiugYR7j50kd5nMBtYP8x6c5vKalrbDN9civGOzmIgF/f98woNCxdi gQpJKRmLpTyNilg2JbGxU9IZhxMr+MULFWPJcTIMDQYlbPv4nODs6tYjJt9GPuZZ+uzJ 3kmg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=NqSBQbzr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id 205-20020a6302d6000000b00476a08c5d9bsi11702997pgc.602.2022.12.19.07.47.00; Mon, 19 Dec 2022 07:47:13 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=NqSBQbzr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232480AbiLSPq0 (ORCPT + 99 others); Mon, 19 Dec 2022 10:46:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232418AbiLSPpc (ORCPT ); Mon, 19 Dec 2022 10:45:32 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E163213DD2; Mon, 19 Dec 2022 07:44:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=VyG6+KQHCAeEAEZCgwKflnVjYNYIFWnZVnfAizA2/bE=; b=NqSBQbzryBnwzMj3labGO28WSb quzx6A9NfDQvDGTMNaYKgoRnyRUPk/0ZV0HRzLNgd0QaUSHHdy9ExheW59YszMfkZx2eJzA0Jw7IE KJ8WTQlyNLQ7fegq1TKrzmKFARPbUxxzfEwlpKbXKnfFGgb5HQqA7OcVdOlW8i0cS1t52gCWWWqLq vjQv/pV3D7/lEoWP1dlDrTXGviFlBsD9SdchGgkPkK22C30RrLQ+KtHO5B/vDLjERdNHN/pQxx5bh 55vB5bLPYMlwd0IhuA3ktroYepp0Vs2J0cA+Eq1Qa5T2hrta8CvnaPvG0Rm050p9k9MdD6UqiVmqh /SrKYdrA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 
1p7IIT-000qwo-KD; Mon, 19 Dec 2022 15:43:21 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0DD1B30328D; Mon, 19 Dec 2022 16:43:10 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 7A18920B0F89D; Mon, 19 Dec 2022 16:43:06 +0100 (CET) Message-ID: <20221219154119.220928704@infradead.org> User-Agent: quilt/0.66 Date: Mon, 19 Dec 2022 16:35:31 +0100 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, Herbert Xu , davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org Subject: [RFC][PATCH 06/12] instrumentation: Wire up cmpxchg128() References: <20221219153525.632521981@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
(2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752657909431328409?= X-GMAIL-MSGID: =?utf-8?q?1752657909431328409?= Wire up the cmpxchg128 family in the atomic wrappery scripts. These provide the generic cmpxchg128 family of functions from the arch_ prefixed versions, adding explicit instrumentation where needed. Signed-off-by: Peter Zijlstra (Intel) --- include/linux/atomic/atomic-arch-fallback.h | 95 +++++++++++++++++++++++++++- include/linux/atomic/atomic-instrumented.h | 77 ++++++++++++++++++++++ scripts/atomic/gen-atomic-fallback.sh | 4 - scripts/atomic/gen-atomic-instrumented.sh | 4 - 4 files changed, 174 insertions(+), 6 deletions(-) --- a/include/linux/atomic/atomic-arch-fallback.h +++ b/include/linux/atomic/atomic-arch-fallback.h @@ -77,6 +77,29 @@ #endif /* arch_cmpxchg64_relaxed */ +#ifndef arch_cmpxchg128_relaxed +#define arch_cmpxchg128_acquire arch_cmpxchg128 +#define arch_cmpxchg128_release arch_cmpxchg128 +#define arch_cmpxchg128_relaxed arch_cmpxchg128 +#else /* arch_cmpxchg128_relaxed */ + +#ifndef arch_cmpxchg128_acquire +#define arch_cmpxchg128_acquire(...) \ + __atomic_op_acquire(arch_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_cmpxchg128_release +#define arch_cmpxchg128_release(...) \ + __atomic_op_release(arch_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_cmpxchg128 +#define arch_cmpxchg128(...)
\ + __atomic_op_fence(arch_cmpxchg128, __VA_ARGS__) +#endif + +#endif /* arch_cmpxchg128_relaxed */ + #ifndef arch_try_cmpxchg_relaxed #ifdef arch_try_cmpxchg #define arch_try_cmpxchg_acquire arch_try_cmpxchg @@ -217,6 +240,76 @@ #endif /* arch_try_cmpxchg64_relaxed */ +#ifndef arch_try_cmpxchg128_relaxed +#ifdef arch_try_cmpxchg128 +#define arch_try_cmpxchg128_acquire arch_try_cmpxchg128 +#define arch_try_cmpxchg128_release arch_try_cmpxchg128 +#define arch_try_cmpxchg128_relaxed arch_try_cmpxchg128 +#endif /* arch_try_cmpxchg128 */ + +#ifndef arch_try_cmpxchg128 +#define arch_try_cmpxchg128(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128 */ + +#ifndef arch_try_cmpxchg128_acquire +#define arch_try_cmpxchg128_acquire(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_acquire((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_acquire */ + +#ifndef arch_try_cmpxchg128_release +#define arch_try_cmpxchg128_release(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_release((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_release */ + +#ifndef arch_try_cmpxchg128_relaxed +#define arch_try_cmpxchg128_relaxed(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_relaxed((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_relaxed */ + +#else /* arch_try_cmpxchg128_relaxed */ + +#ifndef arch_try_cmpxchg128_acquire +#define 
arch_try_cmpxchg128_acquire(...) \ + __atomic_op_acquire(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_try_cmpxchg128_release +#define arch_try_cmpxchg128_release(...) \ + __atomic_op_release(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_try_cmpxchg128 +#define arch_try_cmpxchg128(...) \ + __atomic_op_fence(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#endif /* arch_try_cmpxchg128_relaxed */ + #ifndef arch_atomic_read_acquire static __always_inline int arch_atomic_read_acquire(const atomic_t *v) @@ -2456,4 +2549,4 @@ arch_atomic64_dec_if_positive(atomic64_t #endif #endif /* _LINUX_ATOMIC_FALLBACK_H */ -// b5e87bdd5ede61470c29f7a7e4de781af3770f09 +// 46357a526de89c762d30fb238f35a7d5950a670b --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -1968,6 +1968,36 @@ atomic_long_dec_if_positive(atomic_long_ arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \ }) +#define cmpxchg128(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + kcsan_mb(); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_acquire(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_acquire(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_release(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + kcsan_release(); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_release(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_relaxed(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_relaxed(__ai_ptr, __VA_ARGS__); \ +}) + #define try_cmpxchg(ptr, oldp, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2044,6 +2074,44 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg64_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) +#define try_cmpxchg128(ptr, oldp, ...) 
\ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + kcsan_mb(); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_acquire(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_release(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + kcsan_release(); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_relaxed(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + #define cmpxchg_local(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2058,6 +2126,13 @@ atomic_long_dec_if_positive(atomic_long_ arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \ }) +#define cmpxchg128_local(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_local(__ai_ptr, __VA_ARGS__); \ +}) + #define sync_cmpxchg(ptr, ...) 
\ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2083,4 +2158,4 @@ atomic_long_dec_if_positive(atomic_long_ }) #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */ -// 764f741eb77a7ad565dc8d99ce2837d5542e8aee +// 27320c1ec2bf2878ecb9df3ea4816a7bc0c57a52 --- a/scripts/atomic/gen-atomic-fallback.sh +++ b/scripts/atomic/gen-atomic-fallback.sh @@ -217,11 +217,11 @@ cat << EOF EOF -for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64"; do +for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64" "arch_cmpxchg128"; do gen_xchg_fallbacks "${xchg}" done -for cmpxchg in "cmpxchg" "cmpxchg64"; do +for cmpxchg in "cmpxchg" "cmpxchg64" "cmpxchg128"; do gen_try_cmpxchg_fallbacks "${cmpxchg}" done --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -166,14 +166,14 @@ grep '^[a-z]' "$1" | while read name met done -for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do +for xchg in "xchg" "cmpxchg" "cmpxchg64" "cmpxchg128" "try_cmpxchg" "try_cmpxchg64" "try_cmpxchg128"; do for order in "" "_acquire" "_release" "_relaxed"; do gen_xchg "${xchg}" "${order}" "" printf "\n" done done -for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg"; do +for xchg in "cmpxchg_local" "cmpxchg64_local" "cmpxchg128_local" "sync_cmpxchg"; do gen_xchg "${xchg}" "" "" printf "\n" done From patchwork Mon Dec 19 15:35:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 34628 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2467616wrn; Mon, 19 Dec 2022 07:56:51 -0800 (PST) X-Google-Smtp-Source: AA0mqf57dWxUWx4RnSn6NJsuydm3q/ECEZrQDOgMhU8ryMY4jqHOq0lzh3CAWt9/J2yrMlW4RtQO X-Received: by 2002:a17:90a:6c21:b0:223:1e7d:67e8 with SMTP id x30-20020a17090a6c2100b002231e7d67e8mr26041902pjj.16.1671465410767; Mon, 19 Dec 2022 07:56:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671465410; cv=none; 
d=google.com; s=arc-20160816; b=xsACiS4uAkQp9gl/cqTX7a2DZIAVyKRW6umVEVy7kiI8DQ37rZ4uO/7waLx5fzCshx 1M6NjizDh/F5tSK/N7zodk6Jbv0R4takzLYWgG6D+iYWYScJPLjGmcLacV4U7oz2OUwF uAfvQ/9uWlz5gusF6BmBACx8B751OvDMX6rTU/CfOOKsC6poLYZmpOjyrEEFyvOskrbA wphLdiygnZvkBgJnannTnAoWSMxY2WpS1nCgMYY7RL3n/WRYI24OJp4KkxdOMXlStDax eigXWYHAgAhtPyedeLSEjNgQ1hcWhmMxHCGj1xXCLOjPeax0vIZ0rL4ctkSWiCYelGrN cRvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=EGvEEhj67bxGYwjZSrZRnMP7nFcuILVLo/TJsg7wp70=; b=gnpLMmE/RP5f3jyCXO09+rtQC9ccKJ0V2QsMxWGBNkRQHBeGlouHtkzz2C9i0jHg64 HbS7X5jO6QjnunqCeOH4/aAwQ/ACl4QU6QqBpL2O65HwlZ+v50QZ7pfyqzGAjL2v+BHZ t6oEMxYe0xqnmiGSJc6C05Izaan8Pw87436LLEUtSQpNg6CFdX56ELY7Z+ILEmr8T8gK JupO34feUDrU7N5vgfFGklUr0ysM5zvKPsgO4mEkwksC39VkHj3D8yMa9KNjImtItkLE b6WeyPJZNw3OUK6ebbVImIV+QkeKDLE3MZfPzoqGNwTwOLawYGJricYDtrTm+GIcQq2Z HP8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=XTpDGZcO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id ga11-20020a17090b038b00b00219a2721d64si10301053pjb.72.2022.12.19.07.56.38; Mon, 19 Dec 2022 07:56:50 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=XTpDGZcO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232809AbiLSPrH (ORCPT + 99 others); Mon, 19 Dec 2022 10:47:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59050 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232482AbiLSPpd (ORCPT ); Mon, 19 Dec 2022 10:45:33 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74B8413CF8; Mon, 19 Dec 2022 07:44:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=EGvEEhj67bxGYwjZSrZRnMP7nFcuILVLo/TJsg7wp70=; b=XTpDGZcOneyi2jU6pOMWNGxSrG Tdqw3sx7AG1vT1ne1fXNaPNp/c69+TrAS0VOUR9s5qT2xIXABRaivHmARLOPbnJq0jzG+7GWUEiE4 eO1FkjNo5cUP4B8MZTqFLdIr08eD9muQ0B3AXs0KOvBV3UbV3RCW3i/O7OXSbm86entb2bwh0I9fN D5sLZ8446vFcqxVt8ybKuHmZk9Lw7DXjhJh5Qg3IyC7Yy85C812g45O996HpRXURZMlMLfklLD21p LvziG6s731SVNOgX+uqOircga46FxYtSbIOBEecUmbyboFFKGW/xwGhoUzyXfAUnoDpZ3INrhuGol Y+kXACug==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat 
Linux)) id 1p7III-00CeDq-39; Mon, 19 Dec 2022 15:43:11 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0EB5F303382; Mon, 19 Dec 2022 16:43:10 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 8676D20B0F899; Mon, 19 Dec 2022 16:43:06 +0100 (CET) Message-ID: <20221219154119.286760562@infradead.org> User-Agent: quilt/0.66 Date: Mon, 19 Dec 2022 16:35:32 +0100 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, Herbert Xu , davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org Subject: [RFC][PATCH 07/12] percpu: Wire up cmpxchg128 References: <20221219153525.632521981@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
        (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1752658514855370063?=
X-GMAIL-MSGID: =?utf-8?q?1752658514855370063?=

In order to replace cmpxchg_double() with the newly minted
cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg().

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/arm64/include/asm/percpu.h | 24 ++++++++++++++++++
 arch/s390/include/asm/percpu.h  | 20 +++++++++++++++
 arch/x86/include/asm/percpu.h   | 52 ++++++++++++++++++++++++++++++++++++++++
 include/asm-generic/percpu.h    |  8 ++++++
 include/linux/percpu-defs.h     | 20 +++++++++++++--
 5 files changed, 122 insertions(+), 2 deletions(-)

--- a/arch/arm64/include/asm/percpu.h
+++ b/arch/arm64/include/asm/percpu.h
@@ -140,6 +140,10 @@ PERCPU_RET_OP(add, add, ldadd)
  * re-enabling preemption for preemptible kernels, but doing that in a way
  * which builds inside a module would mean messing directly with the preempt
  * count. If you do this, peterz and tglx will hunt you down.
+ *
+ * Not to mention it'll break the actual preemption model for missing a
+ * preemption point when TIF_NEED_RESCHED gets set while preemption is
+ * disabled.
  */
 #define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \
 ({ \
@@ -240,6 +244,26 @@ PERCPU_RET_OP(add, add, ldadd)
 #define this_cpu_cmpxchg_8(pcp, o, n) \
 	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)

+#define __pcpu_cast_128(_exp, _val) \
+	_Generic((_exp), \
+	u128: (_val), \
+	s128: (_val), \
+	default: (unsigned long)(_val))
+
+#define this_cpu_cmpxchg_16(pcp, o, n) \
+({ \
+	u128 old__ = __pcpu_cast_128((o), (o)); \
+	u128 new__ = __pcpu_cast_128((n), (n)); \
+	typedef typeof(pcp) pcp_op_T__; \
+	pcp_op_T__ *ptr__; \
+	u128 ret__; \
+	preempt_disable_notrace(); \
+	ptr__ = raw_cpu_ptr(&(pcp)); \
+	ret__ = cmpxchg128_local((void *)ptr__, old__, new__); \
+	preempt_enable_notrace(); \
+	(typeof(pcp))__pcpu_cast_128(*ptr__, ret__); \
+})
+
 #ifdef __KVM_NVHE_HYPERVISOR__
 extern unsigned long __hyp_per_cpu_offset(unsigned int cpu);
 #define __per_cpu_offset
--- a/arch/s390/include/asm/percpu.h
+++ b/arch/s390/include/asm/percpu.h
@@ -148,6 +148,26 @@
 #define this_cpu_cmpxchg_4(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, oval, nval)
 #define this_cpu_cmpxchg_8(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, oval, nval)

+#define __pcpu_cast_128(_exp, _val) \
+	_Generic((_exp), \
+	u128: (_val), \
+	s128: (_val), \
+	default: (unsigned long)(_val))
+
+#define this_cpu_cmpxchg_16(pcp, oval, nval) \
+({ \
+	u128 old__ = __pcpu_cast_128((oval), (oval)); \
+	u128 new__ = __pcpu_cast_128((nval), (nval)); \
+	typedef typeof(pcp) pcp_op_T__; \
+	pcp_op_T__ *ptr__; \
+	u128 ret__; \
+	preempt_disable_notrace(); \
+	ptr__ = raw_cpu_ptr(&(pcp)); \
+	ret__ = cmpxchg128((void *)ptr__, old__, new__); \
+	preempt_enable_notrace(); \
+	(typeof(pcp))__pcpu_cast_128(*ptr__, ret__); \
+})
+
 #define arch_this_cpu_xchg(pcp, nval) \
 ({ \
 	typeof(pcp) *ptr__; \
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -210,6 +210,58 @@ do { \
 	(typeof(_var))(unsigned long) pco_old__; \
 })

+#if defined(CONFIG_X86_32) && defined(CONFIG_X86_CMPXCHG64)
+#define __pcpu_cast_64(_exp, _val) \
+	_Generic((_exp), \
+	u64: (_val), \
+	s64: (_val), \
+	default: (unsigned long)(_val))
+
+#define percpu_cmpxchg64_op(size, qual, _var, _oval, _nval) \
+({ \
+	__pcpu_type_##size pco_old__ = __pcpu_cast_64((_oval), (_oval));\
+	__pcpu_type_##size pco_new__ = __pcpu_cast_64((_nval), (_nval));\
+	asm qual ("cmpxchg8b " __percpu_arg([var]) \
+		  : [var] "+m" (_var), \
+		    "+A" (pco_old__) \
+		  : "b" ((u32)pco_new__), "c" ((u32)(pco_new__ >> 32)) \
+		  : "memory"); \
+	(typeof(_var))__pcpu_cast_64(_var, pco_old__); \
+})
+
+#define raw_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg64_op(8, , pcp, oval, nval)
+#define this_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg64_op(8, volatile, pcp, oval, nval)
+#endif
+
+#ifdef CONFIG_X86_64
+#define __pcpu_cast_128(_exp, _val) \
+	_Generic((_exp), \
+	u128: (_val), \
+	s128: (_val), \
+	default: (unsigned long)(_val))
+
+#define percpu_cmpxchg128_op(size, qual, _var, _oval, _nval) \
+({ \
+	union __u128_halves pco_old__ = { \
+		.full = __pcpu_cast_128((_oval), (_oval)) \
+	}; \
+	union __u128_halves pco_new__ = { \
+		.full = __pcpu_cast_128((_nval), (_nval)) \
+	}; \
+	asm qual ("cmpxchg16b " __percpu_arg([var]) \
+		  : [var] "+m" (_var), \
+		    "+a" (pco_old__.low), \
+		    "+d" (pco_old__.high) \
+		  : "b" (pco_new__.low), \
+		    "c" (pco_new__.high) \
+		  : "memory"); \
+	(typeof(_var))__pcpu_cast_128(_var, pco_old__.full); \
+})
+
+#define raw_cpu_cmpxchg_16(pcp, oval, nval) percpu_cmpxchg128_op(16, , pcp, oval, nval)
+#define this_cpu_cmpxchg_16(pcp, oval, nval) percpu_cmpxchg128_op(16, volatile, pcp, oval, nval)
+#endif
+
 /*
  * this_cpu_read() makes gcc load the percpu variable every time it is
  * accessed while this_cpu_read_stable() allows the value to be cached.
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -298,6 +298,10 @@ do { \
 #define raw_cpu_cmpxchg_8(pcp, oval, nval) \
 	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 #endif
+#ifndef raw_cpu_cmpxchg_16
+#define raw_cpu_cmpxchg_16(pcp, oval, nval) \
+	raw_cpu_generic_cmpxchg(pcp, oval, nval)
+#endif

 #ifndef raw_cpu_cmpxchg_double_1
 #define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \
@@ -423,6 +427,10 @@ do { \
 #define this_cpu_cmpxchg_8(pcp, oval, nval) \
 	this_cpu_generic_cmpxchg(pcp, oval, nval)
 #endif
+#ifndef this_cpu_cmpxchg_16
+#define this_cpu_cmpxchg_16(pcp, oval, nval) \
+	this_cpu_generic_cmpxchg(pcp, oval, nval)
+#endif

 #ifndef this_cpu_cmpxchg_double_1
 #define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -343,6 +343,22 @@ static inline void __this_cpu_preempt_ch
 	pscr2_ret__; \
 })

+#define __pcpu_size16_call_return2(stem, variable, ...) \
+({ \
+	typeof(variable) pscr2_ret__; \
+	__verify_pcpu_ptr(&(variable)); \
+	switch(sizeof(variable)) { \
+	case 1: pscr2_ret__ = stem##1(variable, __VA_ARGS__); break; \
+	case 2: pscr2_ret__ = stem##2(variable, __VA_ARGS__); break; \
+	case 4: pscr2_ret__ = stem##4(variable, __VA_ARGS__); break; \
+	case 8: pscr2_ret__ = stem##8(variable, __VA_ARGS__); break; \
+	case 16: pscr2_ret__ = stem##16(variable, __VA_ARGS__); break; \
+	default: \
+		__bad_size_call_parameter(); break; \
+	} \
+	pscr2_ret__; \
+})
+
 /*
  * Special handling for cmpxchg_double. cmpxchg_double is passed two
  * percpu variables. The first has to be aligned to a double word
@@ -425,7 +441,7 @@ do { \
 #define raw_cpu_add_return(pcp, val) __pcpu_size_call_return2(raw_cpu_add_return_, pcp, val)
 #define raw_cpu_xchg(pcp, nval) __pcpu_size_call_return2(raw_cpu_xchg_, pcp, nval)
 #define raw_cpu_cmpxchg(pcp, oval, nval) \
-	__pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
+	__pcpu_size16_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
 #define raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \
 	__pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2)

@@ -512,7 +528,7 @@ do { \
 #define this_cpu_add_return(pcp, val) __pcpu_size_call_return2(this_cpu_add_return_, pcp, val)
 #define this_cpu_xchg(pcp, nval) __pcpu_size_call_return2(this_cpu_xchg_, pcp, nval)
 #define this_cpu_cmpxchg(pcp, oval, nval) \
-	__pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
+	__pcpu_size16_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
 #define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \
 	__pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2)

From patchwork Mon Dec 19 15:35:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34618
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2463131wrn;
        Mon, 19 Dec 2022 07:46:56 -0800 (PST)
X-Google-Smtp-Source: AA0mqf70QdxipDnGCvUPjBbKzzBq66qjSzMdaPc2ZtCB7VAV1T7+7upgxFK3WhxoRUf89be+Gl4o
X-Received: by 2002:a17:90a:13cf:b0:219:3184:2bf with SMTP id s15-20020a17090a13cf00b00219318402bfmr44354530pjf.45.1671464815687;
        Mon, 19 Dec 2022 07:46:55 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1671464815; cv=none;
        d=google.com; s=arc-20160816;
        b=BMU+8XoXkXkorUanaFQXa8rQG/DHHurispBxiIyJANZnEkHieZd3WceMLfK7k2KI3n
         2hJ9aRo7dqoxCbLgaGW+gcCzgMSb0F/S5ne2tEFMr6oXNvrW0gqlxlqAQULzmQUTnksR
Gdo7vbNTDyEW3KclTw30KZfz4YazDshMZppwCpqLOGhBsTAdnkau4rt4o+buaukQUvYI GYxPoxoiIP81GAkewqqncIZJfuknTou+fnrndN/MIRrMTsq28fRUapfD78LiFpUC9Tj4 g1CcJg10jfKE3qCTDQYoZ7kdMP0IIB6ByAv7pz1h+1+xV8EiuhWhD3/jqJKE7pzvR8HL LJWg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=gkcjStIUQsvZMzBnRMKIS/S8C1S2HdQdTZCEH219uhc=; b=wusZ6P3B7EkTjc0rGVybLv+vh67o1rs5em5HHFJKflbxxS/AHw8exUm6n09g3xzAp0 KPS9JbZ7rzzqmF2s6/xxa9gRPRK0+pI6b1hDYTKXxiFzF1ZQhto9JQryV9e69ho4Znpz +0/5Rtjce1IxO8vVwAeCf0y2snwMMWzDlWTwSfM1Vl35NSSfWhD47pIFnvoGy5F2GdBt n1g/gi4n5Nl4/KT60dWif7hI4pVMTpyOB+wo8OFRsA6IdjEyDGcxvF99fTQwGvBKYGey rgSNrIZEldqTXL6aryqdUAYmcrAGmZ4DaFHfgJTOm1fOq9UVtPIrONU4DTTPyUtsNeEw lEdw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=H4AjUmHw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id 71-20020a63054a000000b004791b527636si10953669pgf.155.2022.12.19.07.46.42; Mon, 19 Dec 2022 07:46:55 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=H4AjUmHw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232270AbiLSPqR (ORCPT + 99 others); Mon, 19 Dec 2022 10:46:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59576 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232638AbiLSPpa (ORCPT ); Mon, 19 Dec 2022 10:45:30 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54E5D13F2B; Mon, 19 Dec 2022 07:44:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=gkcjStIUQsvZMzBnRMKIS/S8C1S2HdQdTZCEH219uhc=; b=H4AjUmHwWCRzxq87XM6dpIVmhp a6K9WMWqe+zvKBxZT3MnWCIIOoYI+xLcYdnMcJtUyBQ537agojcGqjlVwUHqfZ8U+DwCSlb/u9KU1 a9t3thKS+ZDlG1xqzGNzzZa48TCVaH1NcBD5Az/De/G46fj5MSt/nmPBonJHKh8hNu9qACt/gtfvp STxqJIgeYtBhW19qvGf/EppiN+gRv9z++90g+9r9t5rp2CxhjZeNcDiUYzRO10sDir1IBJqPb3ok/ xDzrfmrgdazGWSgN0dy03Y8gM38cdg75QYDKcSDFiiZOXRZBaXpEIx9k4lY2E1hX7XWVYP22eha5G 9ZUUr+Fg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat 
Linux)) id 1p7III-00CeDp-35; Mon, 19 Dec 2022 15:43:11 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0DD4F30330E; Mon, 19 Dec 2022 16:43:10 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 8F28420B0F89B; Mon, 19 Dec 2022 16:43:06 +0100 (CET) Message-ID: <20221219154119.352918965@infradead.org> User-Agent: quilt/0.66 Date: Mon, 19 Dec 2022 16:35:33 +0100 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, Herbert Xu , davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org Subject: [RFC][PATCH 08/12] s390: Replace cmpxchg_double() with cmpxchg128() References: <20221219153525.632521981@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: 
        SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1752657890994722470?=
X-GMAIL-MSGID: =?utf-8?q?1752657890994722470?=

In order to deprecate cmpxchg_double(), replace all its usage with
cmpxchg128().

Signed-off-by: Peter Zijlstra (Intel)
Reported-by: Alexander Gordeev
Acked-by: Alexander Gordeev
Acked-by: Hendrik Brueckner
Signed-off-by: Heiko Carstens
---
 arch/s390/include/asm/cpu_mf.h  | 29 ++++++++++++-----
 arch/s390/kernel/perf_cpum_sf.c | 65 +++++++++++++++++++++++++---------------
 2 files changed, 63 insertions(+), 31 deletions(-)

--- a/arch/s390/include/asm/cpu_mf.h
+++ b/arch/s390/include/asm/cpu_mf.h
@@ -131,19 +131,32 @@ struct hws_combined_entry {
 	struct hws_diag_entry diag;	/* Diagnostic-sampling data entry */
 } __packed;

+union hws_flags_and_overflow {
+	struct {
+		unsigned long flags;
+		unsigned long overflow;
+	};
+	u128 full;
+};
+
 struct hws_trailer_entry {
 	union {
 		struct {
-			unsigned int f:1;	/* 0 - Block Full Indicator */
-			unsigned int a:1;	/* 1 - Alert request control */
-			unsigned int t:1;	/* 2 - Timestamp format */
-			unsigned int :29;	/* 3 - 31: Reserved */
-			unsigned int bsdes:16;	/* 32-47: size of basic SDE */
-			unsigned int dsdes:16;	/* 48-63: size of diagnostic SDE */
+			union {
+				struct {
+					unsigned int f:1;	/* 0 - Block Full Indicator */
+					unsigned int a:1;	/* 1 - Alert request control */
+					unsigned int t:1;	/* 2 - Timestamp format */
+					unsigned int :29;	/* 3 - 31: Reserved */
+					unsigned int bsdes:16;	/* 32-47: size of basic SDE */
+					unsigned int dsdes:16;	/* 48-63: size of diagnostic SDE */
+				};
+				unsigned long long flags;	/* 0 - 63: All indicators */
+			};
+			unsigned long long overflow;	/* 64 - sample Overflow count */
 		};
-		unsigned long long flags;	/* 0 - 63: All indicators */
+		union hws_flags_and_overflow flags_and_overflow;
 	};
-	unsigned long long overflow;	/* 64 - sample Overflow count */
 	unsigned char timestamp[16];	/* 16 - 31 timestamp */
 	unsigned long long reserved1;	/* 32 -Reserved */
 	unsigned long long reserved2;	/* */
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1227,6 +1227,8 @@ static void hw_collect_samples(struct pe
 	}
 }

+typedef union hws_flags_and_overflow fao_t;
+
 /* hw_perf_event_update() - Process sampling buffer
  * @event:	The perf event
  * @flush_all:	Flag to also flush partially filled sample-data-blocks
@@ -1243,10 +1245,11 @@ static void hw_collect_samples(struct pe
  */
 static void hw_perf_event_update(struct perf_event *event, int flush_all)
 {
+	unsigned long long event_overflow, sampl_overflow, num_sdb;
 	struct hw_perf_event *hwc = &event->hw;
 	struct hws_trailer_entry *te;
+	fao_t old_fao, new_fao;
 	unsigned long *sdbt;
-	unsigned long long event_overflow, sampl_overflow, num_sdb, te_flags;
 	int done;

 	/*
@@ -1294,12 +1297,16 @@ static void hw_perf_event_update(struct
 		num_sdb++;

 		/* Reset trailer (using compare-double-and-swap) */
+		old_fao = te->flags_and_overflow;
 		do {
-			te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK;
-			te_flags |= SDB_TE_ALERT_REQ_MASK;
-		} while (!cmpxchg_double(&te->flags, &te->overflow,
-					 te->flags, te->overflow,
-					 te_flags, 0ULL));
+			new_fao = (fao_t){
+				.flags = old_fao.flags,
+				.overflow = 0,
+			};
+			new_fao.flags &= ~SDB_TE_BUFFER_FULL_MASK;
+			new_fao.flags |= SDB_TE_ALERT_REQ_MASK;
+		} while (!try_cmpxchg128(&te->flags_and_overflow.full,
+					 &old_fao.full, new_fao.full));

 		/* Advance to next sample-data-block */
 		sdbt++;
@@ -1475,14 +1482,19 @@ static int aux_output_begin(struct perf_
 static bool aux_set_alert(struct aux_buffer *aux, unsigned long alert_index,
 			  unsigned long long *overflow)
 {
-	unsigned long long orig_overflow, orig_flags, new_flags;
 	struct hws_trailer_entry *te;
+	fao_t old_fao, new_fao;

 	te = aux_sdb_trailer(aux, alert_index);
+
+	old_fao = te->flags_and_overflow;
 	do {
-		orig_flags = te->flags;
-		*overflow = orig_overflow = te->overflow;
-		if (orig_flags & SDB_TE_BUFFER_FULL_MASK) {
+		new_fao = (fao_t){
+			.flags = old_fao.flags,
+			.overflow = 0,
+		};
+		*overflow = old_fao.overflow;
+		if (new_fao.flags & SDB_TE_BUFFER_FULL_MASK) {
 			/*
 			 * SDB is already set by hardware.
 			 * Abort and try to set somewhere
@@ -1490,10 +1502,11 @@ static bool aux_set_alert(struct aux_buf
 			 */
 			return false;
 		}
-		new_flags = orig_flags | SDB_TE_ALERT_REQ_MASK;
-	} while (!cmpxchg_double(&te->flags, &te->overflow,
-				 orig_flags, orig_overflow,
-				 new_flags, 0ULL));
+		new_fao.flags |= SDB_TE_ALERT_REQ_MASK;
+
+	} while (!try_cmpxchg128(&te->flags_and_overflow.full,
+				 &old_fao.full, new_fao.full));
+
 	return true;
 }

@@ -1522,9 +1535,10 @@ static bool aux_set_alert(struct aux_buf
 static bool aux_reset_buffer(struct aux_buffer *aux, unsigned long range,
 			     unsigned long long *overflow)
 {
-	unsigned long long orig_overflow, orig_flags, new_flags;
 	unsigned long i, range_scan, idx, idx_old;
+	unsigned long long orig_overflow;
 	struct hws_trailer_entry *te;
+	fao_t old_fao, new_fao;

 	debug_sprintf_event(sfdbg, 6, "%s: range %ld head %ld alert %ld "
 			    "empty %ld\n", __func__, range, aux->head,
@@ -1554,17 +1568,22 @@ static bool aux_reset_buffer(struct aux_
 	idx_old = idx = aux->empty_mark + 1;
 	for (i = 0; i < range_scan; i++, idx++) {
 		te = aux_sdb_trailer(aux, idx);
+
+		old_fao = te->flags_and_overflow;
 		do {
-			orig_flags = te->flags;
-			orig_overflow = te->overflow;
-			new_flags = orig_flags & ~SDB_TE_BUFFER_FULL_MASK;
+			new_fao = (fao_t){
+				.flags = old_fao.flags,
+				.overflow = 0,
+			};
+			orig_overflow = old_fao.overflow;
+			new_fao.flags &= ~SDB_TE_BUFFER_FULL_MASK;
 			if (idx == aux->alert_mark)
-				new_flags |= SDB_TE_ALERT_REQ_MASK;
+				new_fao.flags |= SDB_TE_ALERT_REQ_MASK;
 			else
-				new_flags &= ~SDB_TE_ALERT_REQ_MASK;
-		} while (!cmpxchg_double(&te->flags, &te->overflow,
-					 orig_flags, orig_overflow,
-					 new_flags, 0ULL));
+				new_fao.flags &= ~SDB_TE_ALERT_REQ_MASK;
+		} while (!try_cmpxchg128(&te->flags_and_overflow.full,
+					 &old_fao.full, new_fao.full));
+
 		*overflow += orig_overflow;
 	}

From patchwork Mon Dec 19 15:35:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34622
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2463404wrn;
        Mon, 19 Dec 2022 07:47:33 -0800 (PST)
X-Google-Smtp-Source: AA0mqf7PYOoE4Mqq/2jU9mOYe1VzH16I1zBWQtyT73gkzmI2Yxi3pEB/ihE61WbMyHCcG9nu+Xxz
X-Received: by 2002:a05:6a00:a07:b0:573:3de7:89a with SMTP id p7-20020a056a000a0700b005733de7089amr63553747pfh.4.1671464852868;
        Mon, 19 Dec 2022 07:47:32 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1671464852; cv=none;
        d=google.com; s=arc-20160816;
        b=gjXaALUo/rzjAIXnMjGm69jnn0RW0f1UAULa/CDknohf1UowsJmBQTkGMYxk15tQeJ
         9VwxgBK4unbBEyWlHfu3PYaR6/tz96pP3YaBA5jt4I91pMGYKlqKhiNs7Y1tUSxzieXr
         +WBf4n7Zh/ilTCLCGYH2ABB3sXnUwK7Kj1CGWRDjYnOdeq/M3SSb+nbWbp7H+X+LcORs
         A42NDLQq3zDGBHOC7QvZPb1PbFIofMkaEC5mdxAF447haMv0kfZ+jym6ywbPrC5QmdfX
         5i7uUI/LoggRDuWkvkQLyyBAPfvjIHrGn8ldTw5pR5rW29FcjbG3elHqvICEJqukBVtI
         SUpg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
        h=list-id:precedence:mime-version:references:subject:cc:to:from:date
         :user-agent:message-id:dkim-signature;
        bh=TPf252rpqr9tSwRpNAavICICeFc5ydD0TdU6p9uBQtA=;
        b=1ETlBVaxR1YENiDCzCqFUNUstCzjoF+Izd+LPO4VIlGaoOML7hGazgail0voi+Lru7
         Sgs7eEjMxNB5Gr92jfTJXekATDbWVdRHUFE6qn1mUQpgXSY48FW3s1SsYIrXk357FmCD
         yuDw9Hyqq0Bsac3ci3iaGDccw+e+Xsy0XTHOQTJPTM9jMxnme3I/kcI7XbqsOI9g0g0M
         eKJMZv9fHkrQZyzPAu5GtGzlVxfSth3chgZYKoCpEuqMlYJWs8Ev/nPtGcvVtzjilO6L
         7YcWP9HxlOSJgsQJXHD7Bjf1BR8fQDqIEv7pifUFY3hFvODfHxho3ZeuNq5Ojd7q1vFM
         G2Fw==
ARC-Authentication-Results: i=1; mx.google.com;
        dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=hikgFxO9;
        spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: from out1.vger.email
(out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c142-20020a624e94000000b0057743e34597si10050261pfb.272.2022.12.19.07.47.19; Mon, 19 Dec 2022 07:47:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=hikgFxO9; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232387AbiLSPqM (ORCPT + 99 others); Mon, 19 Dec 2022 10:46:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232621AbiLSPpa (ORCPT ); Mon, 19 Dec 2022 10:45:30 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B644712AA1; Mon, 19 Dec 2022 07:44:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=TPf252rpqr9tSwRpNAavICICeFc5ydD0TdU6p9uBQtA=; b=hikgFxO9sKFM24xM6AvNTdx5rx 4bK/y8q/B0VCEnW01d/jdOIOOgxIQq4NAI6Euj1jUMXfyP+rrDOK4e4zF1MxY9SDmw3712EXyfs0e zZyVyHpAFwBjGr/vHvoeSRNRrKno451ZKvcGDinbA8BYM5e9DbPj/bZgs6iT29aBwXdpHXM81ydbA BuU2h9UTrqr7sYcUkI12tkRLSEh3whv/yts2oyMWWf/AtwiO7CtycmUfUgvK9Yfqfb/hCNyOAszam f/zQg3GR+4JQj5VUE26N5UTKvtWgZjr6FBo4UI3eE2tZNipWqM42J/lcxj2ocKWrQLjmoRPH/MazT iU0/RFQw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat 
Linux)) id 1p7IIT-000qwq-PH; Mon, 19 Dec 2022 15:43:21 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0DCAD3031AE; Mon, 19 Dec 2022 16:43:10 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 96CCA20B0F89E; Mon, 19 Dec 2022 16:43:06 +0100 (CET) Message-ID: <20221219154119.419176389@infradead.org> User-Agent: quilt/0.66 Date: Mon, 19 Dec 2022 16:35:34 +0100 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, Herbert Xu , davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org Subject: [RFC][PATCH 09/12] x86,amd_iommu: Replace cmpxchg_double() References: <20221219153525.632521981@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 
3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752657929462912375?= X-GMAIL-MSGID: =?utf-8?q?1752657929462912375?= Signed-off-by: Peter Zijlstra (Intel) --- drivers/iommu/amd/amd_iommu_types.h | 9 +++++++-- drivers/iommu/amd/iommu.c | 10 ++++------ 2 files changed, 11 insertions(+), 8 deletions(-) --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -979,8 +979,13 @@ union irte_ga_hi { }; struct irte_ga { - union irte_ga_lo lo; - union irte_ga_hi hi; + union { + struct { + union irte_ga_lo lo; + union irte_ga_hi hi; + }; + u128 irte; + }; }; struct irq_2_irte { --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2992,10 +2992,10 @@ static int alloc_irq_index(struct amd_io static int modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index, struct irte_ga *irte, struct amd_ir_data *data) { - bool ret; struct irq_remap_table *table; - unsigned long flags; struct irte_ga *entry; + unsigned long flags; + u128 old; table = get_irq_table(iommu, devid); if (!table) @@ -3006,16 +3006,14 @@ static int modify_irte_ga(struct amd_iom entry = (struct irte_ga *)table->table; entry = &entry[index]; - ret = cmpxchg_double(&entry->lo.val, &entry->hi.val, - entry->lo.val, entry->hi.val, - irte->lo.val, irte->hi.val); /* * We use cmpxchg16 to atomically update the 128-bit IRTE, * and it cannot be updated by the hardware or other processors * behind us, so the return value of cmpxchg16 should be the * same as the old value. 
 	 */
-	WARN_ON(!ret);
+	old = entry->irte;
+	WARN_ON(!try_cmpxchg128(&entry->irte, &old, irte->irte));

 	if (data)
 		data->ref = entry;

From patchwork Mon Dec 19 15:35:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34621
Message-ID: <20221219154119.484857499@infradead.org>
User-Agent: quilt/0.66
Date: Mon, 19 Dec 2022 16:35:35 +0100
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org,
 boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com,
 gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
 svens@linux.ibm.com, Herbert Xu, davem@davemloft.net, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, joro@8bytes.org,
 suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org,
 baolu.lu@linux.intel.com, Arnd Bergmann, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-s390@vger.kernel.org,
 linux-crypto@vger.kernel.org, iommu@lists.linux.dev,
 linux-arch@vger.kernel.org
Subject: [RFC][PATCH 10/12] x86,intel_iommu: Replace cmpxchg_double()
References: <20221219153525.632521981@infradead.org>

Signed-off-by: Peter Zijlstra (Intel)
---
 drivers/iommu/intel/irq_remapping.c |    8 --
 include/linux/dmar.h                |  125 +++++++++++++++++++-----------------
 2 files changed, 68 insertions(+), 65 deletions(-)

--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -174,18 +174,14 @@ static int modify_irte(struct irq_2_iomm
 	irte = &iommu->ir_table->base[index];

 	if ((irte->pst == 1) || (irte_modified->pst == 1)) {
-		bool ret;
-
-		ret = cmpxchg_double(&irte->low, &irte->high,
-				     irte->low, irte->high,
-				     irte_modified->low, irte_modified->high);
 		/*
 		 * We use cmpxchg16 to atomically update the 128-bit IRTE,
 		 * and it cannot be updated by the hardware or other processors
 		 * behind us, so the return value of cmpxchg16 should be the
 		 * same as the old value.
 		 */
-		WARN_ON(!ret);
+		u128 old = irte->irte;
+		WARN_ON(!try_cmpxchg128(&irte->irte, &old, irte_modified->irte));
 	} else {
 		WRITE_ONCE(irte->low, irte_modified->low);
 		WRITE_ONCE(irte->high, irte_modified->high);
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -201,67 +201,74 @@ static inline void detect_intel_iommu(vo

 struct irte {
 	union {
-		/* Shared between remapped and posted mode*/
 		struct {
-			__u64	present		: 1,	/* 0 */
-				fpd		: 1,	/* 1 */
-				__res0		: 6,	/* 2 - 6 */
-				avail		: 4,	/* 8 - 11 */
-				__res1		: 3,	/* 12 - 14 */
-				pst		: 1,	/* 15 */
-				vector		: 8,	/* 16 - 23 */
-				__res2		: 40;	/* 24 - 63 */
+			union {
+				/* Shared between remapped and posted mode*/
+				struct {
+					__u64	present		: 1,	/* 0 */
+						fpd		: 1,	/* 1 */
+						__res0		: 6,	/* 2 - 6 */
+						avail		: 4,	/* 8 - 11 */
+						__res1		: 3,	/* 12 - 14 */
+						pst		: 1,	/* 15 */
+						vector		: 8,	/* 16 - 23 */
+						__res2		: 40;	/* 24 - 63 */
+				};
+
+				/* Remapped mode */
+				struct {
+					__u64	r_present	: 1,	/* 0 */
+						r_fpd		: 1,	/* 1 */
+						dst_mode	: 1,	/* 2 */
+						redir_hint	: 1,	/* 3 */
+						trigger_mode	: 1,	/* 4 */
+						dlvry_mode	: 3,	/* 5 - 7 */
+						r_avail		: 4,	/* 8 - 11 */
+						r_res0		: 4,	/* 12 - 15 */
+						r_vector	: 8,	/* 16 - 23 */
+						r_res1		: 8,	/* 24 - 31 */
+						dest_id		: 32;	/* 32 - 63 */
+				};
+
+				/* Posted mode */
+				struct {
+					__u64	p_present	: 1,	/* 0 */
+						p_fpd		: 1,	/* 1 */
+						p_res0		: 6,	/* 2 - 7 */
+						p_avail		: 4,	/* 8 - 11 */
+						p_res1		: 2,	/* 12 - 13 */
+						p_urgent	: 1,	/* 14 */
+						p_pst		: 1,	/* 15 */
+						p_vector	: 8,	/* 16 - 23 */
+						p_res2		: 14,	/* 24 - 37 */
+						pda_l		: 26;	/* 38 - 63 */
+				};
+				__u64 low;
+			};
+
+			union {
+				/* Shared between remapped and posted mode*/
+				struct {
+					__u64	sid		: 16,	/* 64 - 79 */
+						sq		: 2,	/* 80 - 81 */
+						svt		: 2,	/* 82 - 83 */
+						__res3		: 44;	/* 84 - 127 */
+				};
+
+				/* Posted mode*/
+				struct {
+					__u64	p_sid		: 16,	/* 64 - 79 */
+						p_sq		: 2,	/* 80 - 81 */
+						p_svt		: 2,	/* 82 - 83 */
+						p_res3		: 12,	/* 84 - 95 */
+						pda_h		: 32;	/* 96 - 127 */
+				};
+				__u64 high;
+			};
 		};
-
-		/* Remapped mode */
-		struct {
-			__u64	r_present	: 1,	/* 0 */
-				r_fpd		: 1,	/* 1 */
-				dst_mode	: 1,	/* 2 */
-				redir_hint	: 1,	/* 3 */
-				trigger_mode	: 1,	/* 4 */
-				dlvry_mode	: 3,	/* 5 - 7 */
-				r_avail		: 4,	/* 8 - 11 */
-				r_res0		: 4,	/* 12 - 15 */
-				r_vector	: 8,	/* 16 - 23 */
-				r_res1		: 8,	/* 24 - 31 */
-				dest_id		: 32;	/* 32 - 63 */
-		};
-
-		/* Posted mode */
-		struct {
-			__u64	p_present	: 1,	/* 0 */
-				p_fpd		: 1,	/* 1 */
-				p_res0		: 6,	/* 2 - 7 */
-				p_avail		: 4,	/* 8 - 11 */
-				p_res1		: 2,	/* 12 - 13 */
-				p_urgent	: 1,	/* 14 */
-				p_pst		: 1,	/* 15 */
-				p_vector	: 8,	/* 16 - 23 */
-				p_res2		: 14,	/* 24 - 37 */
-				pda_l		: 26;	/* 38 - 63 */
-		};
-		__u64 low;
-	};
-
-	union {
-		/* Shared between remapped and posted mode*/
-		struct {
-			__u64	sid		: 16,	/* 64 - 79 */
-				sq		: 2,	/* 80 - 81 */
-				svt		: 2,	/* 82 - 83 */
-				__res3		: 44;	/* 84 - 127 */
-		};
-
-		/* Posted mode*/
-		struct {
-			__u64	p_sid		: 16,	/* 64 - 79 */
-				p_sq		: 2,	/* 80 - 81 */
-				p_svt		: 2,	/* 82 - 83 */
-				p_res3		: 12,	/* 84 - 95 */
-				pda_h		: 32;	/* 96 - 127 */
-		};
-		__u64 high;
+#ifdef CONFIG_IRQ_REMAP
+		__u128 irte;
+#endif
 	};
 };

From patchwork Mon Dec 19 15:35:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 34627
Message-ID: <20221219154119.550996611@infradead.org>
User-Agent: quilt/0.66
Date: Mon, 19 Dec 2022 16:35:36 +0100
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org,
 boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com,
 gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
 svens@linux.ibm.com, Herbert Xu, davem@davemloft.net, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, joro@8bytes.org,
 suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org,
 baolu.lu@linux.intel.com, Arnd Bergmann, penberg@kernel.org,
 rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-s390@vger.kernel.org,
 linux-crypto@vger.kernel.org, iommu@lists.linux.dev,
 linux-arch@vger.kernel.org
Subject: [RFC][PATCH 11/12] slub: Replace cmpxchg_double()
References: <20221219153525.632521981@infradead.org>

Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Vlastimil Babka
---
 include/linux/slub_def.h |   12 ++-
 mm/slab.h                |   41 +++++++++++--
 mm/slub.c                |  146 ++++++++++++++++++++++++++++-------------------
 3 files changed, 135 insertions(+), 64 deletions(-)

--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -39,15 +39,21 @@ enum stat_item {
 	CPU_PARTIAL_FREE,	/* Refill cpu partial on free */
 	CPU_PARTIAL_NODE,	/* Refill cpu partial from node partial */
 	CPU_PARTIAL_DRAIN,	/* Drain cpu partial to node partial */
-	NR_SLUB_STAT_ITEMS };
+	NR_SLUB_STAT_ITEMS
+};

 /*
  * When changing the layout, make sure freelist and tid are still compatible
  * with this_cpu_cmpxchg_double() alignment requirements.
  */
 struct kmem_cache_cpu {
-	void **freelist;	/* Pointer to next available object */
-	unsigned long tid;	/* Globally unique transaction id */
+	union {
+		struct {
+			void **freelist;	/* Pointer to next available object */
+			unsigned long tid;	/* Globally unique transaction id */
+		};
+		freelist_aba_t freelist_tid;
+	};
 	struct slab *slab;	/* The slab from which we are allocating */
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 	struct slab *partial;	/* Partially allocated frozen slabs */
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -5,6 +5,32 @@
  * Internal slab definitions
  */

+/*
+ * Freelist pointer and counter to cmpxchg together, avoids the typical ABA
+ * problems with cmpxchg of just a pointer.
+ */
+typedef union {
+	struct {
+		void *freelist;
+		unsigned long counter;
+	};
+#ifdef CONFIG_64BIT
+	u128 full;
+#else
+	u64 full;
+#endif
+} freelist_aba_t;
+
+#ifdef CONFIG_64BIT
+# ifdef system_has_cmpxchg128
+# define system_has_freelist_aba()	system_has_cmpxchg128()
+# endif
+#else /* CONFIG_64BIT */
+# ifdef system_has_cmpxchg64
+# define system_has_freelist_aba()	system_has_cmpxchg64()
+# endif
+#endif /* CONFIG_64BIT */
+
 /* Reuses the bits in struct page */
 struct slab {
 	unsigned long __page_flags;
@@ -34,14 +60,19 @@ struct slab {
 		};
 		struct kmem_cache *slab_cache;
 		/* Double-word boundary */
-		void *freelist;		/* first free object */
 		union {
-			unsigned long counters;
 			struct {
-				unsigned inuse:16;
-				unsigned objects:15;
-				unsigned frozen:1;
+				void *freelist;		/* first free object */
+				union {
+					unsigned long counters;
+					struct {
+						unsigned inuse:16;
+						unsigned objects:15;
+						unsigned frozen:1;
+					};
+				};
 			};
+			freelist_aba_t freelist_counter;
 		};
 		unsigned int __unused;

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -280,7 +280,13 @@ static inline bool kmem_cache_has_cpu_pa
 /* Poison object */
 #define __OBJECT_POISON		((slab_flags_t __force)0x80000000U)
 /* Use cmpxchg_double */
+
+#if defined(system_has_freelist_aba) && \
+    defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
 #define __CMPXCHG_DOUBLE	((slab_flags_t __force)0x40000000U)
+#else
+#define __CMPXCHG_DOUBLE	((slab_flags_t __force)0U)
+#endif

 /*
  * Tracking user of a slab.
@@ -496,6 +502,47 @@ static __always_inline void slab_unlock(
 	__bit_spin_unlock(PG_locked, &page->flags);
 }

+static inline bool
+__update_freelist_fast(struct slab *slab,
+		      void *freelist_old, unsigned long counters_old,
+		      void *freelist_new, unsigned long counters_new)
+{
+
+	bool ret = false;
+
+#ifdef system_has_freelist_aba
+	freelist_aba_t old = { .freelist = freelist_old, .counter = counters_old };
+	freelist_aba_t new = { .freelist = freelist_new, .counter = counters_new };
+
+#ifdef CONFIG_64BIT
+	ret = try_cmpxchg128(&slab->freelist_counter.full, &old.full, new.full);
+#else
+	ret = try_cmpxchg64(&slab->freelist_counter.full, &old.full, new.full);
+#endif
+#endif /* system_has_freelist_aba */
+
+	return ret;
+}
+
+static inline bool
+__update_freelist_slow(struct slab *slab,
+		      void *freelist_old, unsigned long counters_old,
+		      void *freelist_new, unsigned long counters_new)
+{
+	bool ret = false;
+
+	slab_lock(slab);
+	if (slab->freelist == freelist_old &&
+	    slab->counters == counters_old) {
+		slab->freelist = freelist_new;
+		slab->counters = counters_new;
+		ret = true;
+	}
+	slab_unlock(slab);
+
+	return ret;
+}
+
 /*
  * Interrupts must be disabled (for the fallback code to work right), typically
  * by an _irqsave() lock variant. On PREEMPT_RT the preempt_disable(), which is
@@ -503,33 +550,25 @@ static __always_inline void slab_unlock(
  * allocation/ free operation in hardirq context. Therefore nothing can
  * interrupt the operation.
  */
-static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
+static inline bool __slab_update_freelist(struct kmem_cache *s, struct slab *slab,
 		void *freelist_old, unsigned long counters_old,
 		void *freelist_new, unsigned long counters_new,
 		const char *n)
 {
+	bool ret;
+
 	if (USE_LOCKLESS_FAST_PATH())
 		lockdep_assert_irqs_disabled();
-#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
-    defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
+
 	if (s->flags & __CMPXCHG_DOUBLE) {
-		if (cmpxchg_double(&slab->freelist, &slab->counters,
-				   freelist_old, counters_old,
-				   freelist_new, counters_new))
-			return true;
-	} else
-#endif
-	{
-		slab_lock(slab);
-		if (slab->freelist == freelist_old &&
-					slab->counters == counters_old) {
-			slab->freelist = freelist_new;
-			slab->counters = counters_new;
-			slab_unlock(slab);
-			return true;
-		}
-		slab_unlock(slab);
+		ret = __update_freelist_fast(slab, freelist_old, counters_old,
+					    freelist_new, counters_new);
+	} else {
+		ret = __update_freelist_slow(slab, freelist_old, counters_old,
+					    freelist_new, counters_new);
 	}
+	if (likely(ret))
+		return true;

 	cpu_relax();
 	stat(s, CMPXCHG_DOUBLE_FAIL);
@@ -541,36 +580,26 @@ static inline bool __cmpxchg_double_slab
 	return false;
 }

-static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
+static inline bool slab_update_freelist(struct kmem_cache *s, struct slab *slab,
 		void *freelist_old, unsigned long counters_old,
 		void *freelist_new, unsigned long counters_new,
 		const char *n)
 {
-#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
-    defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
+	bool ret;
+
 	if (s->flags & __CMPXCHG_DOUBLE) {
-		if (cmpxchg_double(&slab->freelist, &slab->counters,
-				   freelist_old, counters_old,
-				   freelist_new, counters_new))
-			return true;
-	} else
-#endif
-	{
+		ret = __update_freelist_fast(slab, freelist_old, counters_old,
+					    freelist_new, counters_new);
+	} else {
 		unsigned long flags;

 		local_irq_save(flags);
-		slab_lock(slab);
-		if (slab->freelist == freelist_old &&
-					slab->counters == counters_old) {
-			slab->freelist = freelist_new;
-			slab->counters = counters_new;
-			slab_unlock(slab);
-			local_irq_restore(flags);
-			return true;
-		}
-		slab_unlock(slab);
+		ret = __update_freelist_slow(slab, freelist_old, counters_old,
+					    freelist_new, counters_new);
 		local_irq_restore(flags);
 	}
+	if (likely(ret))
+		return true;

 	cpu_relax();
 	stat(s, CMPXCHG_DOUBLE_FAIL);
@@ -2168,7 +2197,7 @@ static inline void *acquire_slab(struct
 	VM_BUG_ON(new.frozen);
 	new.frozen = 1;

-	if (!__cmpxchg_double_slab(s, slab,
+	if (!__slab_update_freelist(s, slab,
 			freelist, counters,
 			new.freelist, new.counters,
 			"acquire_slab"))
@@ -2500,7 +2529,7 @@ static void deactivate_slab(struct kmem_
 	}


-	if (!cmpxchg_double_slab(s, slab,
+	if (!slab_update_freelist(s, slab,
 				old.freelist, old.counters,
 				new.freelist, new.counters,
 				"unfreezing slab")) {
@@ -2561,7 +2590,7 @@ static void __unfreeze_partials(struct k

 			new.frozen = 0;

-		} while (!__cmpxchg_double_slab(s, slab,
+		} while (!__slab_update_freelist(s, slab,
 				old.freelist, old.counters,
 				new.freelist, new.counters,
 				"unfreezing slab"));
@@ -3022,7 +3051,7 @@ static inline void *get_freelist(struct
 		new.inuse = slab->objects;
 		new.frozen = freelist != NULL;

-	} while (!__cmpxchg_double_slab(s, slab,
+	} while (!__slab_update_freelist(s, slab,
 		freelist, counters,
 		NULL, new.counters,
 		"get_freelist"));
@@ -3295,6 +3324,18 @@ static __always_inline void maybe_wipe_o
 			0, sizeof(void *));
 }

+static inline bool
+__update_cpu_freelist_fast(struct kmem_cache *s,
+			   void *freelist_old, void *freelist_new,
+			   unsigned long tid)
+{
+	freelist_aba_t old = { .freelist = freelist_old, .counter = tid };
+	freelist_aba_t new = { .freelist = freelist_new, .counter = next_tid(tid) };
+
+	return this_cpu_cmpxchg(s->cpu_slab->freelist_tid.full,
+				old.full, new.full) == old.full;
+}
+
 /*
  * Inlined fastpath so that allocation functions (kmalloc, kmem_cache_alloc)
  * have the fastpath folded into their functions.
So no function call @@ -3379,11 +3420,7 @@ static __always_inline void *slab_alloc_ * against code executing on this cpu *not* from access by * other cpus. */ - if (unlikely(!this_cpu_cmpxchg_double( - s->cpu_slab->freelist, s->cpu_slab->tid, - object, tid, - next_object, next_tid(tid)))) { - + if (unlikely(!__update_cpu_freelist_fast(s, object, next_object, tid))) { note_cmpxchg_failure("slab_alloc", s, tid); goto redo; } @@ -3517,7 +3554,7 @@ static void __slab_free(struct kmem_cach } } - } while (!cmpxchg_double_slab(s, slab, + } while (!slab_update_freelist(s, slab, prior, counters, head, new.counters, "__slab_free")); @@ -3621,11 +3658,7 @@ static __always_inline void do_slab_free set_freepointer(s, tail_obj, freelist); - if (unlikely(!this_cpu_cmpxchg_double( - s->cpu_slab->freelist, s->cpu_slab->tid, - freelist, tid, - head, next_tid(tid)))) { - + if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) { note_cmpxchg_failure("slab_free", s, tid); goto redo; } @@ -4319,11 +4352,12 @@ static int kmem_cache_open(struct kmem_c } } -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ +#if defined(system_has_freelist_aba) && \ defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) - if (system_has_cmpxchg_double() && (s->flags & SLAB_NO_CMPXCHG) == 0) + if (system_has_freelist_aba() && !(s->flags & SLAB_NO_CMPXCHG)) { /* Enable fast mode */ s->flags |= __CMPXCHG_DOUBLE; + } #endif /* From patchwork Mon Dec 19 15:35:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 34623 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2463507wrn; Mon, 19 Dec 2022 07:47:44 -0800 (PST) X-Google-Smtp-Source: AMrXdXs11MBatErFrpmQRaHoiDH0HwSgvuh7bJ+d/q5YLb2ho+DH3vliuoIlVCeJ8dQbEer7DBvS X-Received: by 2002:a05:6a20:3b85:b0:a4:6ce5:46e7 with SMTP id b5-20020a056a203b8500b000a46ce546e7mr10071402pzh.10.1671464864201; Mon, 19 Dec 2022 07:47:44 
-0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671464864; cv=none; d=google.com; s=arc-20160816; b=YU0+eJkQ9o7tlem+MlJMJ80F/m/c7yP3XNQUvQnO96hRM2RZcMhMGd0IqV6Ehd4ufP KThClFprM9Ak1+fQdnis4b9dpdOQ3kVkWL4E/4BKD00SybwGtlKZ3TjGAuc8OKC+gk00 iDmDuNYSB2ALNywhCktm604pnLveKLV7ht5tpnyDF4hR90naVeO4OJcnp6JzonFvA+sN bEu/f8LJUjGfkfn5trCXEnyyJPB7MTbr6+FKBIIlva12qQrL5opCS4QixRO69vNp3x6X 5ssR+VG1Xa/sFObBCLcRjq6XZ451u0JUH4sjGiLjx2jY3GYu/wBh0gPiryFb703/36tn 5hmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:subject:cc:to:from:date :user-agent:message-id:dkim-signature; bh=Q7JGtw0CYGxGDnBLhpx7/MdDSlK58U/sb9rjRrrBecg=; b=l6r7xZHE1Vzax8Y30qwwzYSuKPUBMzNffWS036m3KYvG3PS7PK4EuJrl2D3T+TvGBf ZdizdsCBAKKp/d4iY3+0t7ohJF10FigYT4Rf8fZFSFkrR4CawZMFa/qyCMYFGiYPJFee pUEfQIrdBOEnHtblI0u1k8TSNkKpqnNX4jSd+o9HdDC4DUOawvGzpHR94mIi7yb1dv7X zvWrnPIbKbxb1KuZVUin9dR455RKSjs6uW92OnnniTafIL+yAz1WjpyqmdARj6jDr25n zDibmYFo85pExN5qe+21OLtLHk1BgxsKU2ubUBIdwdkDuimJLK78R4kjaH7q7R3hk+lY Dybg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=XzJZ5CLK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id e187-20020a6369c4000000b00478162cf8e3si11876323pgc.81.2022.12.19.07.47.31; Mon, 19 Dec 2022 07:47:44 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=XzJZ5CLK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232628AbiLSPqv (ORCPT + 99 others); Mon, 19 Dec 2022 10:46:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231952AbiLSPpg (ORCPT ); Mon, 19 Dec 2022 10:45:36 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C06F13F4B; Mon, 19 Dec 2022 07:44:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=Q7JGtw0CYGxGDnBLhpx7/MdDSlK58U/sb9rjRrrBecg=; b=XzJZ5CLKTpECrKNMmE4dQXMDXx 9ojviI+pJeL7vVsAODadr6wA97uUEvnLjocmzy83qlN3o244elLngLWz385W/R9XWHFt+887523uY loMGwNqobvoGjWtwJP9T95jezN+8kxNgm69hVQRqjrt0yK7PL0jQNSfZxu3TdNMJ4g7fq2P/v4O2P CMTk1d50qCBiogc+/GI9TOpGtWXU+dTtfRHFUQiX2gGPv4a47dOuozsJLkraMh+PwcAraMt6DuDZb le2iih9xgcBh0oWl3Zf5jSxXTBWaVDd2JrFfEXurDlum/e7ErZmkApKE5AxbDMU1xGt9GZRjOSMdA UJN9DY5g==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat 
Linux)) id 1p7IIJ-00CeDt-2J; Mon, 19 Dec 2022 15:43:12 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 1B9B13033BF; Mon, 19 Dec 2022 16:43:10 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id AEC1820B0F8A1; Mon, 19 Dec 2022 16:43:06 +0100 (CET) Message-ID: <20221219154119.617065541@infradead.org> User-Agent: quilt/0.66 Date: Mon, 19 Dec 2022 16:35:37 +0100 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, Herbert Xu , davem@davemloft.net, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org Subject: [RFC][PATCH 12/12] arch: Remove cmpxchg_double References: <20221219153525.632521981@infradead.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
	(2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1752657941067908699?=
X-GMAIL-MSGID: =?utf-8?q?1752657941067908699?=

No moar users, remove the monster.

Signed-off-by: Peter Zijlstra (Intel)
---
 Documentation/core-api/this_cpu_ops.rst    |  2 -
 arch/arm64/include/asm/atomic_ll_sc.h      | 33 ----------------
 arch/arm64/include/asm/atomic_lse.h        | 36 ------------------
 arch/arm64/include/asm/cmpxchg.h           | 46 -----------------------
 arch/arm64/include/asm/percpu.h            | 10 -----
 arch/s390/include/asm/cmpxchg.h            | 34 -----------------
 arch/s390/include/asm/percpu.h             | 18 ---------
 arch/x86/include/asm/cmpxchg.h             | 25 ------------
 arch/x86/include/asm/cmpxchg_32.h          |  1
 arch/x86/include/asm/cmpxchg_64.h          |  1
 arch/x86/include/asm/percpu.h              | 41 --------------------
 include/asm-generic/percpu.h               | 58 -----------------------------
 include/linux/atomic/atomic-instrumented.h | 17 --------
 include/linux/percpu-defs.h                | 38 -------------------
 scripts/atomic/gen-atomic-instrumented.sh  | 17 ++------
 15 files changed, 6 insertions(+), 371 deletions(-)

--- a/Documentation/core-api/this_cpu_ops.rst
+++ b/Documentation/core-api/this_cpu_ops.rst
@@ -53,7 +53,6 @@ are defined.
These operations can be use this_cpu_add_return(pcp, val) this_cpu_xchg(pcp, nval) this_cpu_cmpxchg(pcp, oval, nval) - this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) this_cpu_sub(pcp, val) this_cpu_inc(pcp) this_cpu_dec(pcp) @@ -242,7 +241,6 @@ modifies the variable, then RMW actions __this_cpu_add_return(pcp, val) __this_cpu_xchg(pcp, nval) __this_cpu_cmpxchg(pcp, oval, nval) - __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) __this_cpu_sub(pcp, val) __this_cpu_inc(pcp) __this_cpu_dec(pcp) --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -294,39 +294,6 @@ __CMPXCHG_CASE( , , mb_, 64, dmb ish, #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name, mb, rel, cl) \ -static __always_inline long \ -__ll_sc__cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - unsigned long tmp, ret; \ - \ - asm volatile("// __cmpxchg_double" #name "\n" \ - " prfm pstl1strm, %2\n" \ - "1: ldxp %0, %1, %2\n" \ - " eor %0, %0, %3\n" \ - " eor %1, %1, %4\n" \ - " orr %1, %0, %1\n" \ - " cbnz %1, 2f\n" \ - " st" #rel "xp %w0, %5, %6, %2\n" \ - " cbnz %w0, 1b\n" \ - " " #mb "\n" \ - "2:" \ - : "=&r" (tmp), "=&r" (ret), "+Q" (*(unsigned long *)ptr) \ - : "r" (old1), "r" (old2), "r" (new1), "r" (new2) \ - : cl); \ - \ - return ret; \ -} - -__CMPXCHG_DBL( , , , ) -__CMPXCHG_DBL(_mb, dmb ish, l, "memory") - -#undef __CMPXCHG_DBL - union __u128_halves { u128 full; struct { --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -288,42 +288,6 @@ __CMPXCHG_CASE(x, , mb_, 64, al, "memo #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name, mb, cl...) 
\ -static __always_inline long \ -__lse__cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - unsigned long oldval1 = old1; \ - unsigned long oldval2 = old2; \ - register unsigned long x0 asm ("x0") = old1; \ - register unsigned long x1 asm ("x1") = old2; \ - register unsigned long x2 asm ("x2") = new1; \ - register unsigned long x3 asm ("x3") = new2; \ - register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ - " eor %[old1], %[old1], %[oldval1]\n" \ - " eor %[old2], %[old2], %[oldval2]\n" \ - " orr %[old1], %[old1], %[old2]" \ - : [old1] "+&r" (x0), [old2] "+&r" (x1), \ - [v] "+Q" (*(unsigned long *)ptr) \ - : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \ - [oldval1] "r" (oldval1), [oldval2] "r" (oldval2) \ - : cl); \ - \ - return x0; \ -} - -__CMPXCHG_DBL( , ) -__CMPXCHG_DBL(_mb, al, "memory") - -#undef __CMPXCHG_DBL - #define __CMPXCHG128(name, mb, cl...) 
\ static __always_inline u128 \ __lse__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -131,22 +131,6 @@ __CMPXCHG_CASE(mb_, 64) #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name) \ -static inline long __cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - return __lse_ll_sc_body(_cmpxchg_double##name, \ - old1, old2, new1, new2, ptr); \ -} - -__CMPXCHG_DBL( ) -__CMPXCHG_DBL(_mb) - -#undef __CMPXCHG_DBL - #define __CMPXCHG128(name) \ static inline long __cmpxchg128##name(volatile u128 *ptr, \ u128 old, u128 new) \ @@ -212,36 +196,6 @@ __CMPXCHG_GEN(_mb) #define arch_cmpxchg64 arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg_local -/* cmpxchg_double */ -#define system_has_cmpxchg_double() 1 - -#define __cmpxchg_double_check(ptr1, ptr2) \ -({ \ - if (sizeof(*(ptr1)) != 8) \ - BUILD_BUG(); \ - VM_BUG_ON((unsigned long *)(ptr2) - (unsigned long *)(ptr1) != 1); \ -}) - -#define arch_cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - __cmpxchg_double_check(ptr1, ptr2); \ - __ret = !__cmpxchg_double_mb((unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2), \ - ptr1); \ - __ret; \ -}) - -#define arch_cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - __cmpxchg_double_check(ptr1, ptr2); \ - __ret = !__cmpxchg_double((unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2), \ - ptr1); \ - __ret; \ -}) - /* cmpxchg128 */ #define system_has_cmpxchg128() 1 --- a/arch/arm64/include/asm/percpu.h +++ b/arch/arm64/include/asm/percpu.h @@ -145,16 +145,6 @@ PERCPU_RET_OP(add, add, ldadd) * preemption point when TIF_NEED_RESCHED gets set while preemption is * disabled. 
*/ -#define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - preempt_disable_notrace(); \ - __ret = cmpxchg_double_local( raw_cpu_ptr(&(ptr1)), \ - raw_cpu_ptr(&(ptr2)), \ - o1, o2, n1, n2); \ - preempt_enable_notrace(); \ - __ret; \ -}) #define _pcp_protect(op, pcp, ...) \ ({ \ --- a/arch/s390/include/asm/cmpxchg.h +++ b/arch/s390/include/asm/cmpxchg.h @@ -167,40 +167,6 @@ static __always_inline unsigned long __c #define arch_cmpxchg_local arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg -#define system_has_cmpxchg_double() 1 - -static __always_inline int __cmpxchg_double(unsigned long p1, unsigned long p2, - unsigned long o1, unsigned long o2, - unsigned long n1, unsigned long n2) -{ - union register_pair old = { .even = o1, .odd = o2, }; - union register_pair new = { .even = n1, .odd = n2, }; - int cc; - - asm volatile( - " cdsg %[old],%[new],%[ptr]\n" - " ipm %[cc]\n" - " srl %[cc],28\n" - : [cc] "=&d" (cc), [old] "+&d" (old.pair) - : [new] "d" (new.pair), - [ptr] "QS" (*(unsigned long *)p1), "Q" (*(unsigned long *)p2) - : "memory", "cc"); - return !cc; -} - -#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \ -({ \ - typeof(p1) __p1 = (p1); \ - typeof(p2) __p2 = (p2); \ - \ - BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long)); \ - BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long)); \ - VM_BUG_ON((unsigned long)((__p1) + 1) != (unsigned long)(__p2));\ - __cmpxchg_double((unsigned long)__p1, (unsigned long)__p2, \ - (unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2)); \ -}) - #define system_has_cmpxchg128() 1 static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) --- a/arch/s390/include/asm/percpu.h +++ b/arch/s390/include/asm/percpu.h @@ -184,24 +184,6 @@ #define this_cpu_xchg_4(pcp, nval) arch_this_cpu_xchg(pcp, nval) #define this_cpu_xchg_8(pcp, nval) arch_this_cpu_xchg(pcp, nval) -#define arch_this_cpu_cmpxchg_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - typeof(pcp1) 
*p1__; \ - typeof(pcp2) *p2__; \ - int ret__; \ - \ - preempt_disable_notrace(); \ - p1__ = raw_cpu_ptr(&(pcp1)); \ - p2__ = raw_cpu_ptr(&(pcp2)); \ - ret__ = __cmpxchg_double((unsigned long)p1__, (unsigned long)p2__, \ - (unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2)); \ - preempt_enable_notrace(); \ - ret__; \ -}) - -#define this_cpu_cmpxchg_double_8 arch_this_cpu_cmpxchg_double - #include #endif /* __ARCH_S390_PERCPU__ */ --- a/arch/x86/include/asm/cmpxchg.h +++ b/arch/x86/include/asm/cmpxchg.h @@ -233,29 +233,4 @@ extern void __add_wrong_size(void) #define __xadd(ptr, inc, lock) __xchg_op((ptr), (inc), xadd, lock) #define xadd(ptr, inc) __xadd((ptr), (inc), LOCK_PREFIX) -#define __cmpxchg_double(pfx, p1, p2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - __typeof__(*(p1)) __old1 = (o1), __new1 = (n1); \ - __typeof__(*(p2)) __old2 = (o2), __new2 = (n2); \ - BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long)); \ - BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long)); \ - VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long))); \ - VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2)); \ - asm volatile(pfx "cmpxchg%c5b %1" \ - CC_SET(e) \ - : CC_OUT(e) (__ret), \ - "+m" (*(p1)), "+m" (*(p2)), \ - "+a" (__old1), "+d" (__old2) \ - : "i" (2 * sizeof(long)), \ - "b" (__new1), "c" (__new2)); \ - __ret; \ -}) - -#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \ - __cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2) - -#define arch_cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \ - __cmpxchg_double(, p1, p2, o1, o2, n1, n2) - #endif /* ASM_X86_CMPXCHG_H */ --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -103,7 +103,6 @@ static inline bool __try_cmpxchg64(volat #endif -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) #define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) #endif /* _ASM_X86_CMPXCHG_32_H */ --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -72,7 
+72,6 @@ static __always_inline bool arch_try_cmp return likely(ret); } -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) #define system_has_cmpxchg128() boot_cpu_has(X86_FEATURE_CX16) #endif /* _ASM_X86_CMPXCHG_64_H */ --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -339,23 +339,6 @@ do { \ #define this_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, volatile, pcp, oval, nval) #define this_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, volatile, pcp, oval, nval) -#ifdef CONFIG_X86_CMPXCHG64 -#define percpu_cmpxchg8b_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - typeof(pcp1) __o1 = (o1), __n1 = (n1); \ - typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - asm volatile("cmpxchg8b "__percpu_arg(1) \ - CC_SET(z) \ - : CC_OUT(z) (__ret), "+m" (pcp1), "+m" (pcp2), "+a" (__o1), "+d" (__o2) \ - : "b" (__n1), "c" (__n2)); \ - __ret; \ -}) - -#define raw_cpu_cmpxchg_double_4 percpu_cmpxchg8b_double -#define this_cpu_cmpxchg_double_4 percpu_cmpxchg8b_double -#endif /* CONFIG_X86_CMPXCHG64 */ - /* * Per cpu atomic 64 bit operations are only available under 64 bit. * 32 bit must fall back to generic operations. @@ -378,30 +361,6 @@ do { \ #define this_cpu_add_return_8(pcp, val) percpu_add_return_op(8, volatile, pcp, val) #define this_cpu_xchg_8(pcp, nval) percpu_xchg_op(8, volatile, pcp, nval) #define this_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, volatile, pcp, oval, nval) - -/* - * Pretty complex macro to generate cmpxchg16 instruction. The instruction - * is not supported on early AMD64 processors so we must be able to emulate - * it in software. The address used in the cmpxchg16 instruction must be - * aligned to a 16 byte boundary. 
- */ -#define percpu_cmpxchg16b_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - typeof(pcp1) __o1 = (o1), __n1 = (n1); \ - typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - alternative_io("leaq %P1,%%rsi\n\tcall this_cpu_cmpxchg16b_emu\n\t", \ - "cmpxchg16b " __percpu_arg(1) "\n\tsetz %0\n\t", \ - X86_FEATURE_CX16, \ - ASM_OUTPUT2("=a" (__ret), "+m" (pcp1), \ - "+m" (pcp2), "+d" (__o2)), \ - "b" (__n1), "c" (__n2), "a" (__o1) : "rsi"); \ - __ret; \ -}) - -#define raw_cpu_cmpxchg_double_8 percpu_cmpxchg16b_double -#define this_cpu_cmpxchg_double_8 percpu_cmpxchg16b_double - #endif static __always_inline bool x86_this_cpu_constant_test_bit(unsigned int nr, --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -99,19 +99,6 @@ do { \ __ret; \ }) -#define raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ \ - typeof(pcp1) *__p1 = raw_cpu_ptr(&(pcp1)); \ - typeof(pcp2) *__p2 = raw_cpu_ptr(&(pcp2)); \ - int __ret = 0; \ - if (*__p1 == (oval1) && *__p2 == (oval2)) { \ - *__p1 = nval1; \ - *__p2 = nval2; \ - __ret = 1; \ - } \ - (__ret); \ -}) - #define __this_cpu_generic_read_nopreempt(pcp) \ ({ \ typeof(pcp) ___ret; \ @@ -180,17 +167,6 @@ do { \ __ret; \ }) -#define this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ \ - int __ret; \ - unsigned long __flags; \ - raw_local_irq_save(__flags); \ - __ret = raw_cpu_generic_cmpxchg_double(pcp1, pcp2, \ - oval1, oval2, nval1, nval2); \ - raw_local_irq_restore(__flags); \ - __ret; \ -}) - #ifndef raw_cpu_read_1 #define raw_cpu_read_1(pcp) raw_cpu_generic_read(pcp) #endif @@ -303,23 +279,6 @@ do { \ raw_cpu_generic_cmpxchg(pcp, oval, nval) #endif -#ifndef raw_cpu_cmpxchg_double_1 -#define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_2 -#define raw_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - 
raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_4 -#define raw_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_8 -#define raw_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif - #ifndef this_cpu_read_1 #define this_cpu_read_1(pcp) this_cpu_generic_read(pcp) #endif @@ -432,21 +391,4 @@ do { \ this_cpu_generic_cmpxchg(pcp, oval, nval) #endif -#ifndef this_cpu_cmpxchg_double_1 -#define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_2 -#define this_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_4 -#define this_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_8 -#define this_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif - #endif /* _ASM_GENERIC_PERCPU_H_ */ --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -2141,21 +2141,6 @@ atomic_long_dec_if_positive(atomic_long_ arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \ }) -#define cmpxchg_double(ptr, ...) \ -({ \ - typeof(ptr) __ai_ptr = (ptr); \ - kcsan_mb(); \ - instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \ - arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \ -}) - - -#define cmpxchg_double_local(ptr, ...) 
\ -({ \ - typeof(ptr) __ai_ptr = (ptr); \ - instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \ - arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \ -}) #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */ -// 27320c1ec2bf2878ecb9df3ea4816a7bc0c57a52 +// 416a741acbd4d28dbfa45f1b2a2c1b714454229f --- a/include/linux/percpu-defs.h +++ b/include/linux/percpu-defs.h @@ -359,33 +359,6 @@ static inline void __this_cpu_preempt_ch pscr2_ret__; \ }) -/* - * Special handling for cmpxchg_double. cmpxchg_double is passed two - * percpu variables. The first has to be aligned to a double word - * boundary and the second has to follow directly thereafter. - * We enforce this on all architectures even if they don't support - * a double cmpxchg instruction, since it's a cheap requirement, and it - * avoids breaking the requirement for architectures with the instruction. - */ -#define __pcpu_double_call_return_bool(stem, pcp1, pcp2, ...) \ -({ \ - bool pdcrb_ret__; \ - __verify_pcpu_ptr(&(pcp1)); \ - BUILD_BUG_ON(sizeof(pcp1) != sizeof(pcp2)); \ - VM_BUG_ON((unsigned long)(&(pcp1)) % (2 * sizeof(pcp1))); \ - VM_BUG_ON((unsigned long)(&(pcp2)) != \ - (unsigned long)(&(pcp1)) + sizeof(pcp1)); \ - switch(sizeof(pcp1)) { \ - case 1: pdcrb_ret__ = stem##1(pcp1, pcp2, __VA_ARGS__); break; \ - case 2: pdcrb_ret__ = stem##2(pcp1, pcp2, __VA_ARGS__); break; \ - case 4: pdcrb_ret__ = stem##4(pcp1, pcp2, __VA_ARGS__); break; \ - case 8: pdcrb_ret__ = stem##8(pcp1, pcp2, __VA_ARGS__); break; \ - default: \ - __bad_size_call_parameter(); break; \ - } \ - pdcrb_ret__; \ -}) - #define __pcpu_size_call(stem, variable, ...) 
\ do { \ __verify_pcpu_ptr(&(variable)); \ @@ -442,9 +415,6 @@ do { \ #define raw_cpu_xchg(pcp, nval) __pcpu_size_call_return2(raw_cpu_xchg_, pcp, nval) #define raw_cpu_cmpxchg(pcp, oval, nval) \ __pcpu_size16_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval) -#define raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - __pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) - #define raw_cpu_sub(pcp, val) raw_cpu_add(pcp, -(val)) #define raw_cpu_inc(pcp) raw_cpu_add(pcp, 1) #define raw_cpu_dec(pcp) raw_cpu_sub(pcp, 1) @@ -504,11 +474,6 @@ do { \ raw_cpu_cmpxchg(pcp, oval, nval); \ }) -#define __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ __this_cpu_preempt_check("cmpxchg_double"); \ - raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2); \ -}) - #define __this_cpu_sub(pcp, val) __this_cpu_add(pcp, -(typeof(pcp))(val)) #define __this_cpu_inc(pcp) __this_cpu_add(pcp, 1) #define __this_cpu_dec(pcp) __this_cpu_sub(pcp, 1) @@ -529,9 +494,6 @@ do { \ #define this_cpu_xchg(pcp, nval) __pcpu_size_call_return2(this_cpu_xchg_, pcp, nval) #define this_cpu_cmpxchg(pcp, oval, nval) \ __pcpu_size16_call_return2(this_cpu_cmpxchg_, pcp, oval, nval) -#define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - __pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) - #define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val)) #define this_cpu_inc(pcp) this_cpu_add(pcp, 1) #define this_cpu_dec(pcp) this_cpu_sub(pcp, 1) --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -84,7 +84,6 @@ gen_xchg() { local xchg="$1"; shift local order="$1"; shift - local mult="$1"; shift kcsan_barrier="" if [ "${xchg%_local}" = "${xchg}" ]; then @@ -104,8 +103,8 @@ cat <