Message ID | 20221022114424.906110403@infradead.org |
---|---|
State | New |
Headers |
Message-ID: <20221022114424.906110403@infradead.org>
User-Agent: quilt/0.66
Date: Sat, 22 Oct 2022 13:14:10 +0200
From: Peter Zijlstra <peterz@infradead.org>
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com
Subject: [PATCH 07/13] mm/gup: Fix the lockless PMD access
References: <20221022111403.531902164@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series | Clean up pmd_get_atomic() and i386-PAE |
Commit Message
Peter Zijlstra
Oct. 22, 2022, 11:14 a.m. UTC
On architectures where the PTE/PMD is larger than the native word size
(i386-PAE for example), READ_ONCE() can do the wrong thing. Use
pmdp_get_lockless() just like we use ptep_get_lockless().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/events/core.c | 2 +-
mm/gup.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
Comments
On Sat, 22 Oct 2022, Peter Zijlstra wrote:

> On architectures where the PTE/PMD is larger than the native word size
> (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> pmdp_get_lockless() just like we use ptep_get_lockless().

I thought that was something Will Deacon put a lot of effort
into handling around 5.8 and 5.9: see "strong prevailing wind" in
include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.

Was it too optimistic? Did the wind drop?

I'm interested in the answer, but I've certainly no objection
to making this all more obviously robust - thanks.

Hugh

>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c | 2 +-
>  mm/gup.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct
>  		return pud_leaf_size(pud);
>
>  	pmdp = pmd_offset_lockless(pudp, pud, addr);
> -	pmd = READ_ONCE(*pmdp);
> +	pmd = pmdp_get_lockless(pmdp);
>  	if (!pmd_present(pmd))
>  		return 0;
>
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu
>
>  	pmdp = pmd_offset_lockless(pudp, pud, addr);
>  	do {
> -		pmd_t pmd = READ_ONCE(*pmdp);
> +		pmd_t pmd = pmdp_get_lockless(pmdp);
>
>  		next = pmd_addr_end(addr, end);
>  		if (!pmd_present(pmd))
On Sat, Oct 22, 2022 at 05:42:18PM -0700, Hugh Dickins wrote:
> On Sat, 22 Oct 2022, Peter Zijlstra wrote:
>
> > On architectures where the PTE/PMD is larger than the native word size
> > (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> > pmdp_get_lockless() just like we use ptep_get_lockless().
>
> I thought that was something Will Deacon put a lot of effort
> into handling around 5.8 and 5.9: see "strong prevailing wind" in
> include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.
>
> Was it too optimistic? Did the wind drop?
>
> I'm interested in the answer, but I've certainly no objection
> to making this all more obviously robust - thanks.

READ_ONCE() can't do what the hardware can't do. There is absolutely no
way i386 can do an atomic 64bit load without resorting to cmpxchg8b.

Also see the comment that goes with compiletime_assert_rwonce_type(). It
explicitly allows 64bit because there's just too much stuff that does
that (and there's actually 32bit hardware that *can* do it).

But it's still very wrong.
On Mon, 24 Oct 2022, Peter Zijlstra wrote:
> On Sat, Oct 22, 2022 at 05:42:18PM -0700, Hugh Dickins wrote:
> > On Sat, 22 Oct 2022, Peter Zijlstra wrote:
> >
> > > On architectures where the PTE/PMD is larger than the native word size
> > > (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> > > pmdp_get_lockless() just like we use ptep_get_lockless().
> >
> > I thought that was something Will Deacon put a lot of effort
> > into handling around 5.8 and 5.9: see "strong prevailing wind" in
> > include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.
> >
> > Was it too optimistic? Did the wind drop?
> >
> > I'm interested in the answer, but I've certainly no objection
> > to making this all more obviously robust - thanks.
>
> READ_ONCE() can't do what the hardware can't do. There is absolutely no
> way i386 can do an atomic 64bit load without resorting to cmpxchg8b.

Right.

>
> Also see the comment that goes with compiletime_assert_rwonce_type(). It
> explicitly allows 64bit because there's just too much stuff that does
> that (and there's actually 32bit hardware that *can* do it).

Yes, the "strong prevailing wind" comment. I think I've never read that
carefully enough, until you redirected me back there: it is in fact
quite clear, that it's only *atomic* in the Armv7 + LPAE case;
but READ_ONCEy (READ_EACH_HALF_ONCE I guess) for other 64-on-32 cases.

>
> But it's still very wrong.

Somewhat clearer to me now, thanks.

Hugh
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct
 		return pud_leaf_size(pud);
 
 	pmdp = pmd_offset_lockless(pudp, pud, addr);
-	pmd = READ_ONCE(*pmdp);
+	pmd = pmdp_get_lockless(pmdp);
 	if (!pmd_present(pmd))
 		return 0;
 
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu
 
 	pmdp = pmd_offset_lockless(pudp, pud, addr);
 	do {
-		pmd_t pmd = READ_ONCE(*pmdp);
+		pmd_t pmd = pmdp_get_lockless(pmdp);
 
 		next = pmd_addr_end(addr, end);
 		if (!pmd_present(pmd))