Message ID | 20221030212929.335473-8-peterx@redhat.com |
---|---|
State | New |
Headers | From: Peter Xu <peterx@redhat.com> To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org>, James Houghton <jthoughton@google.com>, Miaohe Lin <linmiaohe@huawei.com>, David Hildenbrand <david@redhat.com>, Muchun Song <songmuchun@bytedance.com>, Andrea Arcangeli <aarcange@redhat.com>, Nadav Amit <nadav.amit@gmail.com>, Mike Kravetz <mike.kravetz@oracle.com>, peterx@redhat.com, Rik van Riel <riel@surriel.com> Subject: [PATCH RFC 07/10] mm/hugetlb: Make hugetlb_follow_page_mask() RCU-safe Date: Sun, 30 Oct 2022 17:29:26 -0400 Message-Id: <20221030212929.335473-8-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221030212929.335473-1-peterx@redhat.com> References: <20221030212929.335473-1-peterx@redhat.com> |
Series | mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare |
Commit Message
Peter Xu
Oct. 30, 2022, 9:29 p.m. UTC
RCU makes sure the pte_t* won't go away from under us. Please refer to the
comment above huge_pte_offset() for more information.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
mm/hugetlb.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Comments
On Sun, Oct 30, 2022 at 2:29 PM Peter Xu <peterx@redhat.com> wrote:
>
> RCU makes sure the pte_t* won't go away from under us. Please refer to the
> comment above huge_pte_offset() for more information.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> mm/hugetlb.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9869c12e6460..85214095fb85 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6229,10 +6229,12 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> 	if (WARN_ON_ONCE(flags & FOLL_PIN))
> 		return NULL;
>
> +	/* For huge_pte_offset() */
> +	rcu_read_lock();
> retry:
> 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
> 	if (!pte)
> -		return NULL;
> +		goto out_rcu;
>
> 	ptl = huge_pte_lock(h, mm, pte);

Just to make sure -- this huge_pte_lock doesn't count as "blocking"
(for the purposes of what is allowed in an RCU read-side critical
section), right? If so, great!

But I think we need to call `rcu_read_unlock` before entering
`__migration_entry_wait_huge`, as that function really can block.

- James

> 	entry = huge_ptep_get(pte);
> @@ -6266,6 +6268,8 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> 	}
> out:
> 	spin_unlock(ptl);
> +out_rcu:
> +	rcu_read_unlock();
> 	return page;
> }
>
> --
> 2.37.3
>
On Wed, Nov 02, 2022 at 11:24:57AM -0700, James Houghton wrote:
> On Sun, Oct 30, 2022 at 2:29 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > RCU makes sure the pte_t* won't go away from under us. Please refer to the
> > comment above huge_pte_offset() for more information.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > mm/hugetlb.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 9869c12e6460..85214095fb85 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6229,10 +6229,12 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> > 	if (WARN_ON_ONCE(flags & FOLL_PIN))
> > 		return NULL;
> >
> > +	/* For huge_pte_offset() */
> > +	rcu_read_lock();
> > retry:
> > 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
> > 	if (!pte)
> > -		return NULL;
> > +		goto out_rcu;
> >
> > 	ptl = huge_pte_lock(h, mm, pte);
>
> Just to make sure -- this huge_pte_lock doesn't count as "blocking"
> (for the purposes of what is allowed in an RCU read-side critical
> section), right? If so, great!

Yeah I think spinlock should be fine, iiuc it'll be fine as long as we
don't proactively yield with any form of sleeping locks.

For RT sleepable spinlock should also be fine in this case, as explicitly
mentioned in the RCU docs:

	b.	What about the -rt patchset? If readers would need to block in
		an non-rt kernel, you need SRCU. If readers would block in a
		-rt kernel, but not in a non-rt kernel, SRCU is not necessary.
		(The -rt patchset turns spinlocks into sleeplocks, hence this
		distinction.)

> But I think we need to call `rcu_read_unlock` before entering
> `__migration_entry_wait_huge`, as that function really can block.

Right, let me revisit this after I figure out how to do with the
hugetlb_fault() path first, as you commented in the other patch.

Actually here I really think we should just remove the migration chunk and
return with page==NULL, since I really don't think follow_page_mask should
block at all.. then for !sleep cases (FOLL_NOWAIT) or follow_page we'll
return the NULL upwards early, while for generic GUP (__get_user_pages)
we'll just wait in the upcoming faultin_page().

That's afaict what we do with non-hugetlb memories too (after the recent
removal of FOLL_MIGRATE in 4a0499782a).
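The two options raised in this exchange can be sketched in a small userspace model, shown only to make the control flow concrete. Everything in it is a hypothetical stand-in: the `model_*` helpers, the `model_entry` enum and the `follow_*` functions are not kernel APIs, the page table spinlock is left out, and the depth counter only imitates the rule that nothing may sleep between `rcu_read_lock()` and `rcu_read_unlock()`. The first function follows James's suggestion (leave the read-side section before the blocking wait, re-enter it before retrying); the second follows Peter's alternative (never wait, return NULL and let the caller fault the page in).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of this is real kernel code. */
static int rcu_depth;        /* nesting depth of the modelled read-side section */
static int sleeps_under_rcu; /* modelled sleeps that happened inside the section */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

enum model_entry { MODEL_NONE, MODEL_PRESENT, MODEL_MIGRATION };

struct model_page { int id; };
static struct model_page model_page0 = { 0 };

/*
 * Stand-in for a wait that can sleep. It records a violation if called
 * inside the modelled read-side section, then pretends the migration
 * finished by marking the entry present.
 */
static void model_migration_wait(enum model_entry *slot)
{
	if (rcu_depth > 0)
		sleeps_under_rcu++;
	*slot = MODEL_PRESENT;
}

/* Option 1: leave the read-side section before waiting, re-enter to retry. */
static struct model_page *follow_unlock_then_wait(enum model_entry *slot)
{
	struct model_page *page = NULL;

	model_rcu_read_lock();
retry:
	if (*slot == MODEL_PRESENT) {
		page = &model_page0;
	} else if (*slot == MODEL_MIGRATION) {
		model_rcu_read_unlock();
		model_migration_wait(slot);
		model_rcu_read_lock();
		goto retry;
	}
	model_rcu_read_unlock();
	return page;
}

/* Option 2: never wait; a migration entry is reported as "no page". */
static struct model_page *follow_no_wait(const enum model_entry *slot)
{
	struct model_page *page = NULL;

	model_rcu_read_lock();
	if (*slot == MODEL_PRESENT)
		page = &model_page0;
	model_rcu_read_unlock();
	return page;
}
```

In this model, calling `follow_no_wait()` on a migration entry returns NULL and leaves the entry untouched, while `follow_unlock_then_wait()` returns the page once the modelled wait completes; in both cases `sleeps_under_rcu` stays at zero because the wait is never reached with the read-side section held.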
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9869c12e6460..85214095fb85 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6229,10 +6229,12 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
+	/* For huge_pte_offset() */
+	rcu_read_lock();
 retry:
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
-		return NULL;
+		goto out_rcu;
 
 	ptl = huge_pte_lock(h, mm, pte);
 	entry = huge_ptep_get(pte);
@@ -6266,6 +6268,8 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	}
 out:
 	spin_unlock(ptl);
+out_rcu:
+	rcu_read_unlock();
 	return page;
 }
 