Message ID: 2edc4657-b6ff-3d6e-2342-6b60bfccc5b@google.com
State: New
From: Hugh Dickins <hughd@google.com>
Date: Sun, 21 May 2023 22:05:15 -0700 (PDT)
Subject: [PATCH 13/31] mm/hmm: retry if pte_offset_map() fails
In-Reply-To: <68a97fbe-5c1e-7ac6-72c-7b9c6290b370@google.com>
Series: mm: allow pte_offset_map[_lock]() to fail
Commit Message
Hugh Dickins
May 22, 2023, 5:05 a.m. UTC
hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
again loop of its own, so take part in that if pte_offset_map() fails.
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/hmm.c | 2 ++
1 file changed, 2 insertions(+)
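For orientation before the review comments, a heavily simplified sketch of hmm_vma_walk_pmd()'s control flow with the change applied (illustrative only, not the real mm/hmm.c; handle_non_pte_levels() is a hypothetical stand-in for the none/huge/bad-pmd handling): the walker re-reads the pmd at its "again:" label, and a NULL return from pte_offset_map() now simply rejoins that retry loop.

/*
 * Simplified sketch, not the real mm/hmm.c: a failed pte_offset_map()
 * means the page table was freed or replaced underneath us, so the
 * walker re-evaluates the pmd via its existing "again:" label.
 */
static int hmm_vma_walk_pmd_sketch(pmd_t *pmdp, unsigned long addr,
				   unsigned long end, struct mm_walk *walk,
				   unsigned long *hmm_pfns)
{
	pte_t *ptep;
	pmd_t pmd;

again:
	pmd = pmdp_get_lockless(pmdp);
	if (!pmd_present(pmd) || pmd_trans_huge(pmd) || pmd_bad(pmd))
		/* hypothetical stand-in for the none/huge/bad-pmd cases */
		return handle_non_pte_levels(pmdp, pmd, addr, end, walk);

	ptep = pte_offset_map(pmdp, addr);
	if (!ptep)
		goto again;	/* page table vanished: re-check the pmd */

	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
		int r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep, hmm_pfns);

		if (r)
			return r;	/* hmm_vma_handle_pte() did pte_unmap() */
	}
	pte_unmap(ptep - 1);
	return 0;
}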
Comments
On 2023/5/22 13:05, Hugh Dickins wrote:
> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
> again loop of its own, so take part in that if pte_offset_map() fails.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>  mm/hmm.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index e23043345615..b1a9159d7c92 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>  	}
>
>  	ptep = pte_offset_map(pmdp, addr);
> +	if (!ptep)
> +		goto again;
>  	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>  		int r;
>

I haven't read the entire patch set yet, but taking a note here.
The hmm_vma_handle_pte() will unmap pte and then call
migration_entry_wait() to remap pte, so this may fail, we need to
handle this case like below:

diff --git a/mm/hmm.c b/mm/hmm.c
index 6a151c09de5e..eb726ff0981c 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -276,7 +276,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	if (is_migration_entry(entry)) {
 		pte_unmap(ptep);
 		hmm_vma_walk->last = addr;
-		migration_entry_wait(walk->mm, pmdp, addr);
+		if (!migration_entry_wait(walk->mm, pmdp, addr))
+			return -EAGAIN;
 		return -EBUSY;
 	}

@@ -386,6 +387,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,

 	r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep, hmm_pfns);
 	if (r) {
+		if (r == -EAGAIN)
+			goto again;
 		/* hmm_vma_handle_pte() did pte_unmap() */
 		return r;
 	}

Of course, the migration_entry_wait() also needs to be modified.
Qi Zheng <qi.zheng@linux.dev> writes:

> On 2023/5/22 13:05, Hugh Dickins wrote:
>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>> again loop of its own, so take part in that if pte_offset_map() fails.
>> Signed-off-by: Hugh Dickins <hughd@google.com>
>> ---
>>  mm/hmm.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index e23043345615..b1a9159d7c92 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>>  	}
>>  	ptep = pte_offset_map(pmdp, addr);
>> +	if (!ptep)
>> +		goto again;
>>  	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>>  		int r;
>>
>
> I haven't read the entire patch set yet, but taking a note here.
> The hmm_vma_handle_pte() will unmap pte and then call
> migration_entry_wait() to remap pte, so this may fail, we need to
> handle this case like below:

I don't see a problem here. Sure, hmm_vma_handle_pte() might return
-EBUSY but that will get returned up to hmm_range_fault() which will
retry the whole thing again and presumably fail when looking at the PMD.

> diff --git a/mm/hmm.c b/mm/hmm.c
> index 6a151c09de5e..eb726ff0981c 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -276,7 +276,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>  	if (is_migration_entry(entry)) {
>  		pte_unmap(ptep);
>  		hmm_vma_walk->last = addr;
> -		migration_entry_wait(walk->mm, pmdp, addr);
> +		if (!migration_entry_wait(walk->mm, pmdp, addr))
> +			return -EAGAIN;
>  		return -EBUSY;
>  	}
>
> @@ -386,6 +387,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>
>  	r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep, hmm_pfns);
>  	if (r) {
> +		if (r == -EAGAIN)
> +			goto again;
>  		/* hmm_vma_handle_pte() did pte_unmap() */
>  		return r;
>  	}
>
> Of course, the migration_entry_wait() also needs to be modified.
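For context on the retry path Alistair refers to: hmm_range_fault() restarts the whole walk whenever -EBUSY comes back, roughly as follows (simplified sketch with an invented parameter list, not the verbatim kernel function, which takes only a struct hmm_range *):

/*
 * Simplified sketch of hmm_range_fault()'s retry loop (not verbatim):
 * a transient -EBUSY from anywhere in the walk, including
 * hmm_vma_handle_pte(), restarts the walk, while an invalidated
 * notifier range returns -EBUSY to the caller instead.
 */
int hmm_range_fault_sketch(struct hmm_range *range,
			   const struct mm_walk_ops *ops, void *walk_private)
{
	struct mm_struct *mm = range->notifier->mm;
	int ret;

	do {
		/* Notifier range invalidated: let the caller retry. */
		if (mmu_interval_check_retry(range->notifier,
					     range->notifier_seq))
			return -EBUSY;
		ret = walk_page_range(mm, range->start, range->end,
				      ops, walk_private);
	} while (ret == -EBUSY);	/* transient failure: walk again */

	return ret;
}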
On 2023/5/23 10:39, Alistair Popple wrote:
>
> Qi Zheng <qi.zheng@linux.dev> writes:
>
>> On 2023/5/22 13:05, Hugh Dickins wrote:
>>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>>> again loop of its own, so take part in that if pte_offset_map() fails.
>>> Signed-off-by: Hugh Dickins <hughd@google.com>
>>> ---
>>>  mm/hmm.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>> index e23043345615..b1a9159d7c92 100644
>>> --- a/mm/hmm.c
>>> +++ b/mm/hmm.c
>>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>>>  	}
>>>  	ptep = pte_offset_map(pmdp, addr);
>>> +	if (!ptep)
>>> +		goto again;
>>>  	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>>>  		int r;
>>>
>>
>> I haven't read the entire patch set yet, but taking a note here.
>> The hmm_vma_handle_pte() will unmap pte and then call
>> migration_entry_wait() to remap pte, so this may fail, we need to
>> handle this case like below:
>
> I don't see a problem here. Sure, hmm_vma_handle_pte() might return
> -EBUSY but that will get returned up to hmm_range_fault() which will
> retry the whole thing again and presumably fail when looking at the PMD.

Yeah. There is no problem with this and the modification to
migration_entry_wait() can be simplified. My previous thought was that
we can finish the retry logic in hmm_vma_walk_pmd() without handing it
over to the caller. :)

>
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index 6a151c09de5e..eb726ff0981c 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -276,7 +276,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>>  	if (is_migration_entry(entry)) {
>>  		pte_unmap(ptep);
>>  		hmm_vma_walk->last = addr;
>> -		migration_entry_wait(walk->mm, pmdp, addr);
>> +		if (!migration_entry_wait(walk->mm, pmdp, addr))
>> +			return -EAGAIN;
>>  		return -EBUSY;
>>  	}
>>
>> @@ -386,6 +387,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>>
>>  	r = hmm_vma_handle_pte(walk, addr, end, pmdp, ptep, hmm_pfns);
>>  	if (r) {
>> +		if (r == -EAGAIN)
>> +			goto again;
>>  		/* hmm_vma_handle_pte() did pte_unmap() */
>>  		return r;
>>  	}
>>
>> Of course, the migration_entry_wait() also needs to be modified.
>
On Tue, 23 May 2023, Qi Zheng wrote:
> On 2023/5/23 10:39, Alistair Popple wrote:
> > Qi Zheng <qi.zheng@linux.dev> writes:
> >> On 2023/5/22 13:05, Hugh Dickins wrote:
> >>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
> >>> again loop of its own, so take part in that if pte_offset_map() fails.
> >>> Signed-off-by: Hugh Dickins <hughd@google.com>
> >>> ---
> >>>  mm/hmm.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>> diff --git a/mm/hmm.c b/mm/hmm.c
> >>> index e23043345615..b1a9159d7c92 100644
> >>> --- a/mm/hmm.c
> >>> +++ b/mm/hmm.c
> >>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
> >>>  	}
> >>>  	ptep = pte_offset_map(pmdp, addr);
> >>> +	if (!ptep)
> >>> +		goto again;
> >>>  	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
> >>>  		int r;
> >>>
> >>
> >> I haven't read the entire patch set yet, but taking a note here.
> >> The hmm_vma_handle_pte() will unmap pte and then call
> >> migration_entry_wait() to remap pte, so this may fail, we need to
> >> handle this case like below:
> >
> > I don't see a problem here. Sure, hmm_vma_handle_pte() might return
> > -EBUSY but that will get returned up to hmm_range_fault() which will
> > retry the whole thing again and presumably fail when looking at the PMD.
>
> Yeah. There is no problem with this and the modification to
> migration_entry_wait() can be simplified. My previous thought was that
> we can finish the retry logic in hmm_vma_walk_pmd() without handing it
> over to the caller. :)

Okay, Alistair has resolved this one, thanks, I agree; but what is
"the modification to migration_entry_wait()" that you refer to there?

I don't think there's any need to make it a bool, it's normal for there
to be races on entry to migration_entry_wait(), and we're used to just
returning to caller (and back up to userspace) when it does not wait.

Hugh
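A hypothetical fragment illustrating that pattern (invented helper, not the real do_swap_page() code): the caller does not care whether migration_entry_wait() actually slept; either way it just returns, and the retried access sees whatever the pte has become.

/*
 * Illustration only, hypothetical helper (not real fault-handler code):
 * migration_entry_wait() may return without waiting if the pte changed
 * meanwhile, and the caller simply returns so the access is refaulted.
 */
static vm_fault_t wait_on_migration_sketch(struct vm_fault *vmf, swp_entry_t entry)
{
	if (is_migration_entry(entry))
		migration_entry_wait(vmf->vma->vm_mm, vmf->pmd, vmf->address);

	return 0;	/* back to the caller/userspace; the access retries */
}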
Hugh Dickins <hughd@google.com> writes:

> On Tue, 23 May 2023, Qi Zheng wrote:
>> On 2023/5/23 10:39, Alistair Popple wrote:
>> > Qi Zheng <qi.zheng@linux.dev> writes:
>> >> On 2023/5/22 13:05, Hugh Dickins wrote:
>> >>> hmm_vma_walk_pmd() is called through mm_walk, but already has a goto
>> >>> again loop of its own, so take part in that if pte_offset_map() fails.
>> >>> Signed-off-by: Hugh Dickins <hughd@google.com>
>> >>> ---
>> >>>  mm/hmm.c | 2 ++
>> >>>  1 file changed, 2 insertions(+)
>> >>> diff --git a/mm/hmm.c b/mm/hmm.c
>> >>> index e23043345615..b1a9159d7c92 100644
>> >>> --- a/mm/hmm.c
>> >>> +++ b/mm/hmm.c
>> >>> @@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
>> >>>  	}
>> >>>  	ptep = pte_offset_map(pmdp, addr);
>> >>> +	if (!ptep)
>> >>> +		goto again;
>> >>>  	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
>> >>>  		int r;
>> >>>
>> >>
>> >> I haven't read the entire patch set yet, but taking a note here.
>> >> The hmm_vma_handle_pte() will unmap pte and then call
>> >> migration_entry_wait() to remap pte, so this may fail, we need to
>> >> handle this case like below:
>> >
>> > I don't see a problem here. Sure, hmm_vma_handle_pte() might return
>> > -EBUSY but that will get returned up to hmm_range_fault() which will
>> > retry the whole thing again and presumably fail when looking at the PMD.
>>
>> Yeah. There is no problem with this and the modification to
>> migration_entry_wait() can be simplified. My previous thought was that
>> we can finish the retry logic in hmm_vma_walk_pmd() without handing it
>> over to the caller. :)
>
> Okay, Alistair has resolved this one, thanks, I agree; but what is
> "the modification to migration_entry_wait()" that you refer to there?
>
> I don't think there's any need to make it a bool, it's normal for there
> to be races on entry to migration_entry_wait(), and we're used to just
> returning to caller (and back up to userspace) when it does not wait.

Agreed. I didn't spot any places where returning to the caller without
actually waiting would cause looping. I assume any retries or refaults
will find the cleared PMD and fault/error out in some other manner
anyway.

hmm_range_fault() is the only place that might have been a bit special,
but it looks fine to me so:

Reviewed-by: Alistair Popple <apopple@nvidia.com>

> Hugh
diff --git a/mm/hmm.c b/mm/hmm.c
index e23043345615..b1a9159d7c92 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -381,6 +381,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	}
 
 	ptep = pte_offset_map(pmdp, addr);
+	if (!ptep)
+		goto again;
 	for (; addr < end; addr += PAGE_SIZE, ptep++, hmm_pfns++) {
 		int r;