Message ID | 20231116012908.392077-5-peterx@redhat.com |
---|---|
State | New |
Headers |
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz <mike.kravetz@oracle.com>, "Kirill A . Shutemov" <kirill@shutemov.name>, Lorenzo Stoakes <lstoakes@gmail.com>, Axel Rasmussen <axelrasmussen@google.com>, Matthew Wilcox <willy@infradead.org>, John Hubbard <jhubbard@nvidia.com>, Mike Rapoport <rppt@kernel.org>, peterx@redhat.com, Hugh Dickins <hughd@google.com>, David Hildenbrand <david@redhat.com>, Andrea Arcangeli <aarcange@redhat.com>, Rik van Riel <riel@surriel.com>, James Houghton <jthoughton@google.com>, Yang Shi <shy828301@gmail.com>, Jason Gunthorpe <jgg@nvidia.com>, Vlastimil Babka <vbabka@suse.cz>, Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH RFC 04/12] mm: Introduce vma_pgtable_walk_{begin|end}()
Date: Wed, 15 Nov 2023 20:29:00 -0500
Message-ID: <20231116012908.392077-5-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20231116012908.392077-1-peterx@redhat.com>
References: <20231116012908.392077-1-peterx@redhat.com> |
Series | mm/gup: Unify hugetlb, part 2 |
Commit Message
Peter Xu
Nov. 16, 2023, 1:29 a.m. UTC
Introduce per-vma begin()/end() helpers for pgtable walks. This is
preparatory work for merging the hugetlb pgtable walkers with generic mm.
The helpers need to be called before and after a pgtable walk, and will
start to be needed once the pgtable walker code supports hugetlb pages.
They are a hook point for any type of VMA, but for now only hugetlb uses
them, to stabilize the pgtable pages so that they cannot go away during
the walk (due to possible pmd unsharing).
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/linux/mm.h | 3 +++
mm/memory.c | 12 ++++++++++++
2 files changed, 15 insertions(+)
Comments
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed, Nov 22, 2023 at 11:24:26PM -0800, Christoph Hellwig wrote:
> Looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

One thing I'd like to double check: the email address in the R-b is different from the one in From:. Should I always use lst.de for both tags and CCs in my future versions? Let me know if that matters.

Thanks,
On Fri, Nov 24, 2023 at 09:32:13AM +0530, Aneesh Kumar K.V wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > Introduce per-vma begin()/end() helpers for pgtable walks. This is a
> > preparation work to merge hugetlb pgtable walkers with generic mm.
> >
> > The helpers need to be called before and after a pgtable walk, will start
> > to be needed if the pgtable walker code supports hugetlb pages. It's a
> > hook point for any type of VMA, but for now only hugetlb uses it to
> > stabilize the pgtable pages from getting away (due to possible pmd
> > unsharing).
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/linux/mm.h |  3 +++
> >  mm/memory.c        | 12 ++++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 64cd1ee4aacc..349232dd20fb 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4154,4 +4154,7 @@ static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
> >  	return range_contains_unaccepted_memory(paddr, paddr + PAGE_SIZE);
> >  }
> >
> > +void vma_pgtable_walk_begin(struct vm_area_struct *vma);
> > +void vma_pgtable_walk_end(struct vm_area_struct *vma);
> > +
> >  #endif /* _LINUX_MM_H */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e27e2e5beb3f..3a6434b40d87 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -6180,3 +6180,15 @@ void ptlock_free(struct ptdesc *ptdesc)
> >  	kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
> >  }
> >  #endif
> > +
> > +void vma_pgtable_walk_begin(struct vm_area_struct *vma)
> > +{
> > +	if (is_vm_hugetlb_page(vma))
> > +		hugetlb_vma_lock_read(vma);
> > +}
> >
>
> That is required only if we support pmd sharing?

Correct.

Note that for this specific gup code path we're not changing the lock behavior, because hugetlb_follow_page_mask() already calls hugetlb_vma_lock_read() in the same way, also unconditionally.

It makes things even more complicated if we consider the recent private mapping change that Rik introduced in bf4916922c. I think it means we'll also take that lock if a private lock is allocated, but I'm not really sure whether that's necessary for all pgtable walks: the hugetlb vma lock is currently taken in most walk paths, and only some special paths take the i_mmap rwsem instead of the vma lock.

Per my current understanding, the private lock was only for avoiding a race between truncate & zapping. I have a feeling that there may be a better way to do this than attaching different functions to the same lock (or lock API).

In summary, the hugetlb vma lock is still complicated and may be prone to further refactoring. But all of that needs further investigation. This series can hopefully be seen as completely separate from that for now.

Thanks,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 64cd1ee4aacc..349232dd20fb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4154,4 +4154,7 @@ static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
 	return range_contains_unaccepted_memory(paddr, paddr + PAGE_SIZE);
 }
 
+void vma_pgtable_walk_begin(struct vm_area_struct *vma);
+void vma_pgtable_walk_end(struct vm_area_struct *vma);
+
 #endif /* _LINUX_MM_H */
diff --git a/mm/memory.c b/mm/memory.c
index e27e2e5beb3f..3a6434b40d87 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6180,3 +6180,15 @@ void ptlock_free(struct ptdesc *ptdesc)
 	kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
 }
 #endif
+
+void vma_pgtable_walk_begin(struct vm_area_struct *vma)
+{
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_lock_read(vma);
+}
+
+void vma_pgtable_walk_end(struct vm_area_struct *vma)
+{
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_unlock_read(vma);
+}
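
For readers who want to see the begin()/end() pairing in isolation, here is a minimal userspace sketch of the calling convention the patch sets up. It is not kernel code and not part of the series: `struct model_vma`, `model_walk_begin()`, `model_walk_end()` and `model_walk()` are hypothetical names, and the hugetlb vma lock is modeled by a plain counter of read holders rather than a real rwsem. The only point it illustrates is that the hook is a no-op for non-hugetlb VMAs and that every begin must be matched by an end around the walk.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vm_area_struct; NOT the kernel type. */
struct model_vma {
	bool is_hugetlb;          /* models is_vm_hugetlb_page(vma) */
	atomic_int read_holders;  /* models read holders of the hugetlb vma lock */
};

static void model_vma_init(struct model_vma *vma, bool is_hugetlb)
{
	vma->is_hugetlb = is_hugetlb;
	atomic_init(&vma->read_holders, 0);
}

/* Mirrors the shape of vma_pgtable_walk_begin(): only hugetlb takes the lock. */
static void model_walk_begin(struct model_vma *vma)
{
	if (vma->is_hugetlb)
		atomic_fetch_add(&vma->read_holders, 1);
}

/* Mirrors the shape of vma_pgtable_walk_end(): drops what begin() took. */
static void model_walk_end(struct model_vma *vma)
{
	if (vma->is_hugetlb) {
		/* An end() without a matching begin() would be a caller bug. */
		assert(atomic_load(&vma->read_holders) > 0);
		atomic_fetch_sub(&vma->read_holders, 1);
	}
}

/*
 * A placeholder "walk": returns how many read holders were observed while
 * the walk was in progress, so the bracketing is visible to the caller.
 */
static int model_walk(struct model_vma *vma)
{
	int held;

	model_walk_begin(vma);
	held = atomic_load(&vma->read_holders);  /* the walk body would go here */
	model_walk_end(vma);

	return held;
}
```

In this model, a walk over a VMA initialized with `is_hugetlb = true` holds the lock for the duration of the walk body and releases it afterwards, while a walk over any other VMA never touches the counter. Whether the real lock should be taken unconditionally for hugetlb or only when pmd sharing is possible is exactly the open question raised in the review above, and the sketch takes no position on it.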