Message ID | 20230731074829.79309-5-wangkefeng.wang@huawei.com |
---|---|
State | New |
Headers |
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Mike Kravetz <mike.kravetz@oracle.com>, Muchun Song <muchun.song@linux.dev>, Mina Almasry <almasrymina@google.com>, kirill@shutemov.name, joel@joelfernandes.org, william.kucharski@oracle.com, kaleshsingh@google.com, linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
Date: Mon, 31 Jul 2023 15:48:29 +0800
Message-ID: <20230731074829.79309-5-wangkefeng.wang@huawei.com>
In-Reply-To: <20230731074829.79309-1-wangkefeng.wang@huawei.com>
References: <20230731074829.79309-1-wangkefeng.wang@huawei.com>
Series | mm: mremap: fix move page tables |
Commit Message
Kefeng Wang
July 31, 2023, 7:48 a.m. UTC
It is better to use huge_page_size() as the stride for HugeTLB pages
instead of PAGE_SIZE, as is already done in flush_pmd/pud_tlb_range();
this reduces the number of loop iterations in __flush_tlb_range().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
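As a rough illustration of the saving described in the commit message, here is a minimal standalone C sketch (ordinary userspace code, not part of the patch; the 2 MB huge page size, 4 KB base page size and 32 MB range are assumptions chosen for the example) that counts how many per-entry invalidations a stride-based walk such as __flush_tlb_range() performs for each stride choice.

/*
 * Hypothetical standalone sketch (not kernel code): how the stride choice
 * affects the number of per-entry TLBI operations a range walk issues.
 * Assumes 4 KB base pages and a 2 MB HugeTLB page size.
 */
#include <stdio.h>

#define PAGE_SIZE   (4UL * 1024)          /* base page size */
#define HPAGE_SIZE  (2UL * 1024 * 1024)   /* e.g. huge_page_size() for a 2 MB hstate */

static unsigned long tlbi_ops(unsigned long start, unsigned long end,
			      unsigned long stride)
{
	/* the range is walked in steps of 'stride' */
	return (end - start) / stride;
}

int main(void)
{
	unsigned long start = 0, end = 32UL * 1024 * 1024;   /* 32 MB hugetlb VMA */

	printf("stride=PAGE_SIZE : %lu TLBI ops\n", tlbi_ops(start, end, PAGE_SIZE));  /* 8192 */
	printf("stride=HPAGE_SIZE: %lu TLBI ops\n", tlbi_ops(start, end, HPAGE_SIZE)); /* 16 */
	return 0;
}

With a 2 MB stride, the walk over the same range drops from 8192 steps to 16.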
Comments
On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> It is better to use huge_page_size() for hugepage(HugeTLB) instead of
> PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
> it could reduce the loop in __flush_tlb_range().
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 412a3b9a3c25..25e35e6f8093 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>         dsb(ish);
>  }
>
> -static inline void flush_tlb_range(struct vm_area_struct *vma,
> -                                  unsigned long start, unsigned long end)
> -{
> -       /*
> -        * We cannot use leaf-only invalidation here, since we may be invalidating
> -        * table entries as part of collapsing hugepages or moving page tables.
> -        * Set the tlb_level to 0 because we can not get enough information here.
> -        */
> -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> -}
> +/*
> + * We cannot use leaf-only invalidation here, since we may be invalidating
> + * table entries as part of collapsing hugepages or moving page tables.
> + * Set the tlb_level to 0 because we can not get enough information here.
> + */
> +#define flush_tlb_range(vma, start, end)                        \
> +       __flush_tlb_range(vma, start, end,                       \
> +                         ((vma)->vm_flags & VM_HUGETLB)         \
> +                         ? huge_page_size(hstate_vma(vma))      \
> +                         : PAGE_SIZE, false, 0)
> +

seems like a good idea.

I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
we are going to support stride for other large folios as well, such as thp.

>
>  static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
>  {
> --
> 2.41.0
>

Thanks
Barry
On Mon, Jul 31, 2023 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >
> > It is better to use huge_page_size() for hugepage(HugeTLB) instead of
> > PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
> > it could reduce the loop in __flush_tlb_range().
> >
> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> > ---
> >  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
> >  1 file changed, 11 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> > index 412a3b9a3c25..25e35e6f8093 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >         dsb(ish);
> >  }
> >
> > -static inline void flush_tlb_range(struct vm_area_struct *vma,
> > -                                  unsigned long start, unsigned long end)
> > -{
> > -       /*
> > -        * We cannot use leaf-only invalidation here, since we may be invalidating
> > -        * table entries as part of collapsing hugepages or moving page tables.
> > -        * Set the tlb_level to 0 because we can not get enough information here.
> > -        */
> > -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> > -}
> > +/*
> > + * We cannot use leaf-only invalidation here, since we may be invalidating
> > + * table entries as part of collapsing hugepages or moving page tables.
> > + * Set the tlb_level to 0 because we can not get enough information here.
> > + */
> > +#define flush_tlb_range(vma, start, end)                        \
> > +       __flush_tlb_range(vma, start, end,                       \
> > +                         ((vma)->vm_flags & VM_HUGETLB)         \
> > +                         ? huge_page_size(hstate_vma(vma))      \
> > +                         : PAGE_SIZE, false, 0)
> > +
>
> seems like a good idea.
>
> I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
> we are going to support stride for other large folios as well, such as thp.
>

BTW, in most cases we have already had right stride:

arch/arm64/include/asm/tlb.h has already this to get stride:

static inline void tlb_flush(struct mmu_gather *tlb)
{
        struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
        bool last_level = !tlb->freed_tables;
        unsigned long stride = tlb_get_unmap_size(tlb);
        int tlb_level = tlb_get_level(tlb);

        /*
         * If we're tearing down the address space then we only care about
         * invalidating the walk-cache, since the ASID allocator won't
         * reallocate our ASID without invalidating the entire TLB.
         */
        if (tlb->fullmm) {
                if (!last_level)
                        flush_tlb_mm(tlb->mm);
                return;
        }

        __flush_tlb_range(&vma, tlb->start, tlb->end, stride,
                          last_level, tlb_level);
}

> >
> >  static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> >  {
> > --
> > 2.41.0
> >
>
> Thanks
> Barry
On Mon, Jul 31, 2023 at 5:29 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>
>
> On 2023/7/31 16:43, Barry Song wrote:
> > On Mon, Jul 31, 2023 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
> >>
> >> On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >>>
> >>> It is better to use huge_page_size() for hugepage(HugeTLB) instead of
> >>> PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
> >>> it could reduce the loop in __flush_tlb_range().
> >>>
> >>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> >>> ---
> >>>  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
> >>>  1 file changed, 11 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> >>> index 412a3b9a3c25..25e35e6f8093 100644
> >>> --- a/arch/arm64/include/asm/tlbflush.h
> >>> +++ b/arch/arm64/include/asm/tlbflush.h
> >>> @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >>>         dsb(ish);
> >>>  }
> >>>
> >>> -static inline void flush_tlb_range(struct vm_area_struct *vma,
> >>> -                                  unsigned long start, unsigned long end)
> >>> -{
> >>> -       /*
> >>> -        * We cannot use leaf-only invalidation here, since we may be invalidating
> >>> -        * table entries as part of collapsing hugepages or moving page tables.
> >>> -        * Set the tlb_level to 0 because we can not get enough information here.
> >>> -        */
> >>> -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> >>> -}
> >>> +/*
> >>> + * We cannot use leaf-only invalidation here, since we may be invalidating
> >>> + * table entries as part of collapsing hugepages or moving page tables.
> >>> + * Set the tlb_level to 0 because we can not get enough information here.
> >>> + */
> >>> +#define flush_tlb_range(vma, start, end)                        \
> >>> +       __flush_tlb_range(vma, start, end,                       \
> >>> +                         ((vma)->vm_flags & VM_HUGETLB)         \
> >>> +                         ? huge_page_size(hstate_vma(vma))      \
> >>> +                         : PAGE_SIZE, false, 0)
> >>> +
> >>
> >> seems like a good idea.
> >>
> >> I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
> >> we are going to support stride for other large folios as well, such as thp.
> >>
> >
> > BTW, in most cases we have already had right stride:
> >
> > arch/arm64/include/asm/tlb.h has already this to get stride:
>
> MMU_GATHER_PAGE_SIZE works for tlb_flush, but flush_tlb_range()
> directly called without mmu_gather, see above 3 patches is to
> use correct flush_[hugetlb/pmd/pud]_tlb_range(also there are
> some other places, like get_clear_contig_flush/clear_flush on arm64),
> so enable MMU_GATHER_PAGE_SIZE for arm64 is independent thing, right?
>

You are right. I was thinking of those zap_pte/pmd_range cases especially
for those vmas where large folios engage. but it is not very relevant.
In that case, one vma might have mixed different folio sizes.

your patch, for sure, will benefit hugetlb with arm64 contiguous bits.

Thanks
Barry
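As a side note on the mmu_gather discussion above, the sketch below is a simplified, self-contained model (the struct and helper names are invented for illustration; this is not the kernel's actual asm-generic/tlb.h code) of how a gather-style tracker can derive the flush stride from the lowest page-table level it has cleared. This is why the tlb_flush() path quoted earlier already receives a sensible stride, while direct flush_tlb_range() callers have no such information.

/*
 * Simplified, standalone illustration only. A gather structure records the
 * finest granule it has been unmapping; the flush stride is bounded by that
 * granule, since finer-grained entries may be cached individually.
 */
#include <stdbool.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PMD_SHIFT  21
#define PUD_SHIFT  30

struct gather {
	bool cleared_ptes;
	bool cleared_pmds;
	bool cleared_puds;
};

static unsigned long gather_stride(const struct gather *g)
{
	/* smallest cleared granule wins */
	if (g->cleared_ptes)
		return 1UL << PAGE_SHIFT;
	if (g->cleared_pmds)
		return 1UL << PMD_SHIFT;
	if (g->cleared_puds)
		return 1UL << PUD_SHIFT;
	return 1UL << PAGE_SHIFT;	/* conservative default */
}

int main(void)
{
	struct gather g = { .cleared_pmds = true };	/* e.g. tearing down 2 MB mappings */

	printf("stride = %lu bytes\n", gather_stride(&g));	/* 2097152 */
	return 0;
}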
On Mon, Jul 31, 2023 at 03:48:29PM +0800, Kefeng Wang wrote:
> +/*
> + * We cannot use leaf-only invalidation here, since we may be invalidating
> + * table entries as part of collapsing hugepages or moving page tables.
> + * Set the tlb_level to 0 because we can not get enough information here.
> + */
> +#define flush_tlb_range(vma, start, end)                        \
> +       __flush_tlb_range(vma, start, end,                       \
> +                         ((vma)->vm_flags & VM_HUGETLB)         \
> +                         ? huge_page_size(hstate_vma(vma))      \
> +                         : PAGE_SIZE, false, 0)

This won't work if we use the contiguous PTE to get 64K hugetlb pages on
a 4K base page configuration. The 16 base pages in the range would have
to be invalidated individually (the contig PTE bit is just a hint, the
hardware may or may not take it into account).
On 2023/7/31 19:11, Catalin Marinas wrote:
> On Mon, Jul 31, 2023 at 03:48:29PM +0800, Kefeng Wang wrote:
>> +/*
>> + * We cannot use leaf-only invalidation here, since we may be invalidating
>> + * table entries as part of collapsing hugepages or moving page tables.
>> + * Set the tlb_level to 0 because we can not get enough information here.
>> + */
>> +#define flush_tlb_range(vma, start, end)                        \
>> +       __flush_tlb_range(vma, start, end,                       \
>> +                         ((vma)->vm_flags & VM_HUGETLB)         \
>> +                         ? huge_page_size(hstate_vma(vma))      \
>> +                         : PAGE_SIZE, false, 0)
>
> This won't work if we use the contiguous PTE to get 64K hugetlb pages on
> a 4K base page configuration. The 16 base pages in the range would have
> to be invalidated individually (the contig PTE bit is just a hint, the
> hardware may or may not take it into account).

Got it, the contig huge page is depended on hardware implementation,
but for normal hugepage(2M/1G), we could use this, right?
>
On Mon, Jul 31, 2023 at 07:27:14PM +0800, Kefeng Wang wrote:
> On 2023/7/31 19:11, Catalin Marinas wrote:
> > On Mon, Jul 31, 2023 at 03:48:29PM +0800, Kefeng Wang wrote:
> > > +/*
> > > + * We cannot use leaf-only invalidation here, since we may be invalidating
> > > + * table entries as part of collapsing hugepages or moving page tables.
> > > + * Set the tlb_level to 0 because we can not get enough information here.
> > > + */
> > > +#define flush_tlb_range(vma, start, end)                        \
> > > +       __flush_tlb_range(vma, start, end,                       \
> > > +                         ((vma)->vm_flags & VM_HUGETLB)         \
> > > +                         ? huge_page_size(hstate_vma(vma))      \
> > > +                         : PAGE_SIZE, false, 0)
> >
> > This won't work if we use the contiguous PTE to get 64K hugetlb pages on
> > a 4K base page configuration. The 16 base pages in the range would have
> > to be invalidated individually (the contig PTE bit is just a hint, the
> > hardware may or may not take it into account).
>
> Got it, the contig huge page is depended on hardware implementation,
> but for normal hugepage(2M/1G), we could use this, right?

Right. Only the pmd/pud cases.
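Given the review outcome above, a follow-up version of the macro would presumably restrict the larger stride to PMD/PUD block mappings. The sketch below is one hedged way to express that; __hugetlb_flush_stride() is a hypothetical helper, not an existing kernel function, and contiguous-bit sizes conservatively fall back to PAGE_SIZE.

/*
 * Hypothetical helper (not an existing kernel API): pick a stride for a
 * hugetlb VMA. Only PMD/PUD block mappings use huge_page_size(); the
 * contiguous-bit sizes (e.g. 64K hugetlb pages with 4K base pages) fall
 * back to PAGE_SIZE, since the contig bit is only a hint and each base
 * entry may be cached in the TLB individually. (CONT_PMD sizes could
 * arguably use PMD_SIZE, but PAGE_SIZE is the conservative choice here.)
 */
static inline unsigned long __hugetlb_flush_stride(struct vm_area_struct *vma)
{
	unsigned long size = huge_page_size(hstate_vma(vma));

	if (size == PMD_SIZE || size == PUD_SIZE)
		return size;

	return PAGE_SIZE;
}

#define flush_tlb_range(vma, start, end)                        \
	__flush_tlb_range(vma, start, end,                      \
			  ((vma)->vm_flags & VM_HUGETLB)        \
			  ? __hugetlb_flush_stride(vma)         \
			  : PAGE_SIZE, false, 0)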
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..25e35e6f8093 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ish);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
-				   unsigned long start, unsigned long end)
-{
-	/*
-	 * We cannot use leaf-only invalidation here, since we may be invalidating
-	 * table entries as part of collapsing hugepages or moving page tables.
-	 * Set the tlb_level to 0 because we can not get enough information here.
-	 */
-	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
-}
+/*
+ * We cannot use leaf-only invalidation here, since we may be invalidating
+ * table entries as part of collapsing hugepages or moving page tables.
+ * Set the tlb_level to 0 because we can not get enough information here.
+ */
+#define flush_tlb_range(vma, start, end)                        \
+	__flush_tlb_range(vma, start, end,                      \
+			  ((vma)->vm_flags & VM_HUGETLB)        \
+			  ? huge_page_size(hstate_vma(vma))     \
+			  : PAGE_SIZE, false, 0)
+
 
 static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {