[v2,3/4] riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
Message ID | 20230727185553.980262-4-alexghiti@rivosinc.com |
---|---|
State | New |
Headers |
From: Alexandre Ghiti <alexghiti@rivosinc.com>
To: Will Deacon <will@kernel.org>, "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>, Andrew Morton <akpm@linux-foundation.org>, Nick Piggin <npiggin@gmail.com>, Peter Zijlstra <peterz@infradead.org>, Mayuresh Chitale <mchitale@ventanamicro.com>, Vincent Chen <vincent.chen@sifive.com>, Paul Walmsley <paul.walmsley@sifive.com>, Palmer Dabbelt <palmer@dabbelt.com>, Albert Ou <aou@eecs.berkeley.edu>, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Subject: [PATCH v2 3/4] riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
Date: Thu, 27 Jul 2023 20:55:52 +0200
Message-Id: <20230727185553.980262-4-alexghiti@rivosinc.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230727185553.980262-1-alexghiti@rivosinc.com>
References: <20230727185553.980262-1-alexghiti@rivosinc.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series |
riscv: tlb flush improvements
|
|
Commit Message
Alexandre Ghiti
July 27, 2023, 6:55 p.m. UTC
Currently, when the range to flush covers more than one page (a 4K page or
a hugepage), __flush_tlb_range() flushes the whole tlb. Flushing the whole
tlb comes with a greater cost than flushing a single entry so we should
flush single entries up to a certain threshold so that:
threshold * cost of flushing a single entry < cost of flushing the whole
tlb.

This threshold is microarchitecture dependent and can/should be
overwritten by vendors.

Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
 arch/riscv/mm/tlbflush.c | 41 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)
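The break-even condition in the commit message can be written out explicitly. The symbols below are notation introduced here for illustration and do not appear in the patch: $N$ is the number of entries in the range, $c_{\text{entry}}$ the cost of flushing one entry, and $c_{\text{all}}$ the cost of flushing the whole TLB.

```latex
\[
  N \cdot c_{\text{entry}} < c_{\text{all}}
  \quad\Longleftrightarrow\quad
  N < \frac{c_{\text{all}}}{c_{\text{entry}}}
\]
```

Read this way, the default threshold of 64 in this patch is the right cut-off only on hardware where a full flush costs roughly as much as 64 single-entry flushes; the actual ratio is microarchitecture dependent and is not measured anywhere in this thread.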
Comments
On Thu, Jul 27, 2023 at 08:55:52PM +0200, Alexandre Ghiti wrote:
> Currently, when the range to flush covers more than one page (a 4K page or
> a hugepage), __flush_tlb_range() flushes the whole tlb. Flushing the whole
> tlb comes with a greater cost than flushing a single entry so we should
> flush single entries up to a certain threshold so that:
> threshold * cost of flushing a single entry < cost of flushing the whole
> tlb.
>
> This threshold is microarchitecture dependent and can/should be
> overwritten by vendors.
>
> Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
> Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ---
>  arch/riscv/mm/tlbflush.c | 41 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index 3e4acef1f6bc..8017d2130e27 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -24,13 +24,48 @@ static inline void local_flush_tlb_page_asid(unsigned long addr,
>  			: "memory");
>  }
>
> +/*
> + * Flush entire TLB if number of entries to be flushed is greater
> + * than the threshold below. Platforms may override the threshold
> + * value based on marchid, mvendorid, and mimpid.
> + */
> +static unsigned long tlb_flush_all_threshold __read_mostly = 64;
> +
> +static void local_flush_tlb_range_threshold_asid(unsigned long start,
> +						 unsigned long size,
> +						 unsigned long stride,
> +						 unsigned long asid)
> +{
> +	u16 nr_ptes_in_range = DIV_ROUND_UP(size, stride);
> +	int i;
> +
> +	if (nr_ptes_in_range > tlb_flush_all_threshold) {
> +		if (asid != -1)
> +			local_flush_tlb_all_asid(asid);
> +		else
> +			local_flush_tlb_all();
> +		return;
> +	}
> +
> +	for (i = 0; i < nr_ptes_in_range; ++i) {
> +		if (asid != -1)
> +			local_flush_tlb_page_asid(start, asid);
> +		else
> +			local_flush_tlb_page(start);
> +		start += stride;
> +	}
> +}
> +
>  static inline void local_flush_tlb_range(unsigned long start,
>  		unsigned long size, unsigned long stride)
>  {
>  	if (size <= stride)
>  		local_flush_tlb_page(start);
> -	else
> +	else if (size == (unsigned long)-1)

The more we scatter this -1 around, especially now that we also need to
cast it, the more I think we should introduce a #define for it.

>  		local_flush_tlb_all();
> +	else
> +		local_flush_tlb_range_threshold_asid(start, size, stride, -1);
> +
>  }
>
>  static inline void local_flush_tlb_range_asid(unsigned long start,
> @@ -38,8 +73,10 @@ static inline void local_flush_tlb_range_asid(unsigned long start,
>  {
>  	if (size <= stride)
>  		local_flush_tlb_page_asid(start, asid);
> -	else
> +	else if (size == (unsigned long)-1)
>  		local_flush_tlb_all_asid(asid);
> +	else
> +		local_flush_tlb_range_threshold_asid(start, size, stride, asid);
>  }
>
>  static void __ipi_flush_tlb_all(void *info)
> --
> 2.39.2
>

Otherwise,

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew
On Thu, Jul 27, 2023 at 08:55:52PM +0200, Alexandre Ghiti wrote:
> Currently, when the range to flush covers more than one page (a 4K page or
> a hugepage), __flush_tlb_range() flushes the whole tlb. Flushing the whole
> tlb comes with a greater cost than flushing a single entry so we should
> flush single entries up to a certain threshold so that:
> threshold * cost of flushing a single entry < cost of flushing the whole
> tlb.
>
> This threshold is microarchitecture dependent and can/should be
> overwritten by vendors.

Please remove the latter part of this, as there is no infrastructure for
this at present, nor likely in the immediate future.

> Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
> Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ---
>  arch/riscv/mm/tlbflush.c | 41 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index 3e4acef1f6bc..8017d2130e27 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -24,13 +24,48 @@ static inline void local_flush_tlb_page_asid(unsigned long addr,
>  			: "memory");
>  }
>
> +/*
> + * Flush entire TLB if number of entries to be flushed is greater
> + * than the threshold below.
> Platforms may override the threshold
> + * value based on marchid, mvendorid, and mimpid.

And this too, as there is no infrastructure for this the comment is
misleading. This kind of thing should only be added when there is
actually a mechanism for doing so.

I did say I would think about how to do this, but I have not come up with
something. I dislike using the marchid/mvendorid/mimpid stuff if we can
avoid it, as there's no control over what actually gets put in there,
especially if people are going to use the open source cores.

Do we even, unless under extreme duress, want to allow setting custom
values here via firmware?
Sounds like a recipe for 1200 different alternatives or a big LUT...
On Fri, Jul 28, 2023 at 03:32:35PM +0200, Andrew Jones wrote:
> On Thu, Jul 27, 2023 at 08:55:52PM +0200, Alexandre Ghiti wrote:
> > +	else if (size == (unsigned long)-1)
>
> The more we scatter this -1 around, especially now that we also need to
> cast it, the more I think we should introduce a #define for it.

Please.
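For illustration only, the #define requested above could look something like the following userspace sketch. The names FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID, flush_covers_everything() and flush_has_no_asid() are assumptions made for this example; nothing in this thread settles on a name or on where such a definition would live.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical names, not taken from the patch: one sentinel for
 * "flush the whole address space" and one for "no specific ASID".
 * Keeping the cast inside the macro means callers never repeat it.
 */
#define FLUSH_TLB_MAX_SIZE	((unsigned long)-1)
#define FLUSH_TLB_NO_ASID	((unsigned long)-1)

/* Compile-time check that the sentinel is the largest unsigned long. */
static_assert(FLUSH_TLB_MAX_SIZE == ULONG_MAX, "sentinel must be all ones");

/* True when the caller asked for everything to be flushed. */
static inline bool flush_covers_everything(unsigned long size)
{
	return size == FLUSH_TLB_MAX_SIZE;
}

/* True when the flush is not restricted to a single ASID. */
static inline bool flush_has_no_asid(unsigned long asid)
{
	return asid == FLUSH_TLB_NO_ASID;
}
```

With definitions along these lines, the checks in the patch would read `size == FLUSH_TLB_MAX_SIZE` and `asid != FLUSH_TLB_NO_ASID` instead of comparing against a bare or cast -1.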
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 3e4acef1f6bc..8017d2130e27 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -24,13 +24,48 @@ static inline void local_flush_tlb_page_asid(unsigned long addr,
 			: "memory");
 }
 
+/*
+ * Flush entire TLB if number of entries to be flushed is greater
+ * than the threshold below. Platforms may override the threshold
+ * value based on marchid, mvendorid, and mimpid.
+ */
+static unsigned long tlb_flush_all_threshold __read_mostly = 64;
+
+static void local_flush_tlb_range_threshold_asid(unsigned long start,
+						 unsigned long size,
+						 unsigned long stride,
+						 unsigned long asid)
+{
+	u16 nr_ptes_in_range = DIV_ROUND_UP(size, stride);
+	int i;
+
+	if (nr_ptes_in_range > tlb_flush_all_threshold) {
+		if (asid != -1)
+			local_flush_tlb_all_asid(asid);
+		else
+			local_flush_tlb_all();
+		return;
+	}
+
+	for (i = 0; i < nr_ptes_in_range; ++i) {
+		if (asid != -1)
+			local_flush_tlb_page_asid(start, asid);
+		else
+			local_flush_tlb_page(start);
+		start += stride;
+	}
+}
+
 static inline void local_flush_tlb_range(unsigned long start,
 		unsigned long size, unsigned long stride)
 {
 	if (size <= stride)
 		local_flush_tlb_page(start);
-	else
+	else if (size == (unsigned long)-1)
 		local_flush_tlb_all();
+	else
+		local_flush_tlb_range_threshold_asid(start, size, stride, -1);
+
 }
 
 static inline void local_flush_tlb_range_asid(unsigned long start,
@@ -38,8 +73,10 @@ static inline void local_flush_tlb_range_asid(unsigned long start,
 {
 	if (size <= stride)
 		local_flush_tlb_page_asid(start, asid);
-	else
+	else if (size == (unsigned long)-1)
 		local_flush_tlb_all_asid(asid);
+	else
+		local_flush_tlb_range_threshold_asid(start, size, stride, asid);
 }
 
 static void __ipi_flush_tlb_all(void *info)
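The decision made by local_flush_tlb_range_threshold_asid() can be modelled in userspace to see which path a given range takes. This is a rough sketch, not kernel code: struct flush_stats, MODEL_FLUSH_ALL_THRESHOLD and model_flush_range() are hypothetical stand-ins that only count how many flushes of each kind would be issued, and the ASID handling is left out.

```c
#include <assert.h>

/* Hypothetical counters standing in for the real flush primitives. */
struct flush_stats {
	unsigned long page_flushes;	/* single-entry flushes issued */
	unsigned long full_flushes;	/* whole-TLB flushes issued */
};

/* Same default value as tlb_flush_all_threshold in the patch. */
#define MODEL_FLUSH_ALL_THRESHOLD	64UL

static void model_flush_range(struct flush_stats *st,
			      unsigned long size, unsigned long stride)
{
	unsigned long nr, i;

	assert(stride != 0);

	/*
	 * Round up so a partially covered last entry still counts.
	 * The count is kept in an unsigned long here; the patch stores
	 * it in a u16, which would wrap for ranges of more than 65535
	 * entries.
	 */
	nr = size / stride + (size % stride != 0);

	/* Too many entries: one full flush is assumed to be cheaper. */
	if (nr > MODEL_FLUSH_ALL_THRESHOLD) {
		st->full_flushes++;
		return;
	}

	/* Otherwise flush entry by entry, one per stride. */
	for (i = 0; i < nr; i++)
		st->page_flushes++;
}
```

With 4K pages, a 64-page range is flushed entry by entry while a 65-page range falls back to a single full flush, which is the boundary the threshold is meant to encode.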