Message ID | 20230327121317.4081816-10-arnd@kernel.org |
---|---|
State | New |
Headers | From: Arnd Bergmann <arnd@kernel.org>; To: linux-kernel@vger.kernel.org; Subject: [PATCH 09/21] riscv: dma-mapping: skip invalidation before bidirectional DMA; Date: Mon, 27 Mar 2023 14:13:05 +0200; Message-Id: <20230327121317.4081816-10-arnd@kernel.org>; In-Reply-To: <20230327121317.4081816-1-arnd@kernel.org> |
Series | dma-mapping: unify support for cache flushes |
Commit Message
Arnd Bergmann
March 27, 2023, 12:13 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
first to let the device see data written by the CPU, and invalidated
after the transfer to let the CPU see data written by the device.

riscv also invalidates the caches before the transfer, which does
not appear to serve any purpose.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/riscv/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
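For readers skimming the archive, here is a minimal sketch of the clean-before / invalidate-after pattern the commit message describes, modelled on the arch_sync_dma_*() hooks in arch/riscv/mm/dma-noncoherent.c as they look with this patch applied (simplified, not the verbatim kernel source; ALT_CMO_OP() selects the hardware cache-management operation for the range):

/* Sketch of the riscv streaming-DMA sync hooks with this patch applied. */
void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
                              enum dma_data_direction dir)
{
        void *vaddr = phys_to_virt(paddr);

        switch (dir) {
        case DMA_TO_DEVICE:
        case DMA_FROM_DEVICE:
                ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_BIDIRECTIONAL:
                /* clean only: write CPU data back so the device sees it;
                 * the previous 'flush' also invalidated, which serves no
                 * purpose before the transfer */
                ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        default:
                break;
        }
}

void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
                           enum dma_data_direction dir)
{
        void *vaddr = phys_to_virt(paddr);

        switch (dir) {
        case DMA_TO_DEVICE:
                break;
        case DMA_FROM_DEVICE:
        case DMA_BIDIRECTIONAL:
                /* after the transfer, drop cached copies so the CPU
                 * sees the data written by the device */
                ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
                break;
        default:
                break;
        }
}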
Comments
On Mon, Mar 27, 2023 at 02:13:05PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
> first to let the device see data written by the CPU, and invalidated
> after the transfer to let the CPU see data written by the device.
>
> riscv also invalidates the caches before the transfer, which does
> not appear to serve any purpose.

Rationale makes sense to me..

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Thanks for working on all of this Arnd!
On Mon, Mar 27, 2023 at 1:16 PM Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
> first to let the device see data written by the CPU, and invalidated
> after the transfer to let the CPU see data written by the device.
>
> riscv also invalidates the caches before the transfer, which does
> not appear to serve any purpose.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/riscv/mm/dma-noncoherent.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Cheers,
Prabhakar

> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index 640f4c496d26..69c80b2155a1 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         case DMA_BIDIRECTIONAL:
> -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> +               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         default:
>                 break;
> --
> 2.39.2
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
On Mon, 27 Mar 2023 05:13:05 PDT (-0700), arnd@kernel.org wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
> first to let the device see data written by the CPU, and invalidated
> after the transfer to let the CPU see data written by the device.
>
> riscv also invalidates the caches before the transfer, which does
> not appear to serve any purpose.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/riscv/mm/dma-noncoherent.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index 640f4c496d26..69c80b2155a1 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         case DMA_BIDIRECTIONAL:
> -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> +               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         default:
>                 break;

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
> first to let the device see data written by the CPU, and invalidated
> after the transfer to let the CPU see data written by the device.
>
> riscv also invalidates the caches before the transfer, which does
> not appear to serve any purpose.
Yes, we can't guarantee the CPU pre-load cache lines randomly during
dma working.

But I've two purposes to keep invalidates before dma transfer:
 - We clearly tell the CPU these cache lines are invalid. The caching
   algorithm would use these invalid slots first instead of replacing
   valid ones.
 - Invalidating is very cheap. Actually, flush and clean have the same
   performance in our machine.

So, how about:

diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index d919efab6eba..2c52fbc15064 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
                ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_FROM_DEVICE:
-               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-               break;
        case DMA_BIDIRECTIONAL:
                ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
                break;
@@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
                break;
        case DMA_FROM_DEVICE:
        case DMA_BIDIRECTIONAL:
        /* I'm not sure all drivers have guaranteed cacheline
           alignment. If not, this inval would cause problems */
-               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+               ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
                break;
        default:
                break;

>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/riscv/mm/dma-noncoherent.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index 640f4c496d26..69c80b2155a1 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         case DMA_BIDIRECTIONAL:
> -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> +               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         default:
>                 break;
> --
> 2.39.2
>
On Fri, May 5, 2023, at 07:47, Guo Ren wrote:
> On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann <arnd@kernel.org> wrote:
>>
>> riscv also invalidates the caches before the transfer, which does
>> not appear to serve any purpose.
> Yes, we can't guarantee the CPU pre-load cache lines randomly during
> dma working.
>
> But I've two purposes to keep invalidates before dma transfer:
> - We clearly tell the CPU these cache lines are invalid. The caching
>   algorithm would use these invalid slots first instead of replacing
>   valid ones.
> - Invalidating is very cheap. Actually, flush and clean have the same
>   performance in our machine.

The main purpose of the series was to get consistent behavior on
all machines, so I really don't want a custom optimization on
one architecture. You make a good point about cacheline reuse
after invalidation, but if we do that, I'd suggest doing this
across all architectures.

> So, how about:
>
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index d919efab6eba..2c52fbc15064 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         case DMA_FROM_DEVICE:
> -               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> -               break;
>         case DMA_BIDIRECTIONAL:
>                 ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
>                 break;

This is something we can consider. Unfortunately, this is something
that no architecture (except pa-risc, which has other problems)
does at the moment, so we'd probably need to have a proper debate
about this.

We already have two conflicting ways to handle DMA_FROM_DEVICE,
either invalidate/invalidate, or clean/invalidate. I can see
that flush/invalidate may be a sensible option as well, but I'd
want to have that discussion after the series is complete, so
we can come to a generic solution that has the same documented
behavior across all architectures.

In particular, if we end up moving arm64 and riscv back to the
traditional invalidate/invalidate for DMA_FROM_DEVICE and
document that driver must not rely on buffers getting cleaned
before a partial DMA_FROM_DEVICE, the question between clean
or flush becomes moot as well.

> @@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
>                 break;
>         case DMA_FROM_DEVICE:
>         case DMA_BIDIRECTIONAL:
>         /* I'm not sure all drivers have guaranteed cacheline
>            alignment. If not, this inval would cause problems */
> -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> +               ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
>                 break;

This is my original patch, and I would not mix it with the other
change. The problem with non-aligned DMA_BIDIRECTIONAL buffers in
is that both flush and inval would be wrong if you get simultaneous
writes from device and cpu to the same cache line, so there is
no way to win this. Using inval instead of flush would at least
work if the CPU data in the cacheline is read-only from the CPU,
so that seems better than something that is always wrong.

The documented API is that sharing the cache line is not allowed
at all, so anything that would observe a difference between the
two is also a bug. One idea that we have considered already is
that we could overwrite the unused bits of the cacheline with
poison values and/or mark them as invalid using KASAN for debugging
purposes, to find drivers that already violate this.

     Arnd
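As a reading aid, a rough sketch of the two DMA_FROM_DEVICE conventions contrasted above, written against the same ALT_CMO_OP() helpers quoted in the thread (illustrative only; neither function is a proposed patch, and the names are made up for this comparison):

/* What each convention does to a DMA_FROM_DEVICE streaming buffer.
 * The "map" helper runs at dma_map_*() time, the "unmap" helper at
 * dma_unmap_*()/sync-for-cpu time. */

/* Convention A: invalidate/invalidate (the traditional behaviour) */
static void from_device_map_traditional(void *vaddr, size_t size)
{
        /* discard CPU copies; cheap, and frees the lines for reuse */
        ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
}

/* Convention B: clean/invalidate (current arm64/riscv behaviour) */
static void from_device_map_clean(void *vaddr, size_t size)
{
        /* write back e.g. the allocator's zeroes so a short read from
         * the device cannot re-expose stale data (see the leak scenario
         * discussed further down the thread) */
        ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
}

/* Both conventions invalidate when the buffer goes back to the CPU */
static void from_device_unmap(void *vaddr, size_t size)
{
        ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
}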
On Fri, May 5, 2023 at 9:19 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, May 5, 2023, at 07:47, Guo Ren wrote:
> > On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann <arnd@kernel.org> wrote:
> >>
> >> riscv also invalidates the caches before the transfer, which does
> >> not appear to serve any purpose.
> > Yes, we can't guarantee the CPU pre-load cache lines randomly during
> > dma working.
> >
> > But I've two purposes to keep invalidates before dma transfer:
> > - We clearly tell the CPU these cache lines are invalid. The caching
> >   algorithm would use these invalid slots first instead of replacing
> >   valid ones.
> > - Invalidating is very cheap. Actually, flush and clean have the same
> >   performance in our machine.
>
> The main purpose of the series was to get consistent behavior on
> all machines, so I really don't want a custom optimization on
> one architecture. You make a good point about cacheline reuse
> after invalidation, but if we do that, I'd suggest doing this
> across all architectures.
Yes, invalidation of DMA_FROM_DEVICE-for_device is a proposal for all
architectures.

> > So, how about:
> >
> > diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> > index d919efab6eba..2c52fbc15064 100644
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
> > @@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
> >                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> >                 break;
> >         case DMA_FROM_DEVICE:
> > -               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> > -               break;
> >         case DMA_BIDIRECTIONAL:
> >                 ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> >                 break;
>
> This is something we can consider. Unfortunately, this is something
> that no architecture (except pa-risc, which has other problems)
> does at the moment, so we'd probably need to have a proper debate
> about this.
>
> We already have two conflicting ways to handle DMA_FROM_DEVICE,
> either invalidate/invalidate, or clean/invalidate. I can see
I vote to invalidate/invalidate.

My key point is to let DMA_FROM_DEVICE-for_device invalidate, and
DMA_BIDIRECTIONAL contains DMA_FROM_DEVICE. So I also agree:

@@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
                ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_FROM_DEVICE:
-               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+               ALT_CMO_OP(invalidate, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_BIDIRECTIONAL:
                ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
                break;

> that flush/invalidate may be a sensible option as well, but I'd
> want to have that discussion after the series is complete, so
> we can come to a generic solution that has the same documented
> behavior across all architectures.
Yes, I agree to unify them into a generic solution first. My proposal
could be another topic in the future.

For that purpose, I give
Acked-by: Guo Ren <guoren@kernel.org>

> In particular, if we end up moving arm64 and riscv back to the
> traditional invalidate/invalidate for DMA_FROM_DEVICE and
> document that driver must not rely on buffers getting cleaned
After invalidation, the cache lines are also cleaned, right? So why do
we need to document it additionally?

> before a partial DMA_FROM_DEVICE, the question between clean
> or flush becomes moot as well.
>
> > @@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
> >                 break;
> >         case DMA_FROM_DEVICE:
> >         case DMA_BIDIRECTIONAL:
> > /* I'm not sure all drivers have guaranteed cacheline
> >    alignment. If not, this inval would cause problems */
> > -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> > +               ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
> >                 break;
>
> This is my original patch, and I would not mix it with the other
> change. The problem with non-aligned DMA_BIDIRECTIONAL buffers in
> is that both flush and inval would be wrong if you get simultaneous
> writes from device and cpu to the same cache line, so there is
> no way to win this. Using inval instead of flush would at least
> work if the CPU data in the cacheline is read-only from the CPU,
> so that seems better than something that is always wrong.
If CPU data in the cacheline is read-only, the cacheline would never
be dirty. Yes, It's always safe.

Okay, I agree we must keep cache-line-aligned. I comment it here, just
worry some dirty drivers couldn't work with the "invalid mechanism"
because of the CPU data corruption, and device data in the cacheline
is useless.

> The documented API is that sharing the cache line is not allowed
> at all, so anything that would observe a difference between the
> two is also a bug. One idea that we have considered already is
> that we could overwrite the unused bits of the cacheline with
> poison values and/or mark them as invalid using KASAN for debugging
> purposes, to find drivers that already violate this.
>
> Arnd
On Sat, May 6, 2023, at 09:25, Guo Ren wrote:
> On Fri, May 5, 2023 at 9:19 PM Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> This is something we can consider. Unfortunately, this is something
>> that no architecture (except pa-risc, which has other problems)
>> does at the moment, so we'd probably need to have a proper debate
>> about this.
>>
>> We already have two conflicting ways to handle DMA_FROM_DEVICE,
>> either invalidate/invalidate, or clean/invalidate. I can see
> I vote to invalidate/invalidate.
> ...
>
>> that flush/invalidate may be a sensible option as well, but I'd
>> want to have that discussion after the series is complete, so
>> we can come to a generic solution that has the same documented
>> behavior across all architectures.
> Yes, I agree to unify them into a generic solution first. My proposal
> could be another topic in the future.

Right, I was explicitly trying to exclude that question from my
series, and left it as an architecture specific Kconfig option based
on the current behavior.

>> In particular, if we end up moving arm64 and riscv back to the
>> traditional invalidate/invalidate for DMA_FROM_DEVICE and
>> document that driver must not rely on buffers getting cleaned
> After invalidation, the cache lines are also cleaned, right? So why do
> we need to document it additionally?

I mentioned the debate in the cover letter, the full explanation is
archived at
https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/

In short, the problem that is addressed here is leaking sensitive
kernel data to user space or a device as in this sequence:

1. A DMA buffer is allocated in the kernel and contains stale
   data that is no longer needed but must not be exposed to
   untrusted userspace, i.e. encryption keys or user file pages
2. allocator uses memset() to clear out the buffer
3. buffer gets mapped into a device for DMA_FROM_DEVICE
4. writeback cache gets invalidated, uncovering the sensitive
   data by discarding the zeros
5. device returns less data than expected
6. buffer is unmapped
7. whole buffer is mapped or copied to user space

Will added his patch for arm64 to prevent this scenario by using
'clean' instead of 'invalidate' in step 4, and the same behavior got
copied to riscv but not most of the other architectures.

The dma-mapping documentation does not say anything about this case,
and an alternative approach would be to document that device drivers
must watch out for short reads in step 5, or that kzalloc() should
clean the cache in step 2. Both of these come at a cost as well.

      Arnd
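To make that sequence concrete, here is a hypothetical driver path in which steps 1-7 play out; the device, the entry point and start_dma_and_wait() are invented purely for illustration and are not from the thread:

#include <linux/dma-mapping.h>
#include <linux/slab.h>
#include <linux/uaccess.h>

/* Hypothetical read path mirroring steps 1-7 above. */
static int read_from_device(struct device *dev, void __user *ubuf)
{
        const size_t len = 4096;
        void *buf = kzalloc(len, GFP_KERNEL);   /* steps 1+2: zeroed buffer */
        dma_addr_t dma;
        size_t received;
        int ret = 0;

        if (!buf)
                return -ENOMEM;

        /* steps 3+4: mapping for DMA_FROM_DEVICE; with the invalidate
         * convention the cached zeros are discarded here, re-exposing
         * whatever stale data the page still holds in RAM */
        dma = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
        if (dma_mapping_error(dev, dma)) {
                kfree(buf);
                return -ENOMEM;
        }

        received = start_dma_and_wait(dev, dma, len);   /* step 5: may be short */

        dma_unmap_single(dev, dma, len, DMA_FROM_DEVICE);       /* step 6 */

        /* step 7: copying the full buffer leaks everything past
         * 'received' if the zeros were never written back to RAM */
        if (copy_to_user(ubuf, buf, len))
                ret = -EFAULT;

        kfree(buf);
        return ret;
}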
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index 640f4c496d26..69c80b2155a1 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
                ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_BIDIRECTIONAL:
-               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        default:
                break;
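For completeness, a hedged sketch of how a driver reaches the case changed above: on a non-coherent system using dma-direct, a streaming bidirectional mapping goes dma_map_single() -> dma_direct_map_page() -> arch_sync_dma_for_device(..., DMA_BIDIRECTIONAL). The wrapper below is illustrative only, not code from the thread:

#include <linux/dma-mapping.h>

/* Illustrative only: map a buffer that the device both reads and writes. */
static dma_addr_t map_bidir_buffer(struct device *dev, void *buf, size_t len)
{
        /* This call ends up in the DMA_BIDIRECTIONAL case patched above,
         * which now cleans (rather than flushes) the range before the
         * transfer. */
        dma_addr_t dma = dma_map_single(dev, buf, len, DMA_BIDIRECTIONAL);

        if (dma_mapping_error(dev, dma))
                return DMA_MAPPING_ERROR;

        /* After the transfer, dma_unmap_single(dev, dma, len,
         * DMA_BIDIRECTIONAL) invalidates via arch_sync_dma_for_cpu()
         * so the CPU sees the device's writes. */
        return dma;
}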