Message ID | 20240301071918.64631-1-xuanzhuo@linux.alibaba.com |
---|---|
State | New |
Headers |
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: linux-kernel@vger.kernel.org
Cc: Robin Murphy <robin.murphy@arm.com>, Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>, Christoph Hellwig <hch@lst.de>, Marek Szyprowski <m.szyprowski@samsung.com>, iommu@lists.linux.dev, "Michael S. Tsirkin" <mst@redhat.com>, Zelin Deng <zelin.deng@linux.alibaba.com>
Subject: [RFC] dma-mapping: introduce dma_can_skip_unmap()
Date: Fri, 1 Mar 2024 15:19:18 +0800
Message-Id: <20240301071918.64631-1-xuanzhuo@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series | [RFC] dma-mapping: introduce dma_can_skip_unmap() |
Commit Message
Xuan Zhuo
March 1, 2024, 7:19 a.m. UTC
In a typical workflow, we first perform a DMA map on an address to obtain a DMA address, followed by a DMA unmap on that address. Generally, this process works without issues. However, under certain circumstances, we require additional resources to manage these DMA addresses. For instance, in layered architectures, we pass the DMA address to another module, but retrieving it back from that module can present some challenges. In such cases, we must allocate extra resources just to manage these DMA addresses.

However, considering that the DMA unmap operation is often a no-op, if we know in advance that unmap is not necessary, we can save this extra management overhead and skip the DMA unmap operation entirely. This would be advantageous.

This tries to resolve the problem of this patchset:

http://lore.kernel.org/all/20240225032330-mutt-send-email-mst@kernel.org

For a single packet, virtio-net may submit 1-19 DMA addresses to the virtio core. If virtio-net maintains the DMA addresses itself, it wastes too much memory when the unmap is not necessary. If virtio-net instead retrieves the DMA addresses of a packet from the virtio core, it has to fetch up to 19 DMA addresses in one call. Since having the net drivers maintain the DMA addresses is the long-term direction, we want a way to check whether the unmap is necessary or not.

Co-developed-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/iommu/dma-iommu.c   | 11 +++++++++++
 include/linux/dma-map-ops.h |  1 +
 include/linux/dma-mapping.h |  5 +++++
 kernel/dma/mapping.c        | 23 +++++++++++++++++++++++
 4 files changed, 40 insertions(+)
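To make the intended usage concrete, below is a minimal sketch (not part of the patch) of how a driver could consume the proposed dma_can_skip_unmap(): query it once at setup, and only allocate and use DMA-address storage when unmap is actually required. The ring structure, field names and helpers are hypothetical.

#include <linux/dma-mapping.h>
#include <linux/slab.h>

/*
 * Hypothetical example, not from the patch: the driver caches the
 * result of the proposed dma_can_skip_unmap() and, when unmap can be
 * skipped, neither stores DMA addresses nor calls dma_unmap_single().
 */
struct tx_ring {
	struct device	*dev;
	bool		need_unmap;	/* cached !dma_can_skip_unmap(dev) */
	dma_addr_t	*addrs;		/* allocated only when need_unmap */
	unsigned int	num;
};

static int tx_ring_init(struct tx_ring *ring, struct device *dev,
			unsigned int num)
{
	ring->dev = dev;
	ring->num = num;
	ring->need_unmap = !dma_can_skip_unmap(dev);

	if (!ring->need_unmap)
		return 0;	/* nothing to remember, nothing to unmap later */

	ring->addrs = kcalloc(num, sizeof(*ring->addrs), GFP_KERNEL);
	return ring->addrs ? 0 : -ENOMEM;
}

static void tx_ring_complete(struct tx_ring *ring, unsigned int idx, size_t len)
{
	/* The stored address is only needed when unmap is really required. */
	if (ring->need_unmap)
		dma_unmap_single(ring->dev, ring->addrs[idx], len, DMA_TO_DEVICE);
}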
Comments
On 2024-03-01 7:19 am, Xuan Zhuo wrote: > In a typical workflow, we first perform a dma map on an address to > obtain a dma address, followed by a dma unmap on that address. Generally, > this process works without issues. However, under certain circumstances, > we require additional resources to manage these dma addresses. For > instance, in layered architectures, we pass the dma address to another > module, but retrieving it back from that module can present some > challenges. In such cases, we must allocate extra resources to manage > these dma addresses. > > However, considering that many times the dma unmap operation is actually > a no-op, if we know in advance that unmap is not necessary, we can save > on these extra management overheads. Moreover, we can directly skip the > dma unmap operation. This would be advantageous. > > This tries to resolve the problem of patchset: > > http://lore.kernel.org/all/20240225032330-mutt-send-email-mst@kernel.org > > For a single packet, virtio-net may submit 1-19 dma addresses to virtio > core. If the virtio-net maintains the dma addresses will waste too much > memory when the unmap is not necessary. If the virtio-net retrieves the > dma addresses of the packet from the virtio core, we need to hold the 19 > dma addresses by one call. And the net drivers maintain the dma is the > future. So we hope to check the unmap is necessary or not. > > Co-developed-by: Zelin Deng <zelin.deng@linux.alibaba.com> > Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > --- > drivers/iommu/dma-iommu.c | 11 +++++++++++ > include/linux/dma-map-ops.h | 1 + > include/linux/dma-mapping.h | 5 +++++ > kernel/dma/mapping.c | 23 +++++++++++++++++++++++ > 4 files changed, 40 insertions(+) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 50ccc4f1ef81..8c661a0e1111 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -1706,6 +1706,16 @@ static size_t iommu_dma_opt_mapping_size(void) > return iova_rcache_range(); > } > > +static bool iommu_dma_opt_can_skip_unmap(struct device *dev) > +{ > + struct iommu_domain *domain = iommu_get_dma_domain(dev); > + > + if (domain->type == IOMMU_DOMAIN_IDENTITY) This is nonsense; iommu-dma does not operate on identity domains in the first place. 
> + return true; > + else > + return false; > +} > + > static const struct dma_map_ops iommu_dma_ops = { > .flags = DMA_F_PCI_P2PDMA_SUPPORTED, > .alloc = iommu_dma_alloc, > @@ -1728,6 +1738,7 @@ static const struct dma_map_ops iommu_dma_ops = { > .unmap_resource = iommu_dma_unmap_resource, > .get_merge_boundary = iommu_dma_get_merge_boundary, > .opt_mapping_size = iommu_dma_opt_mapping_size, > + .dma_can_skip_unmap = iommu_dma_opt_can_skip_unmap, > }; > > /* > diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h > index 4abc60f04209..d508fa90bc06 100644 > --- a/include/linux/dma-map-ops.h > +++ b/include/linux/dma-map-ops.h > @@ -83,6 +83,7 @@ struct dma_map_ops { > size_t (*max_mapping_size)(struct device *dev); > size_t (*opt_mapping_size)(void); > unsigned long (*get_merge_boundary)(struct device *dev); > + bool (*dma_can_skip_unmap)(struct device *dev); > }; > > #ifdef CONFIG_DMA_OPS > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h > index 4a658de44ee9..af5d9275f8cc 100644 > --- a/include/linux/dma-mapping.h > +++ b/include/linux/dma-mapping.h > @@ -140,6 +140,7 @@ int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma, > void *cpu_addr, dma_addr_t dma_addr, size_t size, > unsigned long attrs); > bool dma_can_mmap(struct device *dev); > +bool dma_can_skip_unmap(struct device *dev); > bool dma_pci_p2pdma_supported(struct device *dev); > int dma_set_mask(struct device *dev, u64 mask); > int dma_set_coherent_mask(struct device *dev, u64 mask); > @@ -249,6 +250,10 @@ static inline bool dma_can_mmap(struct device *dev) > { > return false; > } > +static inline bool dma_can_skip_unmap(struct device *dev) > +{ > + return false; > +} > static inline bool dma_pci_p2pdma_supported(struct device *dev) > { > return false; > diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c > index 58db8fd70471..99a81932820b 100644 > --- a/kernel/dma/mapping.c > +++ b/kernel/dma/mapping.c > @@ -445,6 +445,29 @@ bool dma_can_mmap(struct device *dev) > } > EXPORT_SYMBOL_GPL(dma_can_mmap); > > +/** > + * dma_can_skip_unmap - check if unmap can be skipped > + * @dev: device to check > + * > + * Returns %true if @dev supports direct map or dma_can_skip_unmap() return true. > + */ > +bool dma_can_skip_unmap(struct device *dev) > +{ > + const struct dma_map_ops *ops = get_dma_ops(dev); > + > + if (is_swiotlb_force_bounce(dev)) > + return false; > + > + if (dma_map_direct(dev, ops)) > + return true; And this is also broken and nonsensical. What about non-coherent cache maintenance, or regular non-forced SWIOTLB bouncing due to per-mapping address limitations or buffer alignment, or dma-debug? Not only is this idea not viable, the entire premise seems flawed - the reasons for virtio needing to use the DMA API at all are highly likely to be the same reasons for it needing to use the DMA API *properly* anyway. Thanks, Robin. > + > + if (ops->dma_can_skip_unmap) > + return ops->dma_can_skip_unmap(dev); > + > + return false; > +} > +EXPORT_SYMBOL_GPL(dma_can_skip_unmap); > + > /** > * dma_mmap_attrs - map a coherent DMA allocation into user space > * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote:
> Not only is this idea not viable, the entire premise seems flawed - the
> reasons for virtio needing to use the DMA API at all are highly likely to be
> the same reasons for it needing to use the DMA API *properly* anyway.

The idea has nothing to do with virtio per se - we are likely not the
only driver that wastes a lot of memory (hot in cache, too) keeping DMA
addresses around for the sole purpose of calling DMA unmap. On a bunch
of systems unmap is always a nop and we could save some memory if there
was a way to find out. What is proposed is an API extension allowing
that for anyone - not just virtio.
On 2024-03-01 11:50 am, Michael S. Tsirkin wrote:
> On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote:
>> Not only is this idea not viable, the entire premise seems flawed - the
>> reasons for virtio needing to use the DMA API at all are highly likely to be
>> the same reasons for it needing to use the DMA API *properly* anyway.
>
> The idea has nothing to do with virtio per se

Sure, I can see that, but if virtio is presented as the justification
for doing this then it's the justification I'm going to look at first.
And the fact is that it *does* seem to have particular significance,
since having up to 19 DMA addresses involved in a single transfer is
very much an outlier compared to typical hardware drivers. Furthermore
the fact that DMA API support was retrofitted to the established virtio
design means I would always expect it to run up against more challenges
than a hardware driver designed around the expectation that DMA buffers
have DMA addresses.

> - we are likely not the
> only driver that wastes a lot of memory (hot in cache, too) keeping DMA
> addresses around for the sole purpose of calling DMA unmap. On a bunch
> of systems unmap is always a nop and we could save some memory if there
> was a way to find out. What is proposed is an API extension allowing
> that for anyone - not just virtio.

And the point I'm making is that that "always" is a big assumption, and
in fact for the situations where it is robustly true we already have
the DEFINE_DMA_UNMAP_{ADDR,LEN} mechanism. I'd consider it rare for DMA
addresses to be stored in isolation, as opposed to being part of some
kind of buffer descriptor (or indeed struct scatterlist, for an obvious
example) that a driver or subsystem still has to keep track of anyway,
so in general I believe the scope for saving decidedly small amounts of
memory at runtime is also considerably less than you might be imagining.

Thanks,
Robin.
On Fri, Mar 01, 2024 at 12:42:39PM +0000, Robin Murphy wrote: > On 2024-03-01 11:50 am, Michael S. Tsirkin wrote: > > On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote: > > > Not only is this idea not viable, the entire premise seems flawed - the > > > reasons for virtio needing to use the DMA API at all are highly likely to be > > > the same reasons for it needing to use the DMA API *properly* anyway. > > > > The idea has nothing to do with virtio per se > > Sure, I can see that, but if virtio is presented as the justification for > doing this then it's the justification I'm going to look at first. And the > fact is that it *does* seem to have particular significance, since having up > to 19 DMA addresses involved in a single transfer is very much an outlier > compared to typical hardware drivers. That's a valid comment. Xuan Zhuo do other drivers do this too, could you check pls? > Furthermore the fact that DMA API > support was retrofitted to the established virtio design means I would > always expect it to run up against more challenges than a hardware driver > designed around the expectation that DMA buffers have DMA addresses. It seems virtio can't drive any DMA changes then it's forever tainted? Seems unfair - we retrofitted it years ago, enough refactoring happened since then. > > - we are likely not the > > only driver that wastes a lot of memory (hot in cache, too) keeping DMA > > addresses around for the sole purpose of calling DMA unmap. On a bunch > > of systems unmap is always a nop and we could save some memory if there > > was a way to find out. What is proposed is an API extension allowing > > that for anyone - not just virtio. > > And the point I'm making is that that "always" is a big assumption, and in > fact for the situations where it is robustly true we already have the > DEFINE_DMA_UNMAP_{ADDR,LEN} mechanism. > I'd consider it rare for DMA > addresses to be stored in isolation, as opposed to being part of some kind > of buffer descriptor (or indeed struct scatterlist, for an obvious example) > that a driver or subsystem still has to keep track of anyway, so in general > I believe the scope for saving decidedly small amounts of memory at runtime > is also considerably less than you might be imagining. > > Thanks, > Robin. Yes. DEFINE_DMA_UNMAP_ exits but that's only compile time. And I think the fact we have that mechanism is a hint that enough configurations could benefit from a runtime mechanism, too. E.g. since you mentioned scatterlist, it has a bunch of ifdefs in place. Of course - finding more examples would be benefitial to help maintainers do the cost/benefit analysis - a robust implementation is needed
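For reference, the existing compile-time mechanism mentioned in this exchange works roughly as sketched below: DEFINE_DMA_UNMAP_ADDR()/DEFINE_DMA_UNMAP_LEN() and their accessors compile away entirely when CONFIG_NEED_DMA_MAP_STATE is not set. The descriptor struct and helper functions here are illustrative, not taken from any particular driver.

#include <linux/dma-mapping.h>

/*
 * Hypothetical descriptor: with CONFIG_NEED_DMA_MAP_STATE unset, the
 * two DEFINE_* fields occupy no space and the set/get helpers expand
 * to nothing, so unmap state is only stored on configurations that
 * actually need it - but the decision is made at build time.
 */
struct buf_desc {
	void *cpu_addr;
	DEFINE_DMA_UNMAP_ADDR(dma);	/* dma_addr_t dma; or nothing */
	DEFINE_DMA_UNMAP_LEN(map_len);	/* __u32 map_len; or nothing */
};

static void buf_desc_map_done(struct buf_desc *d, dma_addr_t addr, size_t len)
{
	/* No-ops when the fields are compiled out. */
	dma_unmap_addr_set(d, dma, addr);
	dma_unmap_len_set(d, map_len, len);
}

static void buf_desc_unmap(struct device *dev, struct buf_desc *d)
{
	dma_unmap_single(dev, dma_unmap_addr(d, dma),
			 dma_unmap_len(d, map_len), DMA_TO_DEVICE);
}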
On 2024-03-01 1:41 pm, Michael S. Tsirkin wrote: > On Fri, Mar 01, 2024 at 12:42:39PM +0000, Robin Murphy wrote: >> On 2024-03-01 11:50 am, Michael S. Tsirkin wrote: >>> On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote: >>>> Not only is this idea not viable, the entire premise seems flawed - the >>>> reasons for virtio needing to use the DMA API at all are highly likely to be >>>> the same reasons for it needing to use the DMA API *properly* anyway. >>> >>> The idea has nothing to do with virtio per se >> >> Sure, I can see that, but if virtio is presented as the justification for >> doing this then it's the justification I'm going to look at first. And the >> fact is that it *does* seem to have particular significance, since having up >> to 19 DMA addresses involved in a single transfer is very much an outlier >> compared to typical hardware drivers. > > That's a valid comment. Xuan Zhuo do other drivers do this too, > could you check pls? > >> Furthermore the fact that DMA API >> support was retrofitted to the established virtio design means I would >> always expect it to run up against more challenges than a hardware driver >> designed around the expectation that DMA buffers have DMA addresses. > > > It seems virtio can't drive any DMA changes then it's forever tainted? > Seems unfair - we retrofitted it years ago, enough refactoring happened > since then. No, I'm not saying we couldn't still do things to help virtio if and when it does prove reasonable to do so; just that if anything it's *because* that retrofit is mature and fairly well polished by now that any remaining issues like this one are going to be found in the most awkward corners and thus unlikely to generalise. FWIW in my experience it seems more common for network drivers to actually have the opposite problem, where knowing the DMA address of a buffer is easy, but keeping track of the corresponding CPU address can be more of a pain. >>> - we are likely not the >>> only driver that wastes a lot of memory (hot in cache, too) keeping DMA >>> addresses around for the sole purpose of calling DMA unmap. On a bunch >>> of systems unmap is always a nop and we could save some memory if there >>> was a way to find out. What is proposed is an API extension allowing >>> that for anyone - not just virtio. >> >> And the point I'm making is that that "always" is a big assumption, and in >> fact for the situations where it is robustly true we already have the >> DEFINE_DMA_UNMAP_{ADDR,LEN} mechanism. >> I'd consider it rare for DMA >> addresses to be stored in isolation, as opposed to being part of some kind >> of buffer descriptor (or indeed struct scatterlist, for an obvious example) >> that a driver or subsystem still has to keep track of anyway, so in general >> I believe the scope for saving decidedly small amounts of memory at runtime >> is also considerably less than you might be imagining. >> >> Thanks, >> Robin. > > > Yes. DEFINE_DMA_UNMAP_ exits but that's only compile time. > And I think the fact we have that mechanism is a hint that > enough configurations could benefit from a runtime > mechanism, too. > > E.g. since you mentioned scatterlist, it has a bunch of ifdefs > in place. But what could that benefit be in general? It's not like we can change structure layouts on a per-DMA-mapping-call basis to save already-allocated memory... :/ Thanks, Robin. > > Of course > - finding more examples would be benefitial to help maintainers > do the cost/benefit analysis > - a robust implementation is needed > >
On Fri, Mar 01, 2024 at 06:04:10PM +0000, Robin Murphy wrote: > On 2024-03-01 1:41 pm, Michael S. Tsirkin wrote: > > On Fri, Mar 01, 2024 at 12:42:39PM +0000, Robin Murphy wrote: > > > On 2024-03-01 11:50 am, Michael S. Tsirkin wrote: > > > > On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote: > > > > > Not only is this idea not viable, the entire premise seems flawed - the > > > > > reasons for virtio needing to use the DMA API at all are highly likely to be > > > > > the same reasons for it needing to use the DMA API *properly* anyway. > > > > > > > > The idea has nothing to do with virtio per se > > > > > > Sure, I can see that, but if virtio is presented as the justification for > > > doing this then it's the justification I'm going to look at first. And the > > > fact is that it *does* seem to have particular significance, since having up > > > to 19 DMA addresses involved in a single transfer is very much an outlier > > > compared to typical hardware drivers. > > > > That's a valid comment. Xuan Zhuo do other drivers do this too, > > could you check pls? > > > > > Furthermore the fact that DMA API > > > support was retrofitted to the established virtio design means I would > > > always expect it to run up against more challenges than a hardware driver > > > designed around the expectation that DMA buffers have DMA addresses. > > > > > > It seems virtio can't drive any DMA changes then it's forever tainted? > > Seems unfair - we retrofitted it years ago, enough refactoring happened > > since then. > > No, I'm not saying we couldn't still do things to help virtio if and when it > does prove reasonable to do so; just that if anything it's *because* that > retrofit is mature and fairly well polished by now that any remaining issues > like this one are going to be found in the most awkward corners and thus > unlikely to generalise. > > FWIW in my experience it seems more common for network drivers to actually > have the opposite problem, where knowing the DMA address of a buffer is > easy, but keeping track of the corresponding CPU address can be more of a > pain. > > > > > - we are likely not the > > > > only driver that wastes a lot of memory (hot in cache, too) keeping DMA > > > > addresses around for the sole purpose of calling DMA unmap. On a bunch > > > > of systems unmap is always a nop and we could save some memory if there > > > > was a way to find out. What is proposed is an API extension allowing > > > > that for anyone - not just virtio. > > > > > > And the point I'm making is that that "always" is a big assumption, and in > > > fact for the situations where it is robustly true we already have the > > > DEFINE_DMA_UNMAP_{ADDR,LEN} mechanism. > > > I'd consider it rare for DMA > > > addresses to be stored in isolation, as opposed to being part of some kind > > > of buffer descriptor (or indeed struct scatterlist, for an obvious example) > > > that a driver or subsystem still has to keep track of anyway, so in general > > > I believe the scope for saving decidedly small amounts of memory at runtime > > > is also considerably less than you might be imagining. > > > > > > Thanks, > > > Robin. > > > > > > Yes. DEFINE_DMA_UNMAP_ exits but that's only compile time. > > And I think the fact we have that mechanism is a hint that > > enough configurations could benefit from a runtime > > mechanism, too. > > > > E.g. since you mentioned scatterlist, it has a bunch of ifdefs > > in place. > > But what could that benefit be in general? 
It's not like we can change > structure layouts on a per-DMA-mapping-call basis to save already-allocated > memory... :/ > > Thanks, > Robin. This is all speculation, but maybe e.g. by not writing into a cache line we can reduce pressure on the cache. Some other code and/or structure changes might or might not become benefitial. > > > > Of course > > - finding more examples would be benefitial to help maintainers > > do the cost/benefit analysis > > - a robust implementation is needed > > > >
On Fri, 1 Mar 2024 08:41:50 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Fri, Mar 01, 2024 at 12:42:39PM +0000, Robin Murphy wrote: > > On 2024-03-01 11:50 am, Michael S. Tsirkin wrote: > > > On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote: > > > > Not only is this idea not viable, the entire premise seems flawed - the > > > > reasons for virtio needing to use the DMA API at all are highly likely to be > > > > the same reasons for it needing to use the DMA API *properly* anyway. > > > > > > The idea has nothing to do with virtio per se > > > > Sure, I can see that, but if virtio is presented as the justification for > > doing this then it's the justification I'm going to look at first. And the > > fact is that it *does* seem to have particular significance, since having up > > to 19 DMA addresses involved in a single transfer is very much an outlier > > compared to typical hardware drivers. > > That's a valid comment. Xuan Zhuo do other drivers do this too, > could you check pls? I checked some drivers(gve, mlx, ice), no one has the same operation like virtio-net. We can do it because we have indirect feature. Thanks. > > > Furthermore the fact that DMA API > > support was retrofitted to the established virtio design means I would > > always expect it to run up against more challenges than a hardware driver > > designed around the expectation that DMA buffers have DMA addresses. > > > It seems virtio can't drive any DMA changes then it's forever tainted? > Seems unfair - we retrofitted it years ago, enough refactoring happened > since then. > > > > > - we are likely not the > > > only driver that wastes a lot of memory (hot in cache, too) keeping DMA > > > addresses around for the sole purpose of calling DMA unmap. On a bunch > > > of systems unmap is always a nop and we could save some memory if there > > > was a way to find out. What is proposed is an API extension allowing > > > that for anyone - not just virtio. > > > > And the point I'm making is that that "always" is a big assumption, and in > > fact for the situations where it is robustly true we already have the > > DEFINE_DMA_UNMAP_{ADDR,LEN} mechanism. > > I'd consider it rare for DMA > > addresses to be stored in isolation, as opposed to being part of some kind > > of buffer descriptor (or indeed struct scatterlist, for an obvious example) > > that a driver or subsystem still has to keep track of anyway, so in general > > I believe the scope for saving decidedly small amounts of memory at runtime > > is also considerably less than you might be imagining. > > > > Thanks, > > Robin. > > > Yes. DEFINE_DMA_UNMAP_ exits but that's only compile time. > And I think the fact we have that mechanism is a hint that > enough configurations could benefit from a runtime > mechanism, too. > > E.g. since you mentioned scatterlist, it has a bunch of ifdefs > in place. > > Of course > - finding more examples would be benefitial to help maintainers > do the cost/benefit analysis > - a robust implementation is needed > > > -- > MST >
On Fri, 1 Mar 2024 18:04:10 +0000, Robin Murphy <robin.murphy@arm.com> wrote: > On 2024-03-01 1:41 pm, Michael S. Tsirkin wrote: > > On Fri, Mar 01, 2024 at 12:42:39PM +0000, Robin Murphy wrote: > >> On 2024-03-01 11:50 am, Michael S. Tsirkin wrote: > >>> On Fri, Mar 01, 2024 at 11:38:25AM +0000, Robin Murphy wrote: > >>>> Not only is this idea not viable, the entire premise seems flawed - the > >>>> reasons for virtio needing to use the DMA API at all are highly likely to be > >>>> the same reasons for it needing to use the DMA API *properly* anyway. > >>> > >>> The idea has nothing to do with virtio per se > >> > >> Sure, I can see that, but if virtio is presented as the justification for > >> doing this then it's the justification I'm going to look at first. And the > >> fact is that it *does* seem to have particular significance, since having up > >> to 19 DMA addresses involved in a single transfer is very much an outlier > >> compared to typical hardware drivers. > > > > That's a valid comment. Xuan Zhuo do other drivers do this too, > > could you check pls? > > > >> Furthermore the fact that DMA API > >> support was retrofitted to the established virtio design means I would > >> always expect it to run up against more challenges than a hardware driver > >> designed around the expectation that DMA buffers have DMA addresses. > > > > > > It seems virtio can't drive any DMA changes then it's forever tainted? > > Seems unfair - we retrofitted it years ago, enough refactoring happened > > since then. > > No, I'm not saying we couldn't still do things to help virtio if and > when it does prove reasonable to do so; just that if anything it's > *because* that retrofit is mature and fairly well polished by now that > any remaining issues like this one are going to be found in the most > awkward corners and thus unlikely to generalise. > > FWIW in my experience it seems more common for network drivers to > actually have the opposite problem, where knowing the DMA address of a > buffer is easy, but keeping track of the corresponding CPU address can > be more of a pain. > > >>> - we are likely not the > >>> only driver that wastes a lot of memory (hot in cache, too) keeping DMA > >>> addresses around for the sole purpose of calling DMA unmap. On a bunch > >>> of systems unmap is always a nop and we could save some memory if there > >>> was a way to find out. What is proposed is an API extension allowing > >>> that for anyone - not just virtio. > >> > >> And the point I'm making is that that "always" is a big assumption, and in > >> fact for the situations where it is robustly true we already have the > >> DEFINE_DMA_UNMAP_{ADDR,LEN} mechanism. > >> I'd consider it rare for DMA > >> addresses to be stored in isolation, as opposed to being part of some kind > >> of buffer descriptor (or indeed struct scatterlist, for an obvious example) > >> that a driver or subsystem still has to keep track of anyway, so in general > >> I believe the scope for saving decidedly small amounts of memory at runtime > >> is also considerably less than you might be imagining. > >> > >> Thanks, > >> Robin. > > > > > > Yes. DEFINE_DMA_UNMAP_ exits but that's only compile time. > > And I think the fact we have that mechanism is a hint that > > enough configurations could benefit from a runtime > > mechanism, too. > > > > E.g. since you mentioned scatterlist, it has a bunch of ifdefs > > in place. > > But what could that benefit be in general? 
It's not like we can change > structure layouts on a per-DMA-mapping-call basis to save > already-allocated memory... :/ We can put the memory together. If the unmap is not needed, then we do not allocate the memory. That is the way we are trying. Thanks. > > Thanks, > Robin. > > > > > Of course > > - finding more examples would be benefitial to help maintainers > > do the cost/benefit analysis > > - a robust implementation is needed > > > >
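A rough illustration of the "put the memory together" approach described above, assuming the proposed dma_can_skip_unmap() helper: the DMA-address array is a flexible trailing member that is simply not allocated when unmap can be skipped. The structure and names are hypothetical, not the actual virtio-ring layout.

#include <linux/dma-mapping.h>
#include <linux/overflow.h>
#include <linux/slab.h>

/*
 * Hypothetical sketch: ring state and the per-descriptor DMA-address
 * array come from a single allocation, sized at runtime based on the
 * proposed dma_can_skip_unmap().
 */
struct ring_state {
	unsigned int	num;
	bool		need_unmap;
	dma_addr_t	addrs[];	/* only sized when need_unmap */
};

static struct ring_state *ring_state_alloc(struct device *dev, unsigned int num)
{
	bool need_unmap = !dma_can_skip_unmap(dev);
	size_t size = sizeof(struct ring_state);
	struct ring_state *rs;

	if (need_unmap)
		size += array_size(num, sizeof(dma_addr_t));

	rs = kzalloc(size, GFP_KERNEL);
	if (!rs)
		return NULL;

	rs->num = num;
	rs->need_unmap = need_unmap;
	return rs;
}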
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 50ccc4f1ef81..8c661a0e1111 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1706,6 +1706,16 @@ static size_t iommu_dma_opt_mapping_size(void)
 	return iova_rcache_range();
 }
 
+static bool iommu_dma_opt_can_skip_unmap(struct device *dev)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+
+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return true;
+	else
+		return false;
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.flags = DMA_F_PCI_P2PDMA_SUPPORTED,
 	.alloc = iommu_dma_alloc,
@@ -1728,6 +1738,7 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.unmap_resource = iommu_dma_unmap_resource,
 	.get_merge_boundary = iommu_dma_get_merge_boundary,
 	.opt_mapping_size = iommu_dma_opt_mapping_size,
+	.dma_can_skip_unmap = iommu_dma_opt_can_skip_unmap,
 };
 
 /*
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 4abc60f04209..d508fa90bc06 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -83,6 +83,7 @@ struct dma_map_ops {
 	size_t (*max_mapping_size)(struct device *dev);
 	size_t (*opt_mapping_size)(void);
 	unsigned long (*get_merge_boundary)(struct device *dev);
+	bool (*dma_can_skip_unmap)(struct device *dev);
 };
 
 #ifdef CONFIG_DMA_OPS
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 4a658de44ee9..af5d9275f8cc 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -140,6 +140,7 @@ int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs);
 bool dma_can_mmap(struct device *dev);
+bool dma_can_skip_unmap(struct device *dev);
 bool dma_pci_p2pdma_supported(struct device *dev);
 int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
@@ -249,6 +250,10 @@ static inline bool dma_can_mmap(struct device *dev)
 {
 	return false;
 }
+static inline bool dma_can_skip_unmap(struct device *dev)
+{
+	return false;
+}
 static inline bool dma_pci_p2pdma_supported(struct device *dev)
 {
 	return false;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 58db8fd70471..99a81932820b 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -445,6 +445,29 @@ bool dma_can_mmap(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_can_mmap);
 
+/**
+ * dma_can_skip_unmap - check if unmap can be skipped
+ * @dev: device to check
+ *
+ * Returns %true if @dev supports direct map or dma_can_skip_unmap() return true.
+ */
+bool dma_can_skip_unmap(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (is_swiotlb_force_bounce(dev))
+		return false;
+
+	if (dma_map_direct(dev, ops))
+		return true;
+
+	if (ops->dma_can_skip_unmap)
+		return ops->dma_can_skip_unmap(dev);
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(dma_can_skip_unmap);
+
 /**
  * dma_mmap_attrs - map a coherent DMA allocation into user space
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices