From patchwork Mon Mar 20 12:28:13 2023
From: Petr Tesarik
To: Jonathan Corbet, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
    Borislav Petkov, "Paul E. McKenney", Andrew Morton, Randy Dunlap,
    Damien Le Moal, Kim Phillips, "Steven Rostedt (Google)",
    linux-doc@vger.kernel.org (open list:DOCUMENTATION),
    linux-kernel@vger.kernel.org (open list),
    iommu@lists.linux.dev (open list:DMA MAPPING HELPERS)
Cc: Roberto Sassu, petr@tesarici.cz
Subject: [RFC v1 1/4] dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute
Date: Mon, 20 Mar 2023 13:28:13 +0100

From: Petr Tesarik

Introduce a DMA attribute to tell the DMA-mapping subsystem that the
operation is allowed to sleep.

This patch merely adds the flag; it is not used for anything at the
moment. It is intended for callers that can sleep (e.g. dma-buf ioctls)
to allow page reclaim and/or allocations from CMA.

Signed-off-by: Petr Tesarik
---
 Documentation/core-api/dma-attributes.rst | 10 ++++++++++
 include/linux/dma-mapping.h               |  6 ++++++
 2 files changed, 16 insertions(+)

diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..6481ce2acf5d 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,3 +130,13 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
 subsystem that the buffer is fully accessible at the elevated privilege level
 (and ideally inaccessible or at least read-only at the lesser-privileged
 levels).
+
+DMA_ATTR_MAY_SLEEP
+------------------
+
+This tells the DMA-mapping subsystem that it is allowed to sleep. For example,
+if mapping needs a bounce buffer, software IO TLB may use CMA for the
+allocation if this flag is given.
+
+This attribute is not used for dma_alloc_* functions. Instead, the provided
+GFP flags are used to determine whether the allocation may sleep.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0ee20b764000..7a75c503ac38 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,6 +61,12 @@
  */
 #define DMA_ATTR_PRIVILEGED		(1UL << 9)
 
+/*
+ * DMA_ATTR_MAY_SLEEP: This tells the DMA-mapping subsystem that it is allowed
+ * to sleep.
+ */
+#define DMA_ATTR_MAY_SLEEP		(1UL << 10)
+
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform. It can
  * be given to a device to use as a DMA source or target. It is specific to a
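As an illustration of the intended use, a driver mapping a streaming
buffer from a sleepable context (such as an ioctl handler) could pass
the new attribute as follows. This is a minimal sketch, not part of the
patch; example_map() and its parameters are hypothetical:

#include <linux/dma-mapping.h>

/* Map a buffer from a context that is allowed to sleep (sketch). */
static int example_map(struct device *dev, void *buf, size_t size,
		       dma_addr_t *dma)
{
	/* DMA_ATTR_MAY_SLEEP permits page reclaim / CMA allocation. */
	*dma = dma_map_single_attrs(dev, buf, size, DMA_TO_DEVICE,
				    DMA_ATTR_MAY_SLEEP);
	if (dma_mapping_error(dev, *dma))
		return -ENOMEM;
	return 0;
}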
From patchwork Mon Mar 20 12:28:14 2023
From: Petr Tesarik
To: Jonathan Corbet, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
    Borislav Petkov, "Paul E. McKenney", Andrew Morton, Randy Dunlap,
    Damien Le Moal, Kim Phillips, "Steven Rostedt (Google)",
    linux-doc@vger.kernel.org (open list:DOCUMENTATION),
    linux-kernel@vger.kernel.org (open list),
    iommu@lists.linux.dev (open list:DMA MAPPING HELPERS)
Cc: Roberto Sassu, petr@tesarici.cz
Subject: [RFC v1 2/4] swiotlb: Move code around in preparation for dynamic bounce buffers
Date: Mon, 20 Mar 2023 13:28:14 +0100
Message-Id: <932dee179c950e98713f8636f0c9d95a6a37b640.1679309810.git.petr.tesarik.ext@huawei.com>

From: Petr Tesarik

In preparation for the introduction of dynamically allocated bounce
buffers, separate out common code and the code which handles
non-dynamic (fixed) bounce buffers.

No functional change, but this commit should make the addition of
dynamic allocations easier to review.

Signed-off-by: Petr Tesarik
---
 include/linux/swiotlb.h |  7 ++++-
 kernel/dma/swiotlb.c    | 64 +++++++++++++++++++++++++++++------------
 2 files changed, 52 insertions(+), 19 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 35bc4e281c21..b71adba03dc7 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -105,11 +105,16 @@ struct io_tlb_mem {
 };
 extern struct io_tlb_mem io_tlb_default_mem;
 
+static inline bool is_swiotlb_fixed(struct io_tlb_mem *mem, phys_addr_t paddr)
+{
+	return paddr >= mem->start && paddr < mem->end;
+}
+
 static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 {
 	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 
-	return mem && paddr >= mem->start && paddr < mem->end;
+	return mem && is_swiotlb_fixed(mem, paddr);
 }
 
 static inline bool is_swiotlb_force_bounce(struct device *dev)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index a34c38bbe28f..e8608bcb205e 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -78,6 +78,10 @@ phys_addr_t swiotlb_unencrypted_base;
 static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
 static unsigned long default_nareas;
 
+static void swiotlb_copy(struct device *dev, phys_addr_t orig_addr,
+		unsigned char *vaddr, size_t size, size_t alloc_size,
+		unsigned int tlb_offset, enum dma_data_direction dir);
+
 /**
  * struct io_tlb_area - IO TLB memory area descriptor
  *
@@ -530,7 +534,6 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size
 	int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
 	phys_addr_t orig_addr = mem->slots[index].orig_addr;
 	size_t alloc_size = mem->slots[index].alloc_size;
-	unsigned long pfn = PFN_DOWN(orig_addr);
 	unsigned char *vaddr = mem->vaddr + tlb_addr - mem->start;
 	unsigned int tlb_offset, orig_addr_offset;
 
@@ -547,6 +550,18 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size
 	}
 	tlb_offset -= orig_addr_offset;
 
+	swiotlb_copy(dev, orig_addr, vaddr, size, alloc_size, tlb_offset, dir);
+}
+
+/*
+ * Copy swiotlb buffer content, checking for overflows.
+ */
+static void swiotlb_copy(struct device *dev, phys_addr_t orig_addr,
+		unsigned char *vaddr, size_t size, size_t alloc_size,
+		unsigned int tlb_offset, enum dma_data_direction dir)
+{
+	unsigned long pfn = PFN_DOWN(orig_addr);
+
 	if (tlb_offset > alloc_size) {
 		dev_WARN_ONCE(dev, 1,
			"Buffer overflow detected. Allocation size: %zu. Mapping size: %zu+%u.\n",
@@ -738,15 +753,35 @@ static unsigned long mem_used(struct io_tlb_mem *mem)
 	return used;
 }
 
+static phys_addr_t swiotlb_fixed_map(struct device *dev, phys_addr_t orig_addr,
+		size_t alloc_size, unsigned int alloc_align_mask,
+		unsigned long attrs)
+{
+	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
+	int index = swiotlb_find_slots(dev, orig_addr,
+				       alloc_size + offset, alloc_align_mask);
+	unsigned int i;
+
+	if (index == -1)
+		return (phys_addr_t)DMA_MAPPING_ERROR;
+
+	/*
+	 * Save away the mapping from the original address to the DMA address.
+	 * This is needed when we sync the memory. Then we sync the buffer if
+	 * needed.
+	 */
+	for (i = 0; i < nr_slots(alloc_size + offset); i++)
+		mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
+	return slot_addr(mem->start, index) + offset;
+}
+
 phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 		size_t mapping_size, size_t alloc_size,
 		unsigned int alloc_align_mask, enum dma_data_direction dir,
 		unsigned long attrs)
 {
 	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
-	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
-	unsigned int i;
-	int index;
 	phys_addr_t tlb_addr;
 
 	if (!mem || !mem->nslabs) {
@@ -764,24 +799,17 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
-	index = swiotlb_find_slots(dev, orig_addr,
-				   alloc_size + offset, alloc_align_mask);
-	if (index == -1) {
+	tlb_addr = swiotlb_fixed_map(dev, orig_addr, alloc_size,
+				     alloc_align_mask, attrs);
+
+	if (tlb_addr == (phys_addr_t)DMA_MAPPING_ERROR) {
 		if (!(attrs & DMA_ATTR_NO_WARN))
 			dev_warn_ratelimited(dev,
-	"swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
-				 alloc_size, mem->nslabs, mem_used(mem));
-		return (phys_addr_t)DMA_MAPPING_ERROR;
+	"swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
+				alloc_size, mem->nslabs, mem_used(mem));
+		return tlb_addr;
 	}
 
-	/*
-	 * Save away the mapping from the original address to the DMA address.
-	 * This is needed when we sync the memory. Then we sync the buffer if
-	 * needed.
-	 */
-	for (i = 0; i < nr_slots(alloc_size + offset); i++)
-		mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
-	tlb_addr = slot_addr(mem->start, index) + offset;
 
 	/*
 	 * When dir == DMA_FROM_DEVICE we could omit the copy from the orig
 	 * to the tlb buffer, if we knew for sure the device will
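The value of factoring out swiotlb_fixed_map() and having it return the
DMA_MAPPING_ERROR sentinel shows up in the next patch: alternative
allocation strategies can be chained by testing the sentinel. A sketch
of where this is headed (the callee names are taken from patch 3 of
this series; the wrapper function itself is illustrative only):

static phys_addr_t swiotlb_map_sketch(struct device *dev,
		phys_addr_t orig_addr, size_t alloc_size,
		unsigned int alloc_align_mask, enum dma_data_direction dir,
		unsigned long attrs)
{
	phys_addr_t tlb_addr = (phys_addr_t)DMA_MAPPING_ERROR;

	/* Try a dynamically allocated bounce buffer first... */
	if (!is_swiotlb_for_alloc(dev))
		tlb_addr = swiotlb_dyn_map(dev, orig_addr, alloc_size,
					   alloc_align_mask, dir, attrs);

	/* ...and fall back to the fixed pool if that fails. */
	if (tlb_addr == (phys_addr_t)DMA_MAPPING_ERROR)
		tlb_addr = swiotlb_fixed_map(dev, orig_addr, alloc_size,
					     alloc_align_mask, attrs);
	return tlb_addr;
}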
From patchwork Mon Mar 20 12:28:15 2023
From: Petr Tesarik
To: Jonathan Corbet, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
    Borislav Petkov, "Paul E. McKenney", Andrew Morton, Randy Dunlap,
    Damien Le Moal, Kim Phillips, "Steven Rostedt (Google)",
    linux-doc@vger.kernel.org (open list:DOCUMENTATION),
    linux-kernel@vger.kernel.org (open list),
    iommu@lists.linux.dev (open list:DMA MAPPING HELPERS)
Cc: Roberto Sassu, petr@tesarici.cz
Subject: [RFC v1 3/4] swiotlb: Allow dynamic allocation of bounce buffers
Date: Mon, 20 Mar 2023 13:28:15 +0100
Message-Id: <0334a54332ab75312c9de825548b616439dcc9f5.1679309810.git.petr.tesarik.ext@huawei.com>

From: Petr Tesarik

The software IO TLB was designed with the assumption that it is not
used much, especially on 64-bit systems, so a small fixed memory area
(currently 64 MiB) is sufficient to handle the few cases which still
require a bounce buffer. However, these cases are not so rare in some
circumstances.

First, if SEV is active, all DMA must be done through shared
unencrypted pages, and SWIOTLB is used to make this happen without
changing device drivers. The software IO TLB size is increased to 6%
of total memory in sev_setup_arch(), but that is more of an
approximation. The actual requirements may vary depending on which
drivers are used and the amount of I/O.

Second, on the Raspberry Pi 4, swiotlb is used by dma-buf for pages
moved from the rendering GPU (v3d driver), which can access all memory,
to the display output (vc4 driver), which is connected to a bus with an
address limit of 1 GiB and no IOMMU. These buffers can be large
(several megabytes) and cannot be handled by SWIOTLB, because they
exceed the maximum segment size of 256 KiB. Such mapping failures can
be easily reproduced on a Raspberry Pi 4: starting GNOME remote desktop
results in a flood of kernel messages like these:

[  387.937625] vc4-drm gpu: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 3136 (slots)
[  387.960381] vc4-drm gpu: swiotlb buffer is full (sz: 815104 bytes), total 32768 (slots), used 2 (slots)

This second example cannot even be solved without increasing the
segment size (and the complexity of {map,unmap}_single). At that point,
it's better to allocate bounce buffers dynamically with
dma_direct_alloc_pages().

One caveat is that the DMA API often takes only the address of a
buffer, and the implementation (direct or IOMMU) checks whether it
belongs to the software IO TLB. This is easy if the IO TLB is a single
chunk of physically contiguous memory, but not if some buffers are
allocated dynamically. Testing on a Raspberry Pi 4 shows that there can
be 1k+ such buffers. This requires something better than a linked list.
I'm using a maple tree to track dynamically allocated buffers.

This data structure was invented for a similar use case, but there are
some challenges:

1. The value is limited to ULONG_MAX, which is too little both for
   physical addresses (e.g. x86 PAE or 32-bit ARM LPAE) and DMA
   addresses (e.g. Xen guests on 32-bit ARM).

2. Since buffers are currently allocated with page granularity, a PFN
   can be used instead. However, some values are reserved by the maple
   tree implementation. Liam suggests using xa_mk_value() in that case,
   but that reduces the usable range by half. Luckily, 31 bits are
   still enough to hold a PFN on all 32-bit platforms.

3. Software IO TLB is used from interrupt context. The maple tree
   implementation is not IRQ-safe (MT_FLAGS_LOCK_IRQ does nothing
   AFAICS). Instead, I use an external lock, spin_lock_irqsave() and
   spin_unlock_irqrestore().

Note that bounce buffers are never allocated dynamically if the
software IO TLB is in fact a DMA restricted pool, which is intended to
stay in its designated location in physical memory.
Signed-off-by: Petr Tesarik
---
 include/linux/swiotlb.h |  11 ++-
 kernel/dma/swiotlb.c    | 156 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 157 insertions(+), 10 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index b71adba03dc7..0ef27d6491b9 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -7,6 +7,7 @@
 #include <linux/init.h>
 #include <linux/types.h>
 #include <linux/limits.h>
+#include <linux/maple_tree.h>
 #include <linux/spinlock.h>
 
 struct device;
@@ -87,6 +88,8 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
  * @for_alloc:	%true if the pool is used for memory allocation
  * @nareas:	The area number in the pool.
  * @area_nslabs: The slot number in the area.
+ * @dyn_lock:	Protect dynamically allocated slots.
+ * @dyn_slots:	Dynamically allocated slots.
  */
 struct io_tlb_mem {
 	phys_addr_t start;
@@ -102,9 +105,13 @@ struct io_tlb_mem {
 	unsigned int area_nslabs;
 	struct io_tlb_area *areas;
 	struct io_tlb_slot *slots;
+	spinlock_t dyn_lock;
+	struct maple_tree dyn_slots;
 };
 extern struct io_tlb_mem io_tlb_default_mem;
 
+bool is_swiotlb_dyn(struct io_tlb_mem *mem, phys_addr_t paddr);
+
 static inline bool is_swiotlb_fixed(struct io_tlb_mem *mem, phys_addr_t paddr)
 {
 	return paddr >= mem->start && paddr < mem->end;
@@ -114,7 +121,9 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 {
 	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 
-	return mem && is_swiotlb_fixed(mem, paddr);
+	return mem &&
+	       (is_swiotlb_fixed(mem, paddr) ||
+		is_swiotlb_dyn(mem, paddr));
 }
 
 static inline bool is_swiotlb_force_bounce(struct device *dev)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index e8608bcb205e..c6a0b8f2aa6f 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -41,6 +41,7 @@
 #include <linux/string.h>
 #include <linux/swiotlb.h>
 #include <linux/types.h>
+#include <linux/maple_tree.h>
 #ifdef CONFIG_DMA_RESTRICTED_POOL
 #include <linux/of.h>
 #include <linux/of_fdt.h>
@@ -68,6 +69,13 @@ struct io_tlb_slot {
 	unsigned int list;
 };
 
+struct io_tlb_dyn_slot {
+	phys_addr_t orig_addr;
+	size_t alloc_size;
+	struct page *page;
+	dma_addr_t dma_addr;
+};
+
 static bool swiotlb_force_bounce;
 static bool swiotlb_force_disable;
 
@@ -292,6 +300,10 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 		mem->slots[i].alloc_size = 0;
 	}
 
+	spin_lock_init(&mem->dyn_lock);
+	mt_init_flags(&mem->dyn_slots, MT_FLAGS_LOCK_EXTERN);
+	mt_set_external_lock(&mem->dyn_slots, &mem->dyn_lock);
+
 	/*
	 * If swiotlb_unencrypted_base is set, the bounce buffer memory will
	 * be remapped and cleared in swiotlb_update_mem_attributes.
@@ -516,6 +528,115 @@ void __init swiotlb_exit(void)
 	memset(mem, 0, sizeof(*mem));
 }
 
+static struct io_tlb_dyn_slot *swiotlb_dyn_slot(struct io_tlb_mem *mem,
+						phys_addr_t paddr)
+{
+	unsigned long index = (uintptr_t)xa_mk_value(PHYS_PFN(paddr));
+	struct io_tlb_dyn_slot *slot;
+	unsigned long flags;
+
+	spin_lock_irqsave(&mem->dyn_lock, flags);
+	slot = mt_find(&mem->dyn_slots, &index, index);
+	spin_unlock_irqrestore(&mem->dyn_lock, flags);
+	return slot;
+}
+
+bool is_swiotlb_dyn(struct io_tlb_mem *mem, phys_addr_t paddr)
+{
+	return !!swiotlb_dyn_slot(mem, paddr);
+}
+
+static void swiotlb_dyn_bounce(struct device *dev, phys_addr_t tlb_addr,
+			       size_t size, enum dma_data_direction dir)
+{
+	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+	struct io_tlb_dyn_slot *slot = swiotlb_dyn_slot(mem, tlb_addr);
+	unsigned int tlb_offset;
+	unsigned char *vaddr;
+
+	if (!slot)
+		return;
+
+	tlb_offset = tlb_addr - page_to_phys(slot->page);
+	vaddr = page_address(slot->page) + tlb_offset;
+
+	swiotlb_copy(dev, slot->orig_addr, vaddr, size, slot->alloc_size,
+		     tlb_offset, dir);
+}
+
+static phys_addr_t swiotlb_dyn_map(struct device *dev, phys_addr_t orig_addr,
+		size_t alloc_size, unsigned int alloc_align_mask,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+	struct io_tlb_dyn_slot *slot;
+	unsigned long index;
+	unsigned long flags;
+	phys_addr_t paddr;
+	gfp_t gfp;
+	int err;
+
+	/* Allocation has page granularity. Avoid small buffers. */
+	if (alloc_size < PAGE_SIZE)
+		goto err;
+
+	/* DMA direct does not deal with physical address constraints. */
+	if (alloc_align_mask || dma_get_min_align_mask(dev))
+		goto err;
+
+	gfp = (attrs & DMA_ATTR_MAY_SLEEP) ? GFP_KERNEL : GFP_NOWAIT;
+	slot = kmalloc(sizeof(*slot), gfp | __GFP_NOWARN);
+	if (!slot)
+		goto err;
+
+	slot->orig_addr = orig_addr;
+	slot->alloc_size = alloc_size;
+	slot->page = dma_direct_alloc_pages(dev, PAGE_ALIGN(alloc_size),
+					    &slot->dma_addr, dir,
+					    gfp | __GFP_NOWARN);
+	if (!slot->page)
+		goto err_free_slot;
+
+	paddr = page_to_phys(slot->page);
+	index = (uintptr_t)xa_mk_value(PHYS_PFN(paddr));
+	spin_lock_irqsave(&mem->dyn_lock, flags);
+	err = mtree_store_range(&mem->dyn_slots, index,
+				index + PFN_UP(alloc_size) - 1,
+				slot, GFP_NOWAIT | __GFP_NOWARN);
+	spin_unlock_irqrestore(&mem->dyn_lock, flags);
+	if (err)
+		goto err_free_dma;
+
+	return paddr;
+
+err_free_dma:
+	dma_direct_free_pages(dev, slot->alloc_size, slot->page,
+			      slot->dma_addr, dir);
+err_free_slot:
+	kfree(slot);
+err:
+	return (phys_addr_t)DMA_MAPPING_ERROR;
+}
+
+static void swiotlb_dyn_unmap(struct device *dev, phys_addr_t tlb_addr,
+			      enum dma_data_direction dir)
+{
+	unsigned long index = (uintptr_t)xa_mk_value(PHYS_PFN(tlb_addr));
+	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+	struct io_tlb_dyn_slot *slot;
+	unsigned long flags;
+
+	spin_lock_irqsave(&mem->dyn_lock, flags);
+	slot = mt_find(&mem->dyn_slots, &index, index);
+	mtree_erase(&mem->dyn_slots, index);
+	spin_unlock_irqrestore(&mem->dyn_lock, flags);
+
+	dma_direct_free_pages(dev, slot->alloc_size, slot->page,
+			      slot->dma_addr, dir);
+	kfree(slot);
+}
+
 /*
  * Return the offset into a iotlb slot required to keep the device happy.
 */
@@ -524,11 +645,8 @@ static unsigned int swiotlb_align_offset(struct device *dev, u64 addr)
 	return addr & dma_get_min_align_mask(dev) & (IO_TLB_SIZE - 1);
 }
 
-/*
- * Bounce: copy the swiotlb buffer from or back to the original dma location
- */
-static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size,
-			   enum dma_data_direction dir)
+static void swiotlb_fixed_bounce(struct device *dev, phys_addr_t tlb_addr,
+				 size_t size, enum dma_data_direction dir)
 {
 	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 	int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
@@ -608,6 +726,18 @@ static void swiotlb_copy(struct device *dev, phys_addr_t orig_addr,
 	}
 }
 
+/*
+ * Bounce: copy the swiotlb buffer from or back to the original dma location
+ */
+static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size,
+			   enum dma_data_direction dir)
+{
+	if (is_swiotlb_fixed(dev->dma_io_tlb_mem, tlb_addr))
+		swiotlb_fixed_bounce(dev, tlb_addr, size, dir);
+	else
+		swiotlb_dyn_bounce(dev, tlb_addr, size, dir);
+}
+
 static inline phys_addr_t slot_addr(phys_addr_t start, phys_addr_t idx)
 {
 	return start + (idx << IO_TLB_SHIFT);
@@ -799,8 +929,13 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
-	tlb_addr = swiotlb_fixed_map(dev, orig_addr, alloc_size,
-				     alloc_align_mask, attrs);
+	tlb_addr = (phys_addr_t)DMA_MAPPING_ERROR;
+	if (!is_swiotlb_for_alloc(dev))
+		tlb_addr = swiotlb_dyn_map(dev, orig_addr, alloc_size,
+					   alloc_align_mask, dir, attrs);
+	if (tlb_addr == (phys_addr_t)DMA_MAPPING_ERROR)
+		tlb_addr = swiotlb_fixed_map(dev, orig_addr, alloc_size,
+					     alloc_align_mask, attrs);
 
 	if (tlb_addr == (phys_addr_t)DMA_MAPPING_ERROR) {
 		if (!(attrs & DMA_ATTR_NO_WARN))
@@ -882,7 +1017,10 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
 	    (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_FROM_DEVICE);
 
-	swiotlb_release_slots(dev, tlb_addr);
+	if (is_swiotlb_fixed(dev->dma_io_tlb_mem, tlb_addr))
+		swiotlb_release_slots(dev, tlb_addr);
+	else
+		swiotlb_dyn_unmap(dev, tlb_addr, dir);
 }
 
 void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
@@ -1013,7 +1151,7 @@ bool swiotlb_free(struct device *dev, struct page *page, size_t size)
 {
 	phys_addr_t tlb_addr = page_to_phys(page);
 
-	if (!is_swiotlb_buffer(dev, tlb_addr))
+	if (!is_swiotlb_fixed(dev->dma_io_tlb_mem, tlb_addr))
 		return false;
 
 	swiotlb_release_slots(dev, tlb_addr);
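To make the index encoding concrete: xa_mk_value() sets the low bit and
shifts the PFN up by one, which avoids the values reserved by the maple
tree but halves the usable index range. The expression below is the one
used in the patch; wrapping it in a named helper is illustrative only:

#include <linux/pfn.h>		/* PHYS_PFN() */
#include <linux/xarray.h>	/* xa_mk_value() */

/* Encode a physical address as a maple tree index (sketch). */
static unsigned long dyn_slot_index(phys_addr_t paddr)
{
	/*
	 * 31 bits remain for the PFN on 32-bit platforms; even a
	 * 40-bit LPAE physical address needs only 28 bits with
	 * 4 KiB pages.
	 */
	return (uintptr_t)xa_mk_value(PHYS_PFN(paddr));
}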
From patchwork Mon Mar 20 12:28:16 2023
From: Petr Tesarik
To: Jonathan Corbet, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
    Borislav Petkov, "Paul E. McKenney", Andrew Morton, Randy Dunlap,
    Damien Le Moal, Kim Phillips, "Steven Rostedt (Google)",
    linux-doc@vger.kernel.org (open list:DOCUMENTATION),
    linux-kernel@vger.kernel.org (open list),
    iommu@lists.linux.dev (open list:DMA MAPPING HELPERS)
Cc: Roberto Sassu, petr@tesarici.cz
Subject: [RFC v1 4/4] swiotlb: Add an option to allow dynamic bounce buffers
Date: Mon, 20 Mar 2023 13:28:16 +0100

From: Petr Tesarik

Dynamic allocation of bounce buffers may introduce a regression for
some workloads. The expected outcomes are a bigger worst-case I/O
latency and reduced performance for some workloads. Unfortunately,
real-world testing has been too unstable to draw any conclusion.

To stay on the safe side, make the feature disabled by default and let
people turn it on with "swiotlb=dynamic" if needed. Since this option
can be combined with "force", the parser must be modified to allow
multiple options separated by commas.

A new bool field is added to struct io_tlb_mem to tell whether dynamic
allocations are allowed. This field is always false for DMA restricted
pools. It is also false for other software IO TLBs unless
"swiotlb=dynamic" was specified.

Signed-off-by: Petr Tesarik
---
 .../admin-guide/kernel-parameters.txt |  6 +++++-
 include/linux/swiotlb.h               |  3 ++-
 kernel/dma/swiotlb.c                  | 19 ++++++++++++++-----
 3 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6cfa6e3996cf..6240a463631b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6081,14 +6081,18 @@
 			Execution Facility on pSeries.
 
 	swiotlb=	[ARM,IA-64,PPC,MIPS,X86]
-			Format: { <int> [,<int>] | force | noforce }
+			Format: { <int> [,<int>] [,option-list] | option-list }
 			<int> -- Number of I/O TLB slabs
 			<int> -- Second integer after comma. Number of swiotlb
 				 areas with their own lock. Will be rounded up
 				 to a power of 2.
+			<option-list> -- Comma-separated list of options.
+
+			Available options:
 			force -- force using of bounce buffers even if they
 				 wouldn't be automatically used by the kernel
 			noforce -- Never use bounce buffers (for debugging)
+			dynamic -- allow dynamic allocation of bounce buffers
 
 	switches=	[HW,M68k]

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 0ef27d6491b9..628e25ad7db7 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -101,6 +101,7 @@ struct io_tlb_mem {
 	bool late_alloc;
 	bool force_bounce;
 	bool for_alloc;
+	bool allow_dyn;
 	unsigned int nareas;
 	unsigned int area_nslabs;
 	struct io_tlb_area *areas;
@@ -123,7 +124,7 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 
 	return mem &&
 	       (is_swiotlb_fixed(mem, paddr) ||
-		is_swiotlb_dyn(mem, paddr));
+		(mem->allow_dyn && is_swiotlb_dyn(mem, paddr)));
 }
 
 static inline bool is_swiotlb_force_bounce(struct device *dev)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c6a0b8f2aa6f..3efaefebb6af 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -78,6 +78,7 @@ struct io_tlb_dyn_slot {
 
 static bool swiotlb_force_bounce;
 static bool swiotlb_force_disable;
+static bool swiotlb_dynamic;
 
 struct io_tlb_mem io_tlb_default_mem;
 
@@ -159,10 +160,17 @@ setup_io_tlb_npages(char *str)
 		swiotlb_adjust_nareas(simple_strtoul(str, &str, 0));
 	if (*str == ',')
 		++str;
-	if (!strcmp(str, "force"))
-		swiotlb_force_bounce = true;
-	else if (!strcmp(str, "noforce"))
-		swiotlb_force_disable = true;
+	while (str && *str) {
+		char *opt = strsep(&str, ",");
+		if (!strcmp(opt, "force"))
+			swiotlb_force_bounce = true;
+		else if (!strcmp(opt, "noforce"))
+			swiotlb_force_disable = true;
+		else if (!strcmp(opt, "dynamic"))
+			swiotlb_dynamic = true;
+		else
+			pr_warn("Invalid swiotlb option: %s", opt);
+	}
 
 	return 0;
 }
@@ -287,6 +295,7 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 	mem->area_nslabs = nslabs / mem->nareas;
 
 	mem->force_bounce = swiotlb_force_bounce || (flags & SWIOTLB_FORCE);
+	mem->allow_dyn = swiotlb_dynamic;
 
 	for (i = 0; i < mem->nareas; i++) {
 		spin_lock_init(&mem->areas[i].lock);
@@ -930,7 +939,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 	}
 
 	tlb_addr = (phys_addr_t)DMA_MAPPING_ERROR;
-	if (!is_swiotlb_for_alloc(dev))
+	if (mem->allow_dyn)
 		tlb_addr = swiotlb_dyn_map(dev, orig_addr, alloc_size,
 					   alloc_align_mask, dir, attrs);
 	if (tlb_addr == (phys_addr_t)DMA_MAPPING_ERROR)
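For reference, these are examples of boot parameters accepted by the
extended parser (illustrative values; the slab and area counts are
arbitrary):

	swiotlb=65536			65536 I/O TLB slabs
	swiotlb=65536,4			65536 slabs in 4 areas
	swiotlb=65536,4,dynamic		the same, plus dynamic bounce buffers
	swiotlb=force,dynamic		options only, sizes left at defaults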