From patchwork Tue Sep 26 16:23:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?UGV0ciBUZXNhxZnDrWs=?=
X-Patchwork-Id: 144969
From: Petr Tesarik
To: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
 iommu@lists.linux.dev (open list:DMA MAPPING HELPERS),
 linux-kernel@vger.kernel.org (open list)
Cc: Roberto Sassu, Catalin Marinas, Petr Tesarik, Jonathan Corbet
Subject: [PATCH v2] swiotlb: fix the check whether a device has used software IO TLB
Date: Tue, 26 Sep 2023 18:23:39 +0200
Message-ID: <20230926162339.12940-1-petr@tesarici.cz>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When CONFIG_SWIOTLB_DYNAMIC=y, devices which do not use the software IO TLB
can avoid swiotlb lookup.
A flag was added by commit 1395706a1490 ("swiotlb: search the software IO
TLB only if the device makes use of it"). The flag is correctly set, but it
is never checked. Add the actual check here.

Note that this code is an alternative to the default pool check, not an
additional check, because:

1. swiotlb_find_pool() also searches the default pool;
2. if dma_uses_io_tlb is false, the default swiotlb pool is not used.

When tested in a KVM guest against a QEMU RAM-backed SATA disk over virtio
and *not* using software IO TLB, this patch increases IOPS by approximately
2% for 4-way parallel I/O.

The write memory barrier in swiotlb_dyn_alloc() is not needed, because a
newly allocated pool must always be observed by swiotlb_find_slots() before
an address from that pool is passed to is_swiotlb_buffer().

Correctness was verified using the following litmus test:

C swiotlb-new-pool

(*
 * Result: Never
 *
 * Check that a newly allocated pool is always visible when the
 * corresponding swiotlb buffer is visible.
 *)

{
	mem_pools = default;
}

P0(int **mem_pools, int *pool)
{
	/* add_mem_pool() */
	WRITE_ONCE(*pool, 999);
	rcu_assign_pointer(*mem_pools, pool);
}

P1(int **mem_pools, int *flag, int *buf)
{
	/* swiotlb_find_slots() */
	int *r0;
	int r1;

	rcu_read_lock();
	r0 = READ_ONCE(*mem_pools);
	r1 = READ_ONCE(*r0);
	rcu_read_unlock();

	if (r1) {
		WRITE_ONCE(*flag, 1);
		smp_mb();
	}

	/* device driver (presumed) */
	WRITE_ONCE(*buf, r1);
}

P2(int **mem_pools, int *flag, int *buf)
{
	/* device driver (presumed) */
	int r0 = READ_ONCE(*buf);

	/* is_swiotlb_buffer() */
	int r1;
	int *r2;
	int r3;

	smp_rmb();
	r1 = READ_ONCE(*flag);
	if (r1) {
		/* swiotlb_find_pool() */
		rcu_read_lock();
		r2 = READ_ONCE(*mem_pools);
		r3 = READ_ONCE(*r2);
		rcu_read_unlock();
	}
}

exists (2:r0<>0 /\ 2:r3=0) (* Not found. *)

Fixes: 1395706a1490 ("swiotlb: search the software IO TLB only if the device makes use of it")
Reported-by: Jonathan Corbet
Closes: https://lore.kernel.org/linux-iommu/87a5uz3ob8.fsf@meer.lwn.net/
Signed-off-by: Petr Tesarik
Reviewed-by: Catalin Marinas
---
 include/linux/swiotlb.h | 22 +++++++++++++++-------
 kernel/dma/swiotlb.c    | 25 +++++++++++++++++++------
 2 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index b4536626f8ff..93b400d9be91 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -172,14 +172,22 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 	if (!mem)
 		return false;
 
-	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC)) {
-		/* Pairs with smp_wmb() in swiotlb_find_slots() and
-		 * swiotlb_dyn_alloc(), which modify the RCU lists.
-		 */
-		smp_rmb();
-		return swiotlb_find_pool(dev, paddr);
-	}
+#ifdef CONFIG_SWIOTLB_DYNAMIC
+	/* All SWIOTLB buffer addresses must have been returned by
+	 * swiotlb_tbl_map_single() and passed to a device driver.
+	 * If a SWIOTLB address is checked on another CPU, then it was
+	 * presumably loaded by the device driver from an unspecified private
+	 * data structure. Make sure that this load is ordered before reading
+	 * dev->dma_uses_io_tlb here and mem->pools in swiotlb_find_pool().
+	 *
+	 * This barrier pairs with smp_mb() in swiotlb_find_slots().
+	 */
+	smp_rmb();
+	return READ_ONCE(dev->dma_uses_io_tlb) &&
+		swiotlb_find_pool(dev, paddr);
+#else
 	return paddr >= mem->defpool.start && paddr < mem->defpool.end;
+#endif
 }
 
 static inline bool is_swiotlb_force_bounce(struct device *dev)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 394494a6b1f3..ab7101ed1461 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -729,9 +729,6 @@ static void swiotlb_dyn_alloc(struct work_struct *work)
 	}
 
 	add_mem_pool(mem, pool);
-
-	/* Pairs with smp_rmb() in is_swiotlb_buffer(). */
-	smp_wmb();
 }
 
 /**
@@ -1152,9 +1149,25 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
 	spin_unlock_irqrestore(&dev->dma_io_tlb_lock, flags);
 
 found:
-	dev->dma_uses_io_tlb = true;
-	/* Pairs with smp_rmb() in is_swiotlb_buffer() */
-	smp_wmb();
+	WRITE_ONCE(dev->dma_uses_io_tlb, true);
+
+	/* The general barrier orders reads and writes against a presumed store
+	 * of the SWIOTLB buffer address by a device driver (to a driver private
+	 * data structure). It serves two purposes.
+	 *
+	 * First, the store to dev->dma_uses_io_tlb must be ordered before the
+	 * presumed store. This guarantees that the returned buffer address
+	 * cannot be passed to another CPU before updating dev->dma_uses_io_tlb.
+	 *
+	 * Second, the load from mem->pools must be ordered before the same
+	 * presumed store. This guarantees that the returned buffer address
+	 * cannot be observed by another CPU before an update of the RCU list
+	 * that was made by swiotlb_dyn_alloc() on a third CPU (cf. multicopy
+	 * atomicity).
+	 *
+	 * See also the comment in is_swiotlb_buffer().
+	 */
+	smp_mb();
 
 	*retpool = pool;
 	return index;
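
For illustration only (not part of the patch): a minimal userspace sketch of
the flag check, assuming C11 atomics as stand-ins for WRITE_ONCE(),
READ_ONCE(), smp_mb() and smp_rmb(). All demo_* names are hypothetical and do
not exist in the kernel, and the pool list is a plain array rather than an RCU
list. The sketch only shows the control flow: the pool lookup is skipped until
the flag has been set, and the flag is stored before the full fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct demo_pool {
	uintptr_t start;	/* first address covered by the pool */
	uintptr_t end;		/* one past the last covered address */
};

struct demo_dev {
	atomic_bool uses_io_tlb;	/* models dev->dma_uses_io_tlb */
	const struct demo_pool *pools;	/* models the list of pools */
	size_t nr_pools;
};

/* Models swiotlb_find_pool(): linear search over all pools. */
static bool demo_find_pool(const struct demo_dev *dev, uintptr_t addr)
{
	for (size_t i = 0; i < dev->nr_pools; i++)
		if (addr >= dev->pools[i].start && addr < dev->pools[i].end)
			return true;
	return false;
}

/* Models the "found:" path of swiotlb_find_slots(). */
static void demo_mark_used(struct demo_dev *dev)
{
	assert(dev != NULL);
	atomic_store_explicit(&dev->uses_io_tlb, true, memory_order_relaxed);
	/* Full fence, standing in for smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);
}

/* Models is_swiotlb_buffer() with CONFIG_SWIOTLB_DYNAMIC=y. */
static bool demo_is_bounce_buffer(struct demo_dev *dev, uintptr_t addr)
{
	/* Acquire fence, a conservative stand-in for smp_rmb(). */
	atomic_thread_fence(memory_order_acquire);
	/* Short-circuit: the pool search runs only if the flag is set. */
	return atomic_load_explicit(&dev->uses_io_tlb, memory_order_relaxed) &&
	       demo_find_pool(dev, addr);
}
```

With a single pool covering 0x1000-0x1fff, demo_is_bounce_buffer() reports
false for 0x1800 until demo_mark_used() has run, and true afterwards.
Addresses outside the pool stay false in both cases.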