From patchwork Fri Jul 21 06:35:10 2023
X-Patchwork-Submitter: Tanmay Jagdale
X-Patchwork-Id: 123615
From: Tanmay Jagdale
Subject: [RESEND PATCH 1/4] iommu/arm-smmu-v3: Add support for ECMDQ register mode
Date: Fri, 21 Jul 2023 02:35:10 -0400
Message-ID: <20230721063513.33431-2-tanmay@marvell.com>
In-Reply-To: <20230721063513.33431-1-tanmay@marvell.com>
References: <20230721063513.33431-1-tanmay@marvell.com>
From: Zhen Lei

Ensure that each core exclusively occupies an ECMDQ and that all ECMDQs are
enabled during initialization. If any error occurs during this initialization,
fall back to using the normal CMDQ.

When GERROR is triggered by an ECMDQ, all ECMDQs need to be traversed: the
ECMDQs with errors are handled and the ECMDQs without errors are skipped.

Compared with register SMMU_CMDQ_PROD, register SMMU_ECMDQ_PROD has one more
'EN' bit and one more 'ERRACK' bit. Therefore, an extra member 'ecmdq_prod' is
added to record the values of these two bits; it is ORed in each time register
SMMU_ECMDQ_PROD is updated. After the error indicated by SMMU_GERROR.CMDQP_ERR
has been fixed, the 'ERRACK' bit needs to be toggled to resume the
corresponding ECMDQ. A rwlock is therefore used to protect the write to the
'ERRACK' bit during error handling against the read of it during command
insertion.

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 210 +++++++++++++++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  36 ++++
 2 files changed, 245 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9b0dc3505601..dfb5bf8cbcf9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -347,6 +347,14 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 
 static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
 {
+	if (smmu->ecmdq_enabled) {
+		struct arm_smmu_ecmdq *ecmdq;
+
+		ecmdq = *this_cpu_ptr(smmu->ecmdq);
+
+		return &ecmdq->cmdq;
+	}
+
 	return &smmu->cmdq;
 }
 
@@ -429,6 +437,38 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
 	__arm_smmu_cmdq_skip_err(smmu, &smmu->cmdq.q);
 }
 
+static void arm_smmu_ecmdq_skip_err(struct arm_smmu_device *smmu)
+{
+	int i;
+	u32 prod, cons;
+	struct arm_smmu_queue *q;
+	struct arm_smmu_ecmdq *ecmdq;
+
+	for (i = 0; i < smmu->nr_ecmdq; i++) {
+		unsigned long flags;
+
+		ecmdq = *per_cpu_ptr(smmu->ecmdq, i);
+		q = &ecmdq->cmdq.q;
+
+		prod = readl_relaxed(q->prod_reg);
+		cons = readl_relaxed(q->cons_reg);
+		if (((prod ^ cons) & ECMDQ_CONS_ERR) == 0)
+			continue;
+
+		__arm_smmu_cmdq_skip_err(smmu, q);
+
+		write_lock_irqsave(&q->ecmdq_lock, flags);
+		q->ecmdq_prod &= ~ECMDQ_PROD_ERRACK;
+		q->ecmdq_prod |= cons & ECMDQ_CONS_ERR;
+
+		prod = readl_relaxed(q->prod_reg);
+		prod &= ~ECMDQ_PROD_ERRACK;
+		prod |= cons & ECMDQ_CONS_ERR;
+		writel(prod, q->prod_reg);
+		write_unlock_irqrestore(&q->ecmdq_lock, flags);
+	}
+}
+
 /*
  * Command queue locking.
  * This is a form of bastardised rwlock with the following major changes:
@@ -825,7 +865,13 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 	 * d. Advance the hardware prod pointer
 	 * Control dependency ordering from the entries becoming valid.
*/ - writel_relaxed(prod, cmdq->q.prod_reg); + if (smmu->ecmdq_enabled) { + read_lock(&cmdq->q.ecmdq_lock); + writel_relaxed(prod | cmdq->q.ecmdq_prod, cmdq->q.prod_reg); + read_unlock(&cmdq->q.ecmdq_lock); + } else { + writel_relaxed(prod, cmdq->q.prod_reg); + } /* * e. Tell the next owner we're done @@ -1701,6 +1747,9 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev) if (active & GERROR_CMDQ_ERR) arm_smmu_cmdq_skip_err(smmu); + if (active & GERROR_CMDQP_ERR) + arm_smmu_ecmdq_skip_err(smmu); + writel(gerror, smmu->base + ARM_SMMU_GERRORN); return IRQ_HANDLED; } @@ -2957,6 +3006,20 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu) return 0; } +static int arm_smmu_ecmdq_init(struct arm_smmu_cmdq *cmdq) +{ + unsigned int nents = 1 << cmdq->q.llq.max_n_shift; + + atomic_set(&cmdq->owner_prod, 0); + atomic_set(&cmdq->lock, 0); + + cmdq->valid_map = (atomic_long_t *)bitmap_zalloc(nents, GFP_KERNEL); + if (!cmdq->valid_map) + return -ENOMEM; + + return 0; +} + static int arm_smmu_init_queues(struct arm_smmu_device *smmu) { int ret; @@ -3307,6 +3370,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu) static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) { + int i; int ret; u32 reg, enables; struct arm_smmu_cmdq_ent cmd; @@ -3351,6 +3415,28 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) writel_relaxed(smmu->cmdq.q.llq.prod, smmu->base + ARM_SMMU_CMDQ_PROD); writel_relaxed(smmu->cmdq.q.llq.cons, smmu->base + ARM_SMMU_CMDQ_CONS); + for (i = 0; i < smmu->nr_ecmdq; i++) { + struct arm_smmu_ecmdq *ecmdq; + struct arm_smmu_queue *q; + + ecmdq = *per_cpu_ptr(smmu->ecmdq, i); + q = &ecmdq->cmdq.q; + + writeq_relaxed(q->q_base, ecmdq->base + ARM_SMMU_ECMDQ_BASE); + writel_relaxed(q->llq.prod, ecmdq->base + ARM_SMMU_ECMDQ_PROD); + writel_relaxed(q->llq.cons, ecmdq->base + ARM_SMMU_ECMDQ_CONS); + + /* enable ecmdq */ + writel(ECMDQ_PROD_EN, q->prod_reg); + ret = readl_relaxed_poll_timeout(q->cons_reg, reg, reg & ECMDQ_CONS_ENACK, + 1, ARM_SMMU_POLL_TIMEOUT_US); + if (ret) { + dev_err(smmu->dev, "ecmdq[%d] enable failed\n", i); + smmu->ecmdq_enabled = 0; + break; + } + } + enables = CR0_CMDQEN; ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK); @@ -3476,6 +3562,115 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu) } break; } +}; + +static int arm_smmu_ecmdq_layout(struct arm_smmu_device *smmu) +{ + int cpu; + struct arm_smmu_ecmdq *ecmdq; + + if (num_possible_cpus() <= smmu->nr_ecmdq) { + ecmdq = devm_alloc_percpu(smmu->dev, *ecmdq); + if (!ecmdq) + return -ENOMEM; + + for_each_possible_cpu(cpu) + *per_cpu_ptr(smmu->ecmdq, cpu) = per_cpu_ptr(ecmdq, cpu); + + /* A core requires at most one ECMDQ */ + smmu->nr_ecmdq = num_possible_cpus(); + + return 0; + } + + return -ENOSPC; +} + +static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu) +{ + int ret, cpu; + u32 i, nump, numq, gap; + u32 reg, shift_increment; + u64 addr, smmu_dma_base; + void __iomem *cp_regs, *cp_base; + + /* IDR6 */ + reg = readl_relaxed(smmu->base + ARM_SMMU_IDR6); + nump = 1 << FIELD_GET(IDR6_LOG2NUMP, reg); + numq = 1 << FIELD_GET(IDR6_LOG2NUMQ, reg); + smmu->nr_ecmdq = nump * numq; + gap = ECMDQ_CP_RRESET_SIZE >> FIELD_GET(IDR6_LOG2NUMQ, reg); + + smmu_dma_base = (vmalloc_to_pfn(smmu->base) << PAGE_SHIFT); + cp_regs = ioremap(smmu_dma_base + ARM_SMMU_ECMDQ_CP_BASE, PAGE_SIZE); + if (!cp_regs) + return -ENOMEM; + + for (i = 0; i < nump; i++) { + u64 val, pre_addr; + + val = 
readq_relaxed(cp_regs + 32 * i); + if (!(val & ECMDQ_CP_PRESET)) { + iounmap(cp_regs); + dev_err(smmu->dev, "ecmdq control page %u is memory mode\n", i); + return -EFAULT; + } + + if (i && ((val & ECMDQ_CP_ADDR) != (pre_addr + ECMDQ_CP_RRESET_SIZE))) { + iounmap(cp_regs); + dev_err(smmu->dev, "ecmdq_cp memory region is not contiguous\n"); + return -EFAULT; + } + + pre_addr = val & ECMDQ_CP_ADDR; + } + + addr = readl_relaxed(cp_regs) & ECMDQ_CP_ADDR; + iounmap(cp_regs); + + cp_base = devm_ioremap(smmu->dev, smmu_dma_base + addr, ECMDQ_CP_RRESET_SIZE * nump); + if (!cp_base) + return -ENOMEM; + + smmu->ecmdq = devm_alloc_percpu(smmu->dev, struct arm_smmu_ecmdq *); + if (!smmu->ecmdq) + return -ENOMEM; + + ret = arm_smmu_ecmdq_layout(smmu); + if (ret) + return ret; + + shift_increment = order_base_2(num_possible_cpus() / smmu->nr_ecmdq); + + addr = 0; + for_each_possible_cpu(cpu) { + struct arm_smmu_ecmdq *ecmdq; + struct arm_smmu_queue *q; + + ecmdq = *per_cpu_ptr(smmu->ecmdq, cpu); + ecmdq->base = cp_base + addr; + + q = &ecmdq->cmdq.q; + + q->llq.max_n_shift = ECMDQ_MAX_SZ_SHIFT + shift_increment; + ret = arm_smmu_init_one_queue(smmu, q, ecmdq->base, ARM_SMMU_ECMDQ_PROD, + ARM_SMMU_ECMDQ_CONS, CMDQ_ENT_DWORDS, "ecmdq"); + if (ret) + return ret; + + q->ecmdq_prod = ECMDQ_PROD_EN; + rwlock_init(&q->ecmdq_lock); + + ret = arm_smmu_ecmdq_init(&ecmdq->cmdq); + if (ret) { + dev_err(smmu->dev, "ecmdq[%d] init failed\n", i); + return ret; + } + + addr += gap; + } + + return 0; } static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) @@ -3588,6 +3783,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) return -ENXIO; } + if (reg & IDR1_ECMDQ) + smmu->features |= ARM_SMMU_FEAT_ECMDQ; + /* Queue sizes, capped to ensure natural alignment */ smmu->cmdq.q.llq.max_n_shift = min_t(u32, CMDQ_MAX_SZ_SHIFT, FIELD_GET(IDR1_CMDQS, reg)); @@ -3695,6 +3893,16 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) dev_info(smmu->dev, "ias %lu-bit, oas %lu-bit (features 0x%08x)\n", smmu->ias, smmu->oas, smmu->features); + + if (smmu->features & ARM_SMMU_FEAT_ECMDQ) { + int err; + + err = arm_smmu_ecmdq_probe(smmu); + if (err) { + dev_err(smmu->dev, "suppress ecmdq feature, errno=%d\n", err); + smmu->ecmdq_enabled = 0; + } + } return 0; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index dcab85698a4e..1f8777817e31 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -41,6 +41,7 @@ #define IDR0_S2P (1 << 0) #define ARM_SMMU_IDR1 0x4 +#define IDR1_ECMDQ (1 << 31) #define IDR1_TABLES_PRESET (1 << 30) #define IDR1_QUEUES_PRESET (1 << 29) #define IDR1_REL (1 << 28) @@ -113,6 +114,7 @@ #define ARM_SMMU_IRQ_CTRLACK 0x54 #define ARM_SMMU_GERROR 0x60 +#define GERROR_CMDQP_ERR (1 << 9) #define GERROR_SFM_ERR (1 << 8) #define GERROR_MSI_GERROR_ABT_ERR (1 << 7) #define GERROR_MSI_PRIQ_ABT_ERR (1 << 6) @@ -158,6 +160,26 @@ #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8 #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc +#define ARM_SMMU_IDR6 0x190 +#define IDR6_LOG2NUMP GENMASK(27, 24) +#define IDR6_LOG2NUMQ GENMASK(19, 16) +#define IDR6_BA_DOORBELLS GENMASK(9, 0) + +#define ARM_SMMU_ECMDQ_BASE 0x00 +#define ARM_SMMU_ECMDQ_PROD 0x08 +#define ARM_SMMU_ECMDQ_CONS 0x0c +#define ECMDQ_MAX_SZ_SHIFT 8 +#define ECMDQ_PROD_EN (1 << 31) +#define ECMDQ_CONS_ENACK (1 << 31) +#define ECMDQ_CONS_ERR (1 << 23) +#define ECMDQ_PROD_ERRACK (1 << 23) + +#define ARM_SMMU_ECMDQ_CP_BASE 0x4000 +#define ECMDQ_CP_ADDR 
GENMASK_ULL(51, 12) +#define ECMDQ_CP_CMDQGS GENMASK_ULL(2, 1) +#define ECMDQ_CP_PRESET (1UL << 0) +#define ECMDQ_CP_RRESET_SIZE 0x10000 + #define ARM_SMMU_REG_SZ 0xe00 /* Common MSI config fields */ @@ -527,6 +549,8 @@ struct arm_smmu_ll_queue { struct arm_smmu_queue { struct arm_smmu_ll_queue llq; int irq; /* Wired interrupt */ + u32 ecmdq_prod; + rwlock_t ecmdq_lock; __le64 *base; dma_addr_t base_dma; @@ -552,6 +576,11 @@ struct arm_smmu_cmdq { atomic_t lock; }; +struct arm_smmu_ecmdq { + struct arm_smmu_cmdq cmdq; + void __iomem *base; +}; + struct arm_smmu_cmdq_batch { u64 cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS]; int num; @@ -646,6 +675,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_SVA (1 << 17) #define ARM_SMMU_FEAT_E2H (1 << 18) #define ARM_SMMU_FEAT_NESTING (1 << 19) +#define ARM_SMMU_FEAT_ECMDQ (1 << 20) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -654,6 +684,12 @@ struct arm_smmu_device { #define ARM_SMMU_OPT_CMDQ_FORCE_SYNC (1 << 3) u32 options; + union { + u32 nr_ecmdq; + u32 ecmdq_enabled; + }; + struct arm_smmu_ecmdq *__percpu *ecmdq; + struct arm_smmu_cmdq cmdq; struct arm_smmu_evtq evtq; struct arm_smmu_priq priq; From patchwork Fri Jul 21 06:35:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tanmay Jagdale X-Patchwork-Id: 123606 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9010:0:b0:3e4:2afc:c1 with SMTP id l16csp17903vqg; Fri, 21 Jul 2023 00:01:13 -0700 (PDT) X-Google-Smtp-Source: APBJJlGJYhP2WWEYAw5VZkv788OlAvpUiK15V/Ccr5Nl8P+1DNkXvyYjHpVFA4AhLMRHNmbfp8SA X-Received: by 2002:a05:6a20:8f0c:b0:137:4bff:7c92 with SMTP id b12-20020a056a208f0c00b001374bff7c92mr1154677pzk.0.1689922872614; Fri, 21 Jul 2023 00:01:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689922872; cv=none; d=google.com; s=arc-20160816; b=hvMDkmOO0PmgeTzgXOi1ZYRA0I0FW6rAnfFmdMOpDNcgaMSATKCDfpvHANHfj32885 +aqHQnzK6KPg/7XN1cqJiySXFF26fVxafuFb4k5tiwTOg+t2DyRMGxkTDoAL5PyoWToi ADCV4NsAHLCwQmT7BWeRly3TaX3F8oqKfBukL8CGBbvxu8nQ/OB8kETiHAzNlh18yZfM Rln8FTwLbZSEkvYbzafvH+Cvbayh5JpZAXzNLQp5zJ9NBHcXv0JBVdVVHwkaxQioC3J+ xGkAhw2r37bVFGurxYuMbkHgp1Hgd17WsgGdr0PqcQPXGOsk5ZvN8BFH3FSsp6pcymxN 9jsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7mnorin6NXNKK95IWr6irgZNfgOv849RtHW6/z7AsoI=; fh=fd26O43kPxr11JZ766hECURF11FSWmFuTg8mAeKtYLQ=; b=nkGlmluVsxi7BVZWGNxN+Rmw617ySFD9V+6Hg3n2ylNdsBPlwuwHiGBQ39rh0p1Bty 01oZi5L84z/Dqz7uit/v34/jZPE2SKHZvjbLpVqCi6woGLBKU54VzGajZMbVkk/S44sC 4BL1XKMK3FukZvmW4FEXlHjXDylJ43RePDGC1WmvFIQHByTf5RgBFj0FAM55GFsUpmMu ld8U39XHbnbYvpwGqRTtJcom+/YB8jww0AQeHM/boKW1WNJkAccxGTj3hbx647locA9I v5yHBDUpBoKml4wmJ2iZIU/25HZSc/uL6OU8tsaFK6hjqEev4NKQMFAZimRcXJ74gIQH T9kA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@marvell.com header.s=pfpt0220 header.b=iX2BPUyS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=REJECT dis=NONE) header.from=marvell.com Received: from out1.vger.email (out1.vger.email. 
From: Tanmay Jagdale
Subject: [RESEND PATCH 2/4] iommu/arm-smmu-v3: Ensure that a set of associated commands are inserted in the same ECMDQ
Date: Fri, 21 Jul 2023 02:35:11 -0400
Message-ID: <20230721063513.33431-3-tanmay@marvell.com>
In-Reply-To: <20230721063513.33431-1-tanmay@marvell.com>
References: <20230721063513.33431-1-tanmay@marvell.com>
From: Zhen Lei

The SYNC command only guarantees that the commands preceding it in the same
ECMDQ have been executed; it cannot synchronize commands in other ECMDQs. If
an unmap involves multiple commands and some of them are inserted on one core
while the others are inserted on another core, then completion of the SYNC
does not guarantee that all of the preceding commands have been executed.

Preventing the process that inserts a set of associated commands from being
migrated to another core ensures that all of those commands are inserted into
the same ECMDQ.

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 +++++++++++++++++----
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index dfb5bf8cbcf9..1b3b37a1972e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -243,6 +243,18 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
 	return 0;
 }
 
+static void arm_smmu_preempt_disable(struct arm_smmu_device *smmu)
+{
+	if (smmu->ecmdq_enabled)
+		preempt_disable();
+}
+
+static void arm_smmu_preempt_enable(struct arm_smmu_device *smmu)
+{
+	if (smmu->ecmdq_enabled)
+		preempt_enable();
+}
+
 /* High-level queue accessors */
 static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 {
@@ -1035,6 +1047,7 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain,
 
 	cmds.num = 0;
 
+	arm_smmu_preempt_disable(smmu);
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_for_each_entry(master, &smmu_domain->devices, domain_head) {
 		for (i = 0; i < master->num_streams; i++) {
@@ -1045,6 +1058,7 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain,
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
 	arm_smmu_cmdq_batch_submit(smmu, &cmds);
+	arm_smmu_preempt_enable(smmu);
 }
 
 static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
@@ -1840,31 +1854,38 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
 
 static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
 {
-	int i;
+	int i, ret;
 	struct arm_smmu_cmdq_ent cmd;
 	struct arm_smmu_cmdq_batch cmds;
+	struct arm_smmu_device *smmu = master->smmu;
 
 	arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd);
 
 	cmds.num = 0;
+
+	arm_smmu_preempt_disable(smmu);
 	for (i = 0; i < master->num_streams; i++) {
 		cmd.atc.sid = master->streams[i].id;
-		arm_smmu_cmdq_batch_add(master->smmu, &cmds, &cmd);
+		arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
 	}
 
-	return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
+	ret = arm_smmu_cmdq_batch_submit(smmu, &cmds);
+	arm_smmu_preempt_enable(smmu);
+
+	return ret;
 }
 
 int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
 			    unsigned long iova, size_t size)
 {
-	int i;
+	int i, ret;
 	unsigned long flags;
 	struct arm_smmu_cmdq_ent cmd;
 	struct arm_smmu_master *master;
 	struct arm_smmu_cmdq_batch cmds;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
-	if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
+	if
(!(smmu->features & ARM_SMMU_FEAT_ATS)) return 0; /* @@ -1888,6 +1909,7 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, cmds.num = 0; + arm_smmu_preempt_disable(smmu); spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_for_each_entry(master, &smmu_domain->devices, domain_head) { if (!master->ats_enabled) @@ -1895,12 +1917,15 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, for (i = 0; i < master->num_streams; i++) { cmd.atc.sid = master->streams[i].id; - arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd); + arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd); } } spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); - return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds); + ret = arm_smmu_cmdq_batch_submit(smmu, &cmds); + arm_smmu_preempt_enable(smmu); + + return ret; } /* IO_PGTABLE API */ @@ -1960,6 +1985,7 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd, cmds.num = 0; + arm_smmu_preempt_disable(smmu); while (iova < end) { if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) { /* @@ -1991,6 +2017,7 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd, iova += inv_range; } arm_smmu_cmdq_batch_submit(smmu, &cmds); + arm_smmu_preempt_enable(smmu); } static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size, From patchwork Fri Jul 21 06:35:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tanmay Jagdale X-Patchwork-Id: 123610 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9010:0:b0:3e4:2afc:c1 with SMTP id l16csp19151vqg; Fri, 21 Jul 2023 00:03:28 -0700 (PDT) X-Google-Smtp-Source: APBJJlF7S7nFqhewKyFBe9gGvczBjvu0ALFJrMtN2DV8EI3mM4yoWb9SH0aMOUvz7zlvmyP9MzBt X-Received: by 2002:a05:6a00:1747:b0:677:bdc:cd6b with SMTP id j7-20020a056a00174700b006770bdccd6bmr906880pfc.19.1689923008251; Fri, 21 Jul 2023 00:03:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689923008; cv=none; d=google.com; s=arc-20160816; b=GJhnN3DYlXlp8HLFXEbcxB7VbzstJAX8OGxuMJYyuEJlJEkWI7DJSlq77ObO28Kcu6 gbvX5eIo34Vbj6QidoBLT+fe5Gy3JRNKKepCrdyV6pMjjMKxCgcj8bcD2JOAnndZqa5j bW/f/vDDJD0RPFB2qJMSE/PJDW6o1tv3F/5pLugoAVMR05GGWT0FlZylPCrhBal1w39h KRDd5M5Ji2ypQWumwhcCNe6r8AVLCTPQrIKZHdldIrgeZMRvNvjwrWxMKPHy/PYGiOUW 2ooxogn1J2qzRd4YA9uXPgCKinnuBOK117D6gTG3TS3yHdbRJs0ALtqGOMEKXlhYv61H S6HA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=eiE0t1Y6hvHtNltoFN6UQCbnWTgRzOhbld9Wfvf5Ag8=; fh=fd26O43kPxr11JZ766hECURF11FSWmFuTg8mAeKtYLQ=; b=RROp8VnSrAAouBVLk68jabKKAxOGJg4EbpU78Jxwo9+8ixIODwTAYkLlXF3RrqLnsi RDsKlqz+5pUa1nJAmKuwx/r5kmvZ6HX+xRcJet0GHmHI4U+i+0Q3LHH1vXT7wdjbGJKd yxZntaxTkJe1TmtZ2hou7QO3/OYan9Ov76hI4Sdiu0paMowzYmjw53cgotafRZt13CLb qoy+edNPz7o8vAHpgTvWKZj5KyiRjRa9+nsuJp7MeeIlDk3LiNaFKUbgUWEtydgdI2Jm MZ85v/w4sctqSAqohV5em5jZBVTen+gG+h+LtMnpvvjZRzotp5Z7yZG7YXhJ/s+PenYn nqYg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@marvell.com header.s=pfpt0220 header.b=b8ExM3Ss; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=REJECT dis=NONE) header.from=marvell.com Received: from out1.vger.email (out1.vger.email. 
From: Tanmay Jagdale
Subject: [RESEND PATCH 3/4] iommu/arm-smmu-v3: Add arm_smmu_ecmdq_issue_cmdlist() for non-shared ECMDQ
Date: Fri, 21 Jul 2023 02:35:12 -0400
Message-ID: <20230721063513.33431-4-tanmay@marvell.com>
In-Reply-To: <20230721063513.33431-1-tanmay@marvell.com>
References: <20230721063513.33431-1-tanmay@marvell.com>
From: Zhen Lei

When a core exclusively owns an ECMDQ, contention with other cores does not
need to be considered during command insertion. We can therefore drop the
parts of arm_smmu_cmdq_issue_cmdlist() that deal with multi-core contention
and provide a more efficient ECMDQ-specific function,
arm_smmu_ecmdq_issue_cmdlist().

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 85 +++++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 +
 2 files changed, 86 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 1b3b37a1972e..dc3ff4796aaf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -777,6 +777,87 @@ static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq, u64 *cmds,
 	}
 }
 
+/*
+ * The function is used when the current core exclusively occupies an ECMDQ.
+ * This is a reduced version of arm_smmu_cmdq_issue_cmdlist(), which eliminates
+ * a lot of unnecessary inter-core competition considerations.
+ */
+static int arm_smmu_ecmdq_issue_cmdlist(struct arm_smmu_device *smmu,
+					struct arm_smmu_cmdq *cmdq,
+					u64 *cmds, int n, bool sync)
+{
+	u32 prod;
+	unsigned long flags;
+	struct arm_smmu_ll_queue llq = {
+		.max_n_shift = cmdq->q.llq.max_n_shift,
+	}, head;
+	int ret = 0;
+
+	/* 1. Allocate some space in the queue */
+	local_irq_save(flags);
+	llq.val = READ_ONCE(cmdq->q.llq.val);
+	do {
+		u64 old;
+
+		while (!queue_has_space(&llq, n + sync)) {
+			local_irq_restore(flags);
+			if (arm_smmu_cmdq_poll_until_not_full(smmu, &llq))
+				dev_err_ratelimited(smmu->dev, "ECMDQ timeout\n");
+			local_irq_save(flags);
+		}
+
+		head.cons = llq.cons;
+		head.prod = queue_inc_prod_n(&llq, n + sync);
+
+		old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
+		if (old == llq.val)
+			break;
+
+		llq.val = old;
+	} while (1);
+
+	/* 2. Write our commands into the queue */
+	arm_smmu_cmdq_write_entries(cmdq, cmds, llq.prod, n);
+	if (sync) {
+		u64 cmd_sync[CMDQ_ENT_DWORDS];
+
+		prod = queue_inc_prod_n(&llq, n);
+		arm_smmu_cmdq_build_sync_cmd(cmd_sync, smmu, &cmdq->q, prod);
+		queue_write(Q_ENT(&cmdq->q, prod), cmd_sync, CMDQ_ENT_DWORDS);
+	}
+
+	/* 3. Ensuring commands are visible first */
+	dma_wmb();
+
+	/* 4. Advance the hardware prod pointer */
+	read_lock(&cmdq->q.ecmdq_lock);
+	writel_relaxed(head.prod | cmdq->q.ecmdq_prod, cmdq->q.prod_reg);
+	read_unlock(&cmdq->q.ecmdq_lock);
+
+	/* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
+	if (sync) {
+		llq.prod = queue_inc_prod_n(&llq, n);
+		ret = arm_smmu_cmdq_poll_until_sync(smmu, &llq);
+		if (ret) {
+			dev_err_ratelimited(smmu->dev,
+					    "CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
+					    llq.prod,
+					    readl_relaxed(cmdq->q.prod_reg),
+					    readl_relaxed(cmdq->q.cons_reg));
+		}
+
+		/*
+		 * Update cmdq->q.llq.cons, to improve the success rate of
+		 * queue_has_space() when some new commands are inserted next
+		 * time.
+ */ + WRITE_ONCE(cmdq->q.llq.cons, llq.cons); + } + + local_irq_restore(flags); + return ret; +} + /* * This is the actual insertion function, and provides the following * ordering guarantees to callers: @@ -806,6 +887,9 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, llq.max_n_shift = cmdq->q.llq.max_n_shift; + if (!cmdq->shared) + return arm_smmu_ecmdq_issue_cmdlist(smmu, cmdq, cmds, n, sync); + /* 1. Allocate some space in the queue */ local_irq_save(flags); llq.val = READ_ONCE(cmdq->q.llq.val); @@ -3022,6 +3106,7 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu) struct arm_smmu_cmdq *cmdq = &smmu->cmdq; unsigned int nents = 1 << cmdq->q.llq.max_n_shift; + cmdq->shared = 1; atomic_set(&cmdq->owner_prod, 0); atomic_set(&cmdq->lock, 0); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 1f8777817e31..a8988fcd605f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -574,6 +574,7 @@ struct arm_smmu_cmdq { atomic_long_t *valid_map; atomic_t owner_prod; atomic_t lock; + int shared; }; struct arm_smmu_ecmdq { From patchwork Fri Jul 21 06:35:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tanmay Jagdale X-Patchwork-Id: 123617 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9010:0:b0:3e4:2afc:c1 with SMTP id l16csp27567vqg; Fri, 21 Jul 2023 00:23:25 -0700 (PDT) X-Google-Smtp-Source: APBJJlFKFUXp1d6q/klPhmb0dBMZ9g8eZlAQaQwMTtQN7ZG+CaFv5vCzrtSHLiXdwoEpBVLJ4wEC X-Received: by 2002:a17:90b:1808:b0:263:feab:2804 with SMTP id lw8-20020a17090b180800b00263feab2804mr722424pjb.37.1689924205325; Fri, 21 Jul 2023 00:23:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689924205; cv=none; d=google.com; s=arc-20160816; b=X7pfqRX8Fny0l13/DtMqOY7lLeyHJN63GIFwTJX7XZFbzD/e0GjlJg8DrSt+4Fi1xg vXoi5358OBUpZ8+5eVWQChqb4mItkcNiXFTtUQDnb8zGSv1Cl5naD6C12bMVaiMiddMU bBbyeKOUS3ApPgeZz9VIWbd/rUh8wkFBHMLTJyin2lWvsWqhw9Wx4DOcYwEQaWt56Lpl L7MSuAG0R/rr5S7lXuQjv7Z4tKBtdDqse5Xm4LYMgp29R9J5PVFnKYlCCjwXTzOrB0I+ I+Lp4BeHrOpVsoMa3W5xsiOKmRa3L5FkEwz3UIV3XEjIDsGBvduzPd9Lw1Hwm6Qz8wYt gf/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=y5rYVMqlqcgaO0n1TOniSNX7UmtSDrRMaWOfswavg+g=; fh=fd26O43kPxr11JZ766hECURF11FSWmFuTg8mAeKtYLQ=; b=lzFEl6PPQDiQw59kBuA6AoImft/FJ2SOPl9BVVKccTHXTssOlwgjA3PhokalCCVMpL DpNStGUVJ3aXZD3eH6iO+btZcdglqZx//wDLLtkwapFQgFA/41PaaJslKcfCrE+s7k4o 8FwznHlvuSfhTfUY6UYcahLfMEPqII9BCMWjrkLEOjfgjcV63S0NALKTI0msRT/rmS8q HmkLn+n4X85tU9jhQ77iXMZSnm72yiVnbI5yvfroBCA3nqItDZ6raqpx9N9panO+Tw1g OhLvGAhrwIkEtEex8zZb7mN3mMJbJxFixUKbzaRs/OPqgI0u3xaa5HCigZYRyY+oNpEU M2RQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@marvell.com header.s=pfpt0220 header.b=J8HOFnUL; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=REJECT dis=NONE) header.from=marvell.com Received: from out1.vger.email (out1.vger.email. 
From: Tanmay Jagdale
Subject: [RESEND PATCH 4/4] iommu/arm-smmu-v3: Add support for less than one ECMDQ per core
Date: Fri, 21 Jul 2023 02:35:13 -0400
Message-ID: <20230721063513.33431-5-tanmay@marvell.com>
In-Reply-To: <20230721063513.33431-1-tanmay@marvell.com>
References: <20230721063513.33431-1-tanmay@marvell.com>
From: Zhen Lei

Due to limited hardware resources, the number of ECMDQs may be less than the
number of cores. Provided the number of ECMDQs is greater than the number of
NUMA nodes, ensure that each node has at least one ECMDQ. This is because
ECMDQ queue memory is allocated from the NUMA node on which the queue resides,
which may give better command-filling and insertion performance.

The current ECMDQ implementation reuses the command insertion function
arm_smmu_cmdq_issue_cmdlist() of the normal CMDQ. This function already
supports concurrent command insertion from multiple cores.

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 101 ++++++++++++++++++--
 1 file changed, 92 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index dc3ff4796aaf..7a4f3d871635 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3678,14 +3678,15 @@ static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu)
 
 static int arm_smmu_ecmdq_layout(struct arm_smmu_device *smmu)
 {
-	int cpu;
-	struct arm_smmu_ecmdq *ecmdq;
+	int cpu, node, nr_remain, nr_nodes = 0;
+	int *nr_ecmdqs;
+	struct arm_smmu_ecmdq *ecmdq, **ecmdqs;
 
-	if (num_possible_cpus() <= smmu->nr_ecmdq) {
-		ecmdq = devm_alloc_percpu(smmu->dev, *ecmdq);
-		if (!ecmdq)
-			return -ENOMEM;
+	ecmdq = devm_alloc_percpu(smmu->dev, *ecmdq);
+	if (!ecmdq)
+		return -ENOMEM;
 
+	if (num_possible_cpus() <= smmu->nr_ecmdq) {
 		for_each_possible_cpu(cpu)
 			*per_cpu_ptr(smmu->ecmdq, cpu) = per_cpu_ptr(ecmdq, cpu);
@@ -3695,7 +3696,79 @@ static int arm_smmu_ecmdq_layout(struct arm_smmu_device *smmu)
 		return 0;
 	}
 
-	return -ENOSPC;
+	for_each_node(node)
+		if (nr_cpus_node(node))
+			nr_nodes++;
+
+	if (nr_nodes >= smmu->nr_ecmdq) {
+		dev_err(smmu->dev, "%d ECMDQs is less than %d nodes\n", smmu->nr_ecmdq, nr_nodes);
+		return -ENOSPC;
+	}
+
+	nr_ecmdqs = kcalloc(MAX_NUMNODES, sizeof(int), GFP_KERNEL);
+	if (!nr_ecmdqs)
+		return -ENOMEM;
+
+	ecmdqs = kcalloc(smmu->nr_ecmdq, sizeof(*ecmdqs), GFP_KERNEL);
+	if (!ecmdqs) {
+		kfree(nr_ecmdqs);
+		return -ENOMEM;
+	}
+
+	/* [1] Ensure that each node has at least one ECMDQ */
+	nr_remain = smmu->nr_ecmdq - nr_nodes;
+	for_each_node(node) {
+		/*
+		 * Calculate the number of ECMDQs to be allocated to this node.
+		 * NR_ECMDQS_PER_CPU = nr_remain / num_possible_cpus();
+		 * When nr_cpus_node(node) is not zero, less than one ECMDQ
+		 * may be left due to truncation rounding.
+ */ + nr_ecmdqs[node] = nr_cpus_node(node) * nr_remain / num_possible_cpus(); + nr_remain -= nr_ecmdqs[node]; + } + + /* Divide the remaining ECMDQs */ + while (nr_remain) { + for_each_node(node) { + if (!nr_remain) + break; + + if (nr_ecmdqs[node] >= nr_cpus_node(node)) + continue; + + nr_ecmdqs[node]++; + nr_remain--; + } + } + + for_each_node(node) { + int i, round, shared = 0; + + if (!nr_cpus_node(node)) + continue; + + /* An ECMDQ has been reserved for each node at above [1] */ + nr_ecmdqs[node]++; + + if (nr_ecmdqs[node] < nr_cpus_node(node)) + shared = 1; + + i = 0; + for_each_cpu(cpu, cpumask_of_node(node)) { + round = i % nr_ecmdqs[node]; + if (i++ < nr_ecmdqs[node]) { + ecmdqs[round] = per_cpu_ptr(ecmdq, cpu); + ecmdqs[round]->cmdq.shared = shared; + } + *per_cpu_ptr(smmu->ecmdq, cpu) = ecmdqs[round]; + } + } + + kfree(nr_ecmdqs); + kfree(ecmdqs); + + return 0; } static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu) @@ -3760,10 +3833,20 @@ static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu) struct arm_smmu_queue *q; ecmdq = *per_cpu_ptr(smmu->ecmdq, cpu); - ecmdq->base = cp_base + addr; - q = &ecmdq->cmdq.q; + /* + * The boot option "maxcpus=" can limit the number of online + * CPUs. The CPUs that are not selected are not showed in + * cpumask_of_node(node), their 'ecmdq' may be NULL. + * + * (q->ecmdq_prod & ECMDQ_PROD_EN) indicates that the ECMDQ is + * shared by multiple cores and has been initialized. + */ + if (!ecmdq || (q->ecmdq_prod & ECMDQ_PROD_EN)) + continue; + ecmdq->base = cp_base + addr; + q->llq.max_n_shift = ECMDQ_MAX_SZ_SHIFT + shift_increment; ret = arm_smmu_init_one_queue(smmu, q, ecmdq->base, ARM_SMMU_ECMDQ_PROD, ARM_SMMU_ECMDQ_CONS, CMDQ_ENT_DWORDS, "ecmdq");
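
A note for readers of patch 4/4: the per-node apportioning policy described in
its commit message (reserve one ECMDQ per populated NUMA node, split the
remainder in proportion to each node's CPU count with truncating division, then
hand out the leftovers round-robin) can be modelled outside the kernel. The
following stand-alone C sketch mirrors that arithmetic under stated
assumptions: the file name, node count, and per-node CPU counts are made-up
example values, and it only models the case where there are fewer ECMDQs than
CPUs. It is an illustration of the policy, not the kernel code itself.

/* ecmdq_layout_model.c - user-space model of the ECMDQ-per-node split.
 * Build and run: cc -o ecmdq_layout_model ecmdq_layout_model.c && ./ecmdq_layout_model
 */
#include <stdio.h>

#define NR_NODES 4

int main(void)
{
	/* Hypothetical topology: CPUs per NUMA node (node 2 is CPU-less). */
	int cpus_on_node[NR_NODES] = { 16, 16, 0, 8 };
	int nr_ecmdq = 10;		/* modelled case: fewer ECMDQs than the 40 CPUs */
	int nr_ecmdqs[NR_NODES] = { 0 };
	int node, nr_nodes = 0, nr_cpus = 0, nr_remain;

	for (node = 0; node < NR_NODES; node++) {
		if (cpus_on_node[node])
			nr_nodes++;
		nr_cpus += cpus_on_node[node];
	}

	if (nr_nodes >= nr_ecmdq) {
		fprintf(stderr, "%d ECMDQs is less than %d nodes\n", nr_ecmdq, nr_nodes);
		return 1;
	}

	/* [1] One ECMDQ is implicitly reserved for every populated node. */
	nr_remain = nr_ecmdq - nr_nodes;

	/* Split the remainder in proportion to each node's CPU share (truncating). */
	for (node = 0; node < NR_NODES; node++) {
		nr_ecmdqs[node] = cpus_on_node[node] * nr_remain / nr_cpus;
		nr_remain -= nr_ecmdqs[node];
	}

	/* Hand out whatever truncation left behind, one queue per node per round. */
	while (nr_remain) {
		for (node = 0; node < NR_NODES && nr_remain; node++) {
			if (nr_ecmdqs[node] >= cpus_on_node[node])
				continue;
			nr_ecmdqs[node]++;
			nr_remain--;
		}
	}

	for (node = 0; node < NR_NODES; node++) {
		if (!cpus_on_node[node])
			continue;
		/* Add back the queue reserved per node at [1]; a node that still has
		 * fewer queues than CPUs ends up with shared (contended) ECMDQs. */
		nr_ecmdqs[node]++;
		printf("node %d: %2d CPUs -> %d ECMDQ(s)%s\n", node,
		       cpus_on_node[node], nr_ecmdqs[node],
		       nr_ecmdqs[node] < cpus_on_node[node] ? " (shared)" : "");
	}

	return 0;
}

With the example values above, the ten queues are split 4/4/2 across the three
populated nodes and each node's queues are marked shared, which matches the
intent described for arm_smmu_ecmdq_layout() in the commit message.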