From patchwork Wed Nov 16 17:16:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21203 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp264426wru; Wed, 16 Nov 2022 09:29:24 -0800 (PST) X-Google-Smtp-Source: AA0mqf5ulh1wImMLoP8hv5hFcXsJwL+NQ49btRMKX7eSQBQFOJCkz2U7zmMWt2GSU0I7nYgrxzDc X-Received: by 2002:a50:fb03:0:b0:467:621f:879e with SMTP id d3-20020a50fb03000000b00467621f879emr19954995edq.380.1668619764691; Wed, 16 Nov 2022 09:29:24 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619764; cv=none; d=google.com; s=arc-20160816; b=AWm9il+64okpKfDkUvxskMxAhoA25nm7ZBlILX9jnFlcm+VxFPceLp4qcA+OVoxDsb cIPEW14YDpfHB/fD5SRkgXi7weaN9yivz1cCdm24NxwecAhqAMh/VxG8YxTyFugSncCv aC51CPeFRy4ocaao3RkE0jzyzEJLsGPILHFZcHZ0NHqhzMtDViNqnoMaJuwXx3Sikdhw 1Mf34dD9aVRMJDTjFqFSqcgt8f6hfBXlj9iL2IKEkkmMkeHtghV1oYRIQsq0NuuefrDz 5xYjymaDgZ1Z/25rYzhG1JXaL7zkY6+LFibmKwEPczt9G+KJIXOYP7MuLeXybu5+tsuL tojw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XLl9jwPWrKUEGMxUqZ8ZOHMSN21/G6zNljyYvTYvI4o=; b=JtxZkNAbcyle7Bf4KXV5HJar7QXVm4IPC8qQcWr9klkc7RwBZPh+deZIqAX6g+MJ4Y War9j9KyH8UdChoeepzfvyoMo0wOt5fRAWHRMp0EgrQ55BxrQRD7h+JEXHn4Sc7E1Ru4 ijc7kDrjw11Bj9Y0oquwdhB0fv1/62keOIdDNQYpmolZWOqNXIBkjG2MG5KDJFgjj6k2 MuHt7nwXxcoU4knUQRNzCK86ME8f42MeQTSmYt3LBxCcK2aDnReU/JQQBUMZlkaDHuXS kFUnX8efKtGMC7QPMLg2XvD725KowvAzxgXDPZocaz5Tm5Zrf0BLAYs7TGruKEkeUdZd Misg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="tAc//YkJ"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hv21-20020a17090760d500b007ae4717bf11si12907965ejc.80.2022.11.16.09.28.48; Wed, 16 Nov 2022 09:29:24 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="tAc//YkJ"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234505AbiKPRSB (ORCPT + 99 others); Wed, 16 Nov 2022 12:18:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234309AbiKPRRk (ORCPT ); Wed, 16 Nov 2022 12:17:40 -0500 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA6C15B5B6; Wed, 16 Nov 2022 09:17:25 -0800 (PST) Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGGK3qt037051; Wed, 16 Nov 2022 17:17:02 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=XLl9jwPWrKUEGMxUqZ8ZOHMSN21/G6zNljyYvTYvI4o=; b=tAc//YkJSEKv5CGActK615FcZZw9K9CaK2oDBl3c87TwlRDXvzXEDV0bHVK+2ocapbya ynvurXvGYw+ZvGcB6Vp3XdSZCRyYZqIDqpa/3bRdIimhm0OFHCfnM+z5qVjDSHiADEL9 eCa6Ypipx4Ho1EidNSzFQggwG5NyYaqspShHMnBrqUpmao8idHqRF9wHbyVNlTyf3hoo c8MT3TGspZmaPp0JdkubkN9FuynTTHEg17vgE1ZxobEIeKflQthkvZl1rpEve3n/7RUI ZG/OdufgYAfX1qbSHnz5DSuiZxC90PJr2m/KbeKQ5g5zCrKALvYcLaPPgVAsZ3WnrLpa SQ== Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw39s1pr7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:02 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH6k8I009477; Wed, 16 Nov 2022 17:17:00 GMT Received: from b06avi18878370.portsmouth.uk.ibm.com (b06avi18878370.portsmouth.uk.ibm.com [9.149.26.194]) by ppma03ams.nl.ibm.com with ESMTP id 3kt348xawm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:00 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06avi18878370.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHHbju39518624 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:17:37 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 92C4A5204E; Wed, 16 Nov 2022 17:16:57 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 09A0D52050; Wed, 16 Nov 2022 17:16:57 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 1/7] s390/ism: Set DMA coherent mask Date: Wed, 16 Nov 2022 18:16:50 +0100 Message-Id: <20221116171656.4128212-2-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: l7jEHWgblB2GeUGu6m8stahNtk2oHL6n X-Proofpoint-GUID: l7jEHWgblB2GeUGu6m8stahNtk2oHL6n X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1011 malwarescore=0 phishscore=0 adultscore=0 mlxlogscore=999 lowpriorityscore=0 impostorscore=0 mlxscore=0 suspectscore=0 bulkscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674638063987404?= X-GMAIL-MSGID: =?utf-8?q?1749674638063987404?= A future change will convert the DMA API implementation from the architecture specific arch/s390/pci/pci_dma.c to using the common code drivers/iommu/dma-iommu.c which the utilizes the same IOMMU hardware through the s390-iommu driver. Unlike the s390 specific DMA API this requires devices to correctly call set the coherent mask to be allowed to use IOVAs >2^32 in dma_alloc_coherent(). This was however not done for ISM devices. ISM requires such addresses since currently the DMA aperture for PCI devices starts at 2^32 and all calls to dma_alloc_coherent() would thus fail. Signed-off-by: Niklas Schnelle --- v1 -> v2: - Use dma_set_mask_and_coherent() (Christoph Hellwig) drivers/s390/net/ism_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c index d34bb6ec1490..da8a549b5965 100644 --- a/drivers/s390/net/ism_drv.c +++ b/drivers/s390/net/ism_drv.c @@ -556,7 +556,7 @@ static int ism_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (ret) goto err_disable; - ret = dma_set_mask(&pdev->dev, DMA_BIT_MASK(64)); + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); if (ret) goto err_resource; From patchwork Wed Nov 16 17:16:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21200 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp263768wru; Wed, 16 Nov 2022 09:27:59 -0800 (PST) X-Google-Smtp-Source: AA0mqf5aZMsv3a78HRtr1gklwERwZRD3xXhQ8HvjfmK/VUeqU/INPEcjAFQvABpIvJhChe/NPqkf X-Received: by 2002:a17:906:e0d7:b0:78d:38cd:f4bc with SMTP id gl23-20020a170906e0d700b0078d38cdf4bcmr17986776ejb.192.1668619679209; Wed, 16 Nov 2022 09:27:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619679; cv=none; d=google.com; s=arc-20160816; b=F2oHFJYicXulLeaAAiN7i+FV+xDhTMc/tMUnv6G2+xtimL27OXgPkUU4FBkPnK2bEx aXyyk13wZb3lOnSWOhe4IvmACCnKuQQUImQuxgmfNFu5rn54ZWKvxN2EkoPNoDbCOVmZ 6E092jnNgd7MycwsAWY+Odp8u6qMFlrakL2sdFRu7JNiRWl/pHVnh6+7n6wZzKQQZByN bvz4RJSNinMqUQfqqa47ULML3aEqTw7plSRXeRXnkU9cyi2voJeNhuVaWyfwSncz/1GY PakBXxW7ZtdaAw8IqNLHe1vghAcOtn5Av6oXToti5UNU2rHOQzXgxTmNygIxyaFD1oSt uS+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=hP0kCovPbWmZTSy7tSN9XNIJQRipnc5HWdbuDbVN2YU=; b=jdtMTpkVCuj3eGoqGssrG3BgJeY3PBYA2VxZJ0DnalAaOusG1G0XNX3M5Hl3meehK1 gC7EfIzXYmwODmGIzOhlI8fOTIkIgYWsqXeXJijV13EgGk7PrL7KjOidaOvfi+KK6d5m mJ2wIZ+JUAGSl6p90oFpXvjw8vRQFQX8rYA/KFlrccpbY85jQ7Tq5bb4e+ReNueburC4 gGf8LX/REnh1EirWtBDAGfYPEbRO7r7vilbOLuCWIT2BVLplA2tICqFgUqRD5SYqIXlS HmIDC5ys/xUtljFITK0g7bbL00JcbLtpMwgGfU6ywSXu7+HLI+lN3njzBAjM53cEeU9T nHrA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=I5ppI5Sa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hc10-20020a170907168a00b007add0c2ee2csi15689251ejc.924.2022.11.16.09.27.36; Wed, 16 Nov 2022 09:27:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=I5ppI5Sa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234492AbiKPRRn (ORCPT + 99 others); Wed, 16 Nov 2022 12:17:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234220AbiKPRRY (ORCPT ); Wed, 16 Nov 2022 12:17:24 -0500 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DAD7A5987F; Wed, 16 Nov 2022 09:17:23 -0800 (PST) Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGGm8qX004040; Wed, 16 Nov 2022 17:17:03 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=hP0kCovPbWmZTSy7tSN9XNIJQRipnc5HWdbuDbVN2YU=; b=I5ppI5Sa34t7XL8lJrgphc2wQQ6qaz0Cir62hE3QYn411ISMlvR7ik0pGZ7rtQPHqKCt sLcdBzZ+Vufl+ryjEWYVGvnT8oiDTfCah0H+maQwAl7HCU/Q0rt6Eh+zsC01IGhyPUcP T9lh2d2Ej2MKn4IxLfWKA1s7nv2HzwVtoCChlP+orRwTMgh+mZAqZDptRrG2p54yOdjE +bISHKhB2Jki+7AJnl5K3aPRoB3aiOKc1D4VXkxmK13zfNv3laDQMyPwzcIlKugxNVKb cS3ls0tlHLmjN0+gmCR8FX36Gdm20HSK2Wr4tVF4M5FPDoFYHcr21LTPVP/U8qCRwvhf vQ== Received: from ppma04fra.de.ibm.com (6a.4a.5195.ip4.static.sl-reverse.com [149.81.74.106]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw3q1rtkq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:03 +0000 Received: from pps.filterd (ppma04fra.de.ibm.com [127.0.0.1]) by ppma04fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH5PtH021164; Wed, 16 Nov 2022 17:17:01 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma04fra.de.ibm.com with ESMTP id 3kt349cqmf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:01 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHAwoM44892494 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:10:58 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 206885204E; Wed, 16 Nov 2022 17:16:58 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 94D205204F; Wed, 16 Nov 2022 17:16:57 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 2/7] s390/pci: prepare is_passed_through() for dma-iommu Date: Wed, 16 Nov 2022 18:16:51 +0100 Message-Id: <20221116171656.4128212-3-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 1tce3CspqI2aIwvCdZzZ0lQnt6ul_v4B X-Proofpoint-GUID: 1tce3CspqI2aIwvCdZzZ0lQnt6ul_v4B X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 spamscore=0 priorityscore=1501 mlxscore=0 phishscore=0 mlxlogscore=965 lowpriorityscore=0 bulkscore=0 malwarescore=0 adultscore=0 clxscore=1015 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674548416134433?= X-GMAIL-MSGID: =?utf-8?q?1749674548416134433?= With the IOMMU always controlled through the IOMMU driver testing for zdev->s390_domain is not a valid indication of the device being passed-through. Instead test if zdev->kzdev is set. Reviewed-by: Pierre Morel Signed-off-by: Niklas Schnelle --- arch/s390/pci/pci_event.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index b9324ca2eb94..4ef5a6a1d618 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -59,9 +59,16 @@ static inline bool ers_result_indicates_abort(pci_ers_result_t ers_res) } } -static bool is_passed_through(struct zpci_dev *zdev) +static bool is_passed_through(struct pci_dev *pdev) { - return zdev->s390_domain; + struct zpci_dev *zdev = to_zpci(pdev); + bool ret; + + mutex_lock(&zdev->kzdev_lock); + ret = !!zdev->kzdev; + mutex_unlock(&zdev->kzdev_lock); + + return ret; } static bool is_driver_supported(struct pci_driver *driver) @@ -176,7 +183,7 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) } pdev->error_state = pci_channel_io_frozen; - if (is_passed_through(to_zpci(pdev))) { + if (is_passed_through(pdev)) { pr_info("%s: Cannot be recovered in the host because it is a pass-through device\n", pci_name(pdev)); goto out_unlock; @@ -239,7 +246,7 @@ static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es) * we will inject the error event and let the guest recover the device * itself. */ - if (is_passed_through(to_zpci(pdev))) + if (is_passed_through(pdev)) goto out; driver = to_pci_driver(pdev->dev.driver); if (driver && driver->err_handler && driver->err_handler->error_detected) From patchwork Wed Nov 16 17:16:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21204 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp264489wru; Wed, 16 Nov 2022 09:29:35 -0800 (PST) X-Google-Smtp-Source: AA0mqf5u9eXfrh0npp7q9YGHDdd7hLBCPpxsuK+oZTAnQukUIp/DaxuXaaEZRIM8NJftGPNrRUu/ X-Received: by 2002:a63:ce58:0:b0:476:c39b:ab5b with SMTP id r24-20020a63ce58000000b00476c39bab5bmr6935253pgi.565.1668619775430; Wed, 16 Nov 2022 09:29:35 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619775; cv=none; d=google.com; s=arc-20160816; b=e8Y0/rPCufHnEi/czwB4RTqY4MGt5vFW/dkAfjtYFFPavElJ83BmWGl+6sQdhV06IX PjCMrG21BYrUm3n4XHezUQ0hZdXZeMnm1s6ixY1fMtLXvjmW6hiBdv1f+2bt8i1vMbDT wiBbiI0ibX6xKih+kQw2zDN6RO2t+HGOuekBqsIsnsJtBDxuwJ/5OlFgkMrusuea8npa lycOg6VB5LArBkRBARQlmh21wncrA6fkpDM5EzCdp81gfgrq1h3tN8GgzJSd/JsGDH/O EacITKe3t9fgMlO1vEQxXN/5iDBLG2djFbgB02+NWmQRnks2Ti5bqYXN+Qu7QQk2cHmd 0n8w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=dUKrLQuykJQTkJ45R4KKE5wSKpLokIpCPBxiQjkQ4w0=; b=cpwvQ0E4vWsBitAeZ0nqhpv5Ba/ZHxVo/Um+oTGPDDxGoGtZdVcAJfPGIi/fHPRjRQ 1MG47IkEMl01HI/K7xg2JoeGah+uDRaLzCJ+L/gG6h9WsjCoqg4bxaLQTDXp/+e8ESFD jXgVTd08wBhT23u2gLrNEO/aPcyonAWb2n9ADw8HbVjM00TSG/6DasIGwIKba4g+hGg7 VLNXYNWjirDuysAYFf4GHvC5t+RvY0mM2tHEvikeAjczDzBzxJr0VXCAcRwznikDSghk lbVDMYm8pgvmNM/DbWZkaK/VkMfx/kEWWDehrXxdgXAOy432T7WU/Hc2liJa2kQSqMM3 Y3LQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=B0rbgnEE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r12-20020a632b0c000000b004344b581925si15432315pgr.879.2022.11.16.09.29.20; Wed, 16 Nov 2022 09:29:35 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=B0rbgnEE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238603AbiKPRSF (ORCPT + 99 others); Wed, 16 Nov 2022 12:18:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234451AbiKPRRm (ORCPT ); Wed, 16 Nov 2022 12:17:42 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 674EA5B865; Wed, 16 Nov 2022 09:17:27 -0800 (PST) Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGH0gRL016616; Wed, 16 Nov 2022 17:17:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=dUKrLQuykJQTkJ45R4KKE5wSKpLokIpCPBxiQjkQ4w0=; b=B0rbgnEEGzQIR/lZF/n5+U4SBCB3867k3QFh3Kd6dJKD4WiiAcFKQD42e12OMV0bYShr 0/ZduzpzvXKvUqiE5GxIP/yAk8Bj8gFcQcPliwN4CXetggheWBdNe1Y2x+VqJUoAnsyx x+e9oVHUl782D53zZ7oPoE1hJbtrhGgh2MIBBO0HdHBrfcWcpmQvhbvJTbHV5+yD4sXI 9txhbUvMoKBzsHguNXJz0teWd+B7IlohLvvwZi7p7ibq1UD62RAC7aWkA1OMlIZIV/lm B+tTCL8L1JVWNN+LIDQXXlyMez/Y9dmaPNvk1bh3l7YBhYqu5QqXKwXfzKp3VNWu5mAI nA== Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw3vwrf8c-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:08 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH5llg024480; Wed, 16 Nov 2022 17:17:02 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma04ams.nl.ibm.com with ESMTP id 3kt348x9jx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:02 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHAwqm49414406 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:10:58 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A9F5952051; Wed, 16 Nov 2022 17:16:58 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 22FB052050; Wed, 16 Nov 2022 17:16:58 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 3/7] s390/pci: Use dma-iommu layer Date: Wed, 16 Nov 2022 18:16:52 +0100 Message-Id: <20221116171656.4128212-4-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 1HLImKCpnu7RwvsYzo_hZrZKIA44xovk X-Proofpoint-GUID: 1HLImKCpnu7RwvsYzo_hZrZKIA44xovk X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 spamscore=0 priorityscore=1501 mlxscore=0 lowpriorityscore=0 suspectscore=0 malwarescore=0 bulkscore=0 mlxlogscore=999 phishscore=0 clxscore=1015 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674649265299644?= X-GMAIL-MSGID: =?utf-8?q?1749674649265299644?= While s390 already has a standard IOMMU driver and previous changes have added I/O TLB flushing operations this driver is currently only used for user-space PCI access such as vfio-pci. For the DMA API s390 instead utilizes its own implementation in arch/s390/pci/pci_dma.c which drives the same hardware and shares some code but requires a complex and fragile hand over between DMA API and IOMMU API use of a device and despite code sharing still leads to significant duplication and maintenance effort. Let's utilize the common code DMAP API implementation from drivers/iommu/dma-iommu.c instead allowing us to get rid of arch/s390/pci/pci_dma.c. Signed-off-by: Niklas Schnelle --- .../admin-guide/kernel-parameters.txt | 9 +- arch/s390/include/asm/pci.h | 7 - arch/s390/include/asm/pci_dma.h | 120 +-- arch/s390/pci/Makefile | 2 +- arch/s390/pci/pci.c | 22 +- arch/s390/pci/pci_bus.c | 5 - arch/s390/pci/pci_debug.c | 13 +- arch/s390/pci/pci_dma.c | 732 ------------------ arch/s390/pci/pci_event.c | 2 - arch/s390/pci/pci_sysfs.c | 19 +- drivers/iommu/Kconfig | 3 +- drivers/iommu/s390-iommu.c | 391 +++++++++- 12 files changed, 408 insertions(+), 917 deletions(-) delete mode 100644 arch/s390/pci/pci_dma.c diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a465d5242774..633312c4f800 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2154,7 +2154,7 @@ forcing Dual Address Cycle for PCI cards supporting greater than 32-bit addressing. - iommu.strict= [ARM64, X86] Configure TLB invalidation behaviour + iommu.strict= [ARM64, X86, S390] Configure TLB invalidation behaviour Format: { "0" | "1" } 0 - Lazy mode. Request that DMA unmap operations use deferred @@ -5387,9 +5387,10 @@ s390_iommu= [HW,S390] Set s390 IOTLB flushing mode strict - With strict flushing every unmap operation will result in - an IOTLB flush. Default is lazy flushing before reuse, - which is faster. + With strict flushing every unmap operation will result + in an IOTLB flush. Default is lazy flushing before + reuse, which is faster. Deprecated, equivalent to + iommu.strict=1. s390_iommu_aperture= [KNL,S390] Specifies the size of the per device DMA address space diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index b248694e0024..3f74f1cf37df 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -159,13 +159,6 @@ struct zpci_dev { unsigned long *dma_table; int tlb_refresh; - spinlock_t iommu_bitmap_lock; - unsigned long *iommu_bitmap; - unsigned long *lazy_bitmap; - unsigned long iommu_size; - unsigned long iommu_pages; - unsigned int next_bit; - struct iommu_device iommu_dev; /* IOMMU core handle */ char res_name[16]; diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h index 91e63426bdc5..42d7cc4262ca 100644 --- a/arch/s390/include/asm/pci_dma.h +++ b/arch/s390/include/asm/pci_dma.h @@ -82,116 +82,16 @@ enum zpci_ioat_dtype { #define ZPCI_TABLE_VALID_MASK 0x20 #define ZPCI_TABLE_PROT_MASK 0x200 -static inline unsigned int calc_rtx(dma_addr_t ptr) -{ - return ((unsigned long) ptr >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK; -} - -static inline unsigned int calc_sx(dma_addr_t ptr) -{ - return ((unsigned long) ptr >> ZPCI_ST_SHIFT) & ZPCI_INDEX_MASK; -} - -static inline unsigned int calc_px(dma_addr_t ptr) -{ - return ((unsigned long) ptr >> PAGE_SHIFT) & ZPCI_PT_MASK; -} - -static inline void set_pt_pfaa(unsigned long *entry, phys_addr_t pfaa) -{ - *entry &= ZPCI_PTE_FLAG_MASK; - *entry |= (pfaa & ZPCI_PTE_ADDR_MASK); -} - -static inline void set_rt_sto(unsigned long *entry, phys_addr_t sto) -{ - *entry &= ZPCI_RTE_FLAG_MASK; - *entry |= (sto & ZPCI_RTE_ADDR_MASK); - *entry |= ZPCI_TABLE_TYPE_RTX; -} - -static inline void set_st_pto(unsigned long *entry, phys_addr_t pto) -{ - *entry &= ZPCI_STE_FLAG_MASK; - *entry |= (pto & ZPCI_STE_ADDR_MASK); - *entry |= ZPCI_TABLE_TYPE_SX; -} - -static inline void validate_rt_entry(unsigned long *entry) -{ - *entry &= ~ZPCI_TABLE_VALID_MASK; - *entry &= ~ZPCI_TABLE_OFFSET_MASK; - *entry |= ZPCI_TABLE_VALID; - *entry |= ZPCI_TABLE_LEN_RTX; -} - -static inline void validate_st_entry(unsigned long *entry) -{ - *entry &= ~ZPCI_TABLE_VALID_MASK; - *entry |= ZPCI_TABLE_VALID; -} - -static inline void invalidate_pt_entry(unsigned long *entry) -{ - WARN_ON_ONCE((*entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_INVALID); - *entry &= ~ZPCI_PTE_VALID_MASK; - *entry |= ZPCI_PTE_INVALID; -} - -static inline void validate_pt_entry(unsigned long *entry) -{ - WARN_ON_ONCE((*entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID); - *entry &= ~ZPCI_PTE_VALID_MASK; - *entry |= ZPCI_PTE_VALID; -} - -static inline void entry_set_protected(unsigned long *entry) -{ - *entry &= ~ZPCI_TABLE_PROT_MASK; - *entry |= ZPCI_TABLE_PROTECTED; -} - -static inline void entry_clr_protected(unsigned long *entry) -{ - *entry &= ~ZPCI_TABLE_PROT_MASK; - *entry |= ZPCI_TABLE_UNPROTECTED; -} - -static inline int reg_entry_isvalid(unsigned long entry) -{ - return (entry & ZPCI_TABLE_VALID_MASK) == ZPCI_TABLE_VALID; -} - -static inline int pt_entry_isvalid(unsigned long entry) -{ - return (entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID; -} - -static inline unsigned long *get_rt_sto(unsigned long entry) -{ - if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RTX) - return phys_to_virt(entry & ZPCI_RTE_ADDR_MASK); - else - return NULL; - -} - -static inline unsigned long *get_st_pto(unsigned long entry) -{ - if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_SX) - return phys_to_virt(entry & ZPCI_STE_ADDR_MASK); - else - return NULL; -} - -/* Prototypes */ -void dma_free_seg_table(unsigned long); -unsigned long *dma_alloc_cpu_table(void); -void dma_cleanup_tables(unsigned long *); -unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr); -void dma_update_cpu_trans(unsigned long *entry, phys_addr_t page_addr, int flags); - -extern const struct dma_map_ops s390_pci_dma_ops; +struct zpci_iommu_ctrs { + atomic64_t mapped_pages; + atomic64_t unmapped_pages; + atomic64_t global_rpcits; + atomic64_t sync_map_rpcits; + atomic64_t sync_rpcits; +}; + +struct zpci_dev; +struct zpci_iommu_ctrs *zpci_get_iommu_ctrs(struct zpci_dev *zdev); #endif diff --git a/arch/s390/pci/Makefile b/arch/s390/pci/Makefile index 5ae31ca9dd44..0547a10406e7 100644 --- a/arch/s390/pci/Makefile +++ b/arch/s390/pci/Makefile @@ -3,7 +3,7 @@ # Makefile for the s390 PCI subsystem. # -obj-$(CONFIG_PCI) += pci.o pci_irq.o pci_dma.o pci_clp.o pci_sysfs.o \ +obj-$(CONFIG_PCI) += pci.o pci_irq.o pci_clp.o pci_sysfs.o \ pci_event.o pci_debug.o pci_insn.o pci_mmio.o \ pci_bus.o pci_kvm_hook.o obj-$(CONFIG_PCI_IOV) += pci_iov.o diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index ef38b1514c77..6b0fe8761509 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -124,7 +124,11 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas, WARN_ON_ONCE(iota & 0x3fff); fib.pba = base; - fib.pal = limit; + /* Work around off by one in ISM virt device */ + if (zdev->pft == 0x5 && limit > base) + fib.pal = limit + (1 << 12); + else + fib.pal = limit; fib.iota = iota | ZPCI_IOTA_RTTO_FLAG; fib.gd = zdev->gisa; cc = zpci_mod_fc(req, &fib, status); @@ -615,7 +619,6 @@ int pcibios_device_add(struct pci_dev *pdev) pdev->no_vf_scan = 1; pdev->dev.groups = zpci_attr_groups; - pdev->dev.dma_ops = &s390_pci_dma_ops; zpci_map_resources(pdev); for (i = 0; i < PCI_STD_NUM_BARS; i++) { @@ -789,8 +792,6 @@ int zpci_hot_reset_device(struct zpci_dev *zdev) if (zdev->dma_table) rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, virt_to_phys(zdev->dma_table), &status); - else - rc = zpci_dma_init_device(zdev); if (rc) { zpci_disable_device(zdev); return rc; @@ -915,11 +916,6 @@ int zpci_deconfigure_device(struct zpci_dev *zdev) if (zdev->zbus->bus) zpci_bus_remove_device(zdev, false); - if (zdev->dma_table) { - rc = zpci_dma_exit_device(zdev); - if (rc) - return rc; - } if (zdev_enabled(zdev)) { rc = zpci_disable_device(zdev); if (rc) @@ -968,8 +964,6 @@ void zpci_release_device(struct kref *kref) if (zdev->zbus->bus) zpci_bus_remove_device(zdev, false); - if (zdev->dma_table) - zpci_dma_exit_device(zdev); if (zdev_enabled(zdev)) zpci_disable_device(zdev); @@ -1159,10 +1153,6 @@ static int __init pci_base_init(void) if (rc) goto out_irq; - rc = zpci_dma_init(); - if (rc) - goto out_dma; - rc = clp_scan_pci_devices(); if (rc) goto out_find; @@ -1172,8 +1162,6 @@ static int __init pci_base_init(void) return 0; out_find: - zpci_dma_exit(); -out_dma: zpci_irq_exit(); out_irq: zpci_mem_exit(); diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c index 6a8da1b742ae..b15ad15999f8 100644 --- a/arch/s390/pci/pci_bus.c +++ b/arch/s390/pci/pci_bus.c @@ -49,11 +49,6 @@ static int zpci_bus_prepare_device(struct zpci_dev *zdev) rc = zpci_enable_device(zdev); if (rc) return rc; - rc = zpci_dma_init_device(zdev); - if (rc) { - zpci_disable_device(zdev); - return rc; - } } if (!zdev->has_resources) { diff --git a/arch/s390/pci/pci_debug.c b/arch/s390/pci/pci_debug.c index ca6bd98eec13..60cec57a3907 100644 --- a/arch/s390/pci/pci_debug.c +++ b/arch/s390/pci/pci_debug.c @@ -53,9 +53,12 @@ static char *pci_fmt3_names[] = { }; static char *pci_sw_names[] = { - "Allocated pages", +/* TODO "Allocated pages", */ "Mapped pages", "Unmapped pages", + "Global RPCITs", + "Sync Map RPCITs", + "Sync RPCITs", }; static void pci_fmb_show(struct seq_file *m, char *name[], int length, @@ -69,10 +72,14 @@ static void pci_fmb_show(struct seq_file *m, char *name[], int length, static void pci_sw_counter_show(struct seq_file *m) { - struct zpci_dev *zdev = m->private; - atomic64_t *counter = &zdev->allocated_pages; + struct zpci_iommu_ctrs *ctrs = zpci_get_iommu_ctrs(m->private); + atomic64_t *counter; int i; + if (!ctrs) + return; + + counter = &ctrs->mapped_pages; for (i = 0; i < ARRAY_SIZE(pci_sw_names); i++, counter++) seq_printf(m, "%26s:\t%llu\n", pci_sw_names[i], atomic64_read(counter)); diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c deleted file mode 100644 index ea478d11fbd1..000000000000 --- a/arch/s390/pci/pci_dma.c +++ /dev/null @@ -1,732 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Copyright IBM Corp. 2012 - * - * Author(s): - * Jan Glauber - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -static struct kmem_cache *dma_region_table_cache; -static struct kmem_cache *dma_page_table_cache; -static int s390_iommu_strict; -static u64 s390_iommu_aperture; -static u32 s390_iommu_aperture_factor = 1; - -static int zpci_refresh_global(struct zpci_dev *zdev) -{ - return zpci_refresh_trans((u64) zdev->fh << 32, zdev->start_dma, - zdev->iommu_pages * PAGE_SIZE); -} - -unsigned long *dma_alloc_cpu_table(void) -{ - unsigned long *table, *entry; - - table = kmem_cache_alloc(dma_region_table_cache, GFP_ATOMIC); - if (!table) - return NULL; - - for (entry = table; entry < table + ZPCI_TABLE_ENTRIES; entry++) - *entry = ZPCI_TABLE_INVALID; - return table; -} - -static void dma_free_cpu_table(void *table) -{ - kmem_cache_free(dma_region_table_cache, table); -} - -static unsigned long *dma_alloc_page_table(void) -{ - unsigned long *table, *entry; - - table = kmem_cache_alloc(dma_page_table_cache, GFP_ATOMIC); - if (!table) - return NULL; - - for (entry = table; entry < table + ZPCI_PT_ENTRIES; entry++) - *entry = ZPCI_PTE_INVALID; - return table; -} - -static void dma_free_page_table(void *table) -{ - kmem_cache_free(dma_page_table_cache, table); -} - -static unsigned long *dma_get_seg_table_origin(unsigned long *rtep) -{ - unsigned long old_rte, rte; - unsigned long *sto; - - rte = READ_ONCE(*rtep); - if (reg_entry_isvalid(rte)) { - sto = get_rt_sto(rte); - } else { - sto = dma_alloc_cpu_table(); - if (!sto) - return NULL; - - set_rt_sto(&rte, virt_to_phys(sto)); - validate_rt_entry(&rte); - entry_clr_protected(&rte); - - old_rte = cmpxchg(rtep, ZPCI_TABLE_INVALID, rte); - if (old_rte != ZPCI_TABLE_INVALID) { - /* Somone else was faster, use theirs */ - dma_free_cpu_table(sto); - sto = get_rt_sto(old_rte); - } - } - return sto; -} - -static unsigned long *dma_get_page_table_origin(unsigned long *step) -{ - unsigned long old_ste, ste; - unsigned long *pto; - - ste = READ_ONCE(*step); - if (reg_entry_isvalid(ste)) { - pto = get_st_pto(ste); - } else { - pto = dma_alloc_page_table(); - if (!pto) - return NULL; - set_st_pto(&ste, virt_to_phys(pto)); - validate_st_entry(&ste); - entry_clr_protected(&ste); - - old_ste = cmpxchg(step, ZPCI_TABLE_INVALID, ste); - if (old_ste != ZPCI_TABLE_INVALID) { - /* Somone else was faster, use theirs */ - dma_free_page_table(pto); - pto = get_st_pto(old_ste); - } - } - return pto; -} - -unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr) -{ - unsigned long *sto, *pto; - unsigned int rtx, sx, px; - - rtx = calc_rtx(dma_addr); - sto = dma_get_seg_table_origin(&rto[rtx]); - if (!sto) - return NULL; - - sx = calc_sx(dma_addr); - pto = dma_get_page_table_origin(&sto[sx]); - if (!pto) - return NULL; - - px = calc_px(dma_addr); - return &pto[px]; -} - -void dma_update_cpu_trans(unsigned long *ptep, phys_addr_t page_addr, int flags) -{ - unsigned long pte; - - pte = READ_ONCE(*ptep); - if (flags & ZPCI_PTE_INVALID) { - invalidate_pt_entry(&pte); - } else { - set_pt_pfaa(&pte, page_addr); - validate_pt_entry(&pte); - } - - if (flags & ZPCI_TABLE_PROTECTED) - entry_set_protected(&pte); - else - entry_clr_protected(&pte); - - xchg(ptep, pte); -} - -static int __dma_update_trans(struct zpci_dev *zdev, phys_addr_t pa, - dma_addr_t dma_addr, size_t size, int flags) -{ - unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; - phys_addr_t page_addr = (pa & PAGE_MASK); - unsigned long *entry; - int i, rc = 0; - - if (!nr_pages) - return -EINVAL; - - if (!zdev->dma_table) - return -EINVAL; - - for (i = 0; i < nr_pages; i++) { - entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr); - if (!entry) { - rc = -ENOMEM; - goto undo_cpu_trans; - } - dma_update_cpu_trans(entry, page_addr, flags); - page_addr += PAGE_SIZE; - dma_addr += PAGE_SIZE; - } - -undo_cpu_trans: - if (rc && ((flags & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID)) { - flags = ZPCI_PTE_INVALID; - while (i-- > 0) { - page_addr -= PAGE_SIZE; - dma_addr -= PAGE_SIZE; - entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr); - if (!entry) - break; - dma_update_cpu_trans(entry, page_addr, flags); - } - } - return rc; -} - -static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr, - size_t size, int flags) -{ - unsigned long irqflags; - int ret; - - /* - * With zdev->tlb_refresh == 0, rpcit is not required to establish new - * translations when previously invalid translation-table entries are - * validated. With lazy unmap, rpcit is skipped for previously valid - * entries, but a global rpcit is then required before any address can - * be re-used, i.e. after each iommu bitmap wrap-around. - */ - if ((flags & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID) { - if (!zdev->tlb_refresh) - return 0; - } else { - if (!s390_iommu_strict) - return 0; - } - - ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr, - PAGE_ALIGN(size)); - if (ret == -ENOMEM && !s390_iommu_strict) { - /* enable the hypervisor to free some resources */ - if (zpci_refresh_global(zdev)) - goto out; - - spin_lock_irqsave(&zdev->iommu_bitmap_lock, irqflags); - bitmap_andnot(zdev->iommu_bitmap, zdev->iommu_bitmap, - zdev->lazy_bitmap, zdev->iommu_pages); - bitmap_zero(zdev->lazy_bitmap, zdev->iommu_pages); - spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, irqflags); - ret = 0; - } -out: - return ret; -} - -static int dma_update_trans(struct zpci_dev *zdev, phys_addr_t pa, - dma_addr_t dma_addr, size_t size, int flags) -{ - int rc; - - rc = __dma_update_trans(zdev, pa, dma_addr, size, flags); - if (rc) - return rc; - - rc = __dma_purge_tlb(zdev, dma_addr, size, flags); - if (rc && ((flags & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID)) - __dma_update_trans(zdev, pa, dma_addr, size, ZPCI_PTE_INVALID); - - return rc; -} - -void dma_free_seg_table(unsigned long entry) -{ - unsigned long *sto = get_rt_sto(entry); - int sx; - - for (sx = 0; sx < ZPCI_TABLE_ENTRIES; sx++) - if (reg_entry_isvalid(sto[sx])) - dma_free_page_table(get_st_pto(sto[sx])); - - dma_free_cpu_table(sto); -} - -void dma_cleanup_tables(unsigned long *table) -{ - int rtx; - - if (!table) - return; - - for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++) - if (reg_entry_isvalid(table[rtx])) - dma_free_seg_table(table[rtx]); - - dma_free_cpu_table(table); -} - -static unsigned long __dma_alloc_iommu(struct device *dev, - unsigned long start, int size) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - - return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages, - start, size, zdev->start_dma >> PAGE_SHIFT, - dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT), - 0); -} - -static dma_addr_t dma_alloc_address(struct device *dev, int size) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - unsigned long offset, flags; - - spin_lock_irqsave(&zdev->iommu_bitmap_lock, flags); - offset = __dma_alloc_iommu(dev, zdev->next_bit, size); - if (offset == -1) { - if (!s390_iommu_strict) { - /* global flush before DMA addresses are reused */ - if (zpci_refresh_global(zdev)) - goto out_error; - - bitmap_andnot(zdev->iommu_bitmap, zdev->iommu_bitmap, - zdev->lazy_bitmap, zdev->iommu_pages); - bitmap_zero(zdev->lazy_bitmap, zdev->iommu_pages); - } - /* wrap-around */ - offset = __dma_alloc_iommu(dev, 0, size); - if (offset == -1) - goto out_error; - } - zdev->next_bit = offset + size; - spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, flags); - - return zdev->start_dma + offset * PAGE_SIZE; - -out_error: - spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, flags); - return DMA_MAPPING_ERROR; -} - -static void dma_free_address(struct device *dev, dma_addr_t dma_addr, int size) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - unsigned long flags, offset; - - offset = (dma_addr - zdev->start_dma) >> PAGE_SHIFT; - - spin_lock_irqsave(&zdev->iommu_bitmap_lock, flags); - if (!zdev->iommu_bitmap) - goto out; - - if (s390_iommu_strict) - bitmap_clear(zdev->iommu_bitmap, offset, size); - else - bitmap_set(zdev->lazy_bitmap, offset, size); - -out: - spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, flags); -} - -static inline void zpci_err_dma(unsigned long rc, unsigned long addr) -{ - struct { - unsigned long rc; - unsigned long addr; - } __packed data = {rc, addr}; - - zpci_err_hex(&data, sizeof(data)); -} - -static dma_addr_t s390_dma_map_pages(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction direction, - unsigned long attrs) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - unsigned long pa = page_to_phys(page) + offset; - int flags = ZPCI_PTE_VALID; - unsigned long nr_pages; - dma_addr_t dma_addr; - int ret; - - /* This rounds up number of pages based on size and offset */ - nr_pages = iommu_num_pages(pa, size, PAGE_SIZE); - dma_addr = dma_alloc_address(dev, nr_pages); - if (dma_addr == DMA_MAPPING_ERROR) { - ret = -ENOSPC; - goto out_err; - } - - /* Use rounded up size */ - size = nr_pages * PAGE_SIZE; - - if (direction == DMA_NONE || direction == DMA_TO_DEVICE) - flags |= ZPCI_TABLE_PROTECTED; - - ret = dma_update_trans(zdev, pa, dma_addr, size, flags); - if (ret) - goto out_free; - - atomic64_add(nr_pages, &zdev->mapped_pages); - return dma_addr + (offset & ~PAGE_MASK); - -out_free: - dma_free_address(dev, dma_addr, nr_pages); -out_err: - zpci_err("map error:\n"); - zpci_err_dma(ret, pa); - return DMA_MAPPING_ERROR; -} - -static void s390_dma_unmap_pages(struct device *dev, dma_addr_t dma_addr, - size_t size, enum dma_data_direction direction, - unsigned long attrs) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - int npages, ret; - - npages = iommu_num_pages(dma_addr, size, PAGE_SIZE); - dma_addr = dma_addr & PAGE_MASK; - ret = dma_update_trans(zdev, 0, dma_addr, npages * PAGE_SIZE, - ZPCI_PTE_INVALID); - if (ret) { - zpci_err("unmap error:\n"); - zpci_err_dma(ret, dma_addr); - return; - } - - atomic64_add(npages, &zdev->unmapped_pages); - dma_free_address(dev, dma_addr, npages); -} - -static void *s390_dma_alloc(struct device *dev, size_t size, - dma_addr_t *dma_handle, gfp_t flag, - unsigned long attrs) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - struct page *page; - phys_addr_t pa; - dma_addr_t map; - - size = PAGE_ALIGN(size); - page = alloc_pages(flag | __GFP_ZERO, get_order(size)); - if (!page) - return NULL; - - pa = page_to_phys(page); - map = s390_dma_map_pages(dev, page, 0, size, DMA_BIDIRECTIONAL, 0); - if (dma_mapping_error(dev, map)) { - __free_pages(page, get_order(size)); - return NULL; - } - - atomic64_add(size / PAGE_SIZE, &zdev->allocated_pages); - if (dma_handle) - *dma_handle = map; - return phys_to_virt(pa); -} - -static void s390_dma_free(struct device *dev, size_t size, - void *vaddr, dma_addr_t dma_handle, - unsigned long attrs) -{ - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - - size = PAGE_ALIGN(size); - atomic64_sub(size / PAGE_SIZE, &zdev->allocated_pages); - s390_dma_unmap_pages(dev, dma_handle, size, DMA_BIDIRECTIONAL, 0); - free_pages((unsigned long)vaddr, get_order(size)); -} - -/* Map a segment into a contiguous dma address area */ -static int __s390_dma_map_sg(struct device *dev, struct scatterlist *sg, - size_t size, dma_addr_t *handle, - enum dma_data_direction dir) -{ - unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; - struct zpci_dev *zdev = to_zpci(to_pci_dev(dev)); - dma_addr_t dma_addr_base, dma_addr; - int flags = ZPCI_PTE_VALID; - struct scatterlist *s; - phys_addr_t pa = 0; - int ret; - - dma_addr_base = dma_alloc_address(dev, nr_pages); - if (dma_addr_base == DMA_MAPPING_ERROR) - return -ENOMEM; - - dma_addr = dma_addr_base; - if (dir == DMA_NONE || dir == DMA_TO_DEVICE) - flags |= ZPCI_TABLE_PROTECTED; - - for (s = sg; dma_addr < dma_addr_base + size; s = sg_next(s)) { - pa = page_to_phys(sg_page(s)); - ret = __dma_update_trans(zdev, pa, dma_addr, - s->offset + s->length, flags); - if (ret) - goto unmap; - - dma_addr += s->offset + s->length; - } - ret = __dma_purge_tlb(zdev, dma_addr_base, size, flags); - if (ret) - goto unmap; - - *handle = dma_addr_base; - atomic64_add(nr_pages, &zdev->mapped_pages); - - return ret; - -unmap: - dma_update_trans(zdev, 0, dma_addr_base, dma_addr - dma_addr_base, - ZPCI_PTE_INVALID); - dma_free_address(dev, dma_addr_base, nr_pages); - zpci_err("map error:\n"); - zpci_err_dma(ret, pa); - return ret; -} - -static int s390_dma_map_sg(struct device *dev, struct scatterlist *sg, - int nr_elements, enum dma_data_direction dir, - unsigned long attrs) -{ - struct scatterlist *s = sg, *start = sg, *dma = sg; - unsigned int max = dma_get_max_seg_size(dev); - unsigned int size = s->offset + s->length; - unsigned int offset = s->offset; - int count = 0, i, ret; - - for (i = 1; i < nr_elements; i++) { - s = sg_next(s); - - s->dma_length = 0; - - if (s->offset || (size & ~PAGE_MASK) || - size + s->length > max) { - ret = __s390_dma_map_sg(dev, start, size, - &dma->dma_address, dir); - if (ret) - goto unmap; - - dma->dma_address += offset; - dma->dma_length = size - offset; - - size = offset = s->offset; - start = s; - dma = sg_next(dma); - count++; - } - size += s->length; - } - ret = __s390_dma_map_sg(dev, start, size, &dma->dma_address, dir); - if (ret) - goto unmap; - - dma->dma_address += offset; - dma->dma_length = size - offset; - - return count + 1; -unmap: - for_each_sg(sg, s, count, i) - s390_dma_unmap_pages(dev, sg_dma_address(s), sg_dma_len(s), - dir, attrs); - - return ret; -} - -static void s390_dma_unmap_sg(struct device *dev, struct scatterlist *sg, - int nr_elements, enum dma_data_direction dir, - unsigned long attrs) -{ - struct scatterlist *s; - int i; - - for_each_sg(sg, s, nr_elements, i) { - if (s->dma_length) - s390_dma_unmap_pages(dev, s->dma_address, s->dma_length, - dir, attrs); - s->dma_address = 0; - s->dma_length = 0; - } -} - -int zpci_dma_init_device(struct zpci_dev *zdev) -{ - u8 status; - int rc; - - /* - * At this point, if the device is part of an IOMMU domain, this would - * be a strong hint towards a bug in the IOMMU API (common) code and/or - * simultaneous access via IOMMU and DMA API. So let's issue a warning. - */ - WARN_ON(zdev->s390_domain); - - spin_lock_init(&zdev->iommu_bitmap_lock); - - zdev->dma_table = dma_alloc_cpu_table(); - if (!zdev->dma_table) { - rc = -ENOMEM; - goto out; - } - - /* - * Restrict the iommu bitmap size to the minimum of the following: - * - s390_iommu_aperture which defaults to high_memory - * - 3-level pagetable address limit minus start_dma offset - * - DMA address range allowed by the hardware (clp query pci fn) - * - * Also set zdev->end_dma to the actual end address of the usable - * range, instead of the theoretical maximum as reported by hardware. - * - * This limits the number of concurrently usable DMA mappings since - * for each DMA mapped memory address we need a DMA address including - * extra DMA addresses for multiple mappings of the same memory address. - */ - zdev->start_dma = PAGE_ALIGN(zdev->start_dma); - zdev->iommu_size = min3(s390_iommu_aperture, - ZPCI_TABLE_SIZE_RT - zdev->start_dma, - zdev->end_dma - zdev->start_dma + 1); - zdev->end_dma = zdev->start_dma + zdev->iommu_size - 1; - zdev->iommu_pages = zdev->iommu_size >> PAGE_SHIFT; - zdev->iommu_bitmap = vzalloc(zdev->iommu_pages / 8); - if (!zdev->iommu_bitmap) { - rc = -ENOMEM; - goto free_dma_table; - } - if (!s390_iommu_strict) { - zdev->lazy_bitmap = vzalloc(zdev->iommu_pages / 8); - if (!zdev->lazy_bitmap) { - rc = -ENOMEM; - goto free_bitmap; - } - - } - if (zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, - virt_to_phys(zdev->dma_table), &status)) { - rc = -EIO; - goto free_bitmap; - } - - return 0; -free_bitmap: - vfree(zdev->iommu_bitmap); - zdev->iommu_bitmap = NULL; - vfree(zdev->lazy_bitmap); - zdev->lazy_bitmap = NULL; -free_dma_table: - dma_free_cpu_table(zdev->dma_table); - zdev->dma_table = NULL; -out: - return rc; -} - -int zpci_dma_exit_device(struct zpci_dev *zdev) -{ - int cc = 0; - - /* - * At this point, if the device is part of an IOMMU domain, this would - * be a strong hint towards a bug in the IOMMU API (common) code and/or - * simultaneous access via IOMMU and DMA API. So let's issue a warning. - */ - WARN_ON(zdev->s390_domain); - if (zdev_enabled(zdev)) - cc = zpci_unregister_ioat(zdev, 0); - /* - * cc == 3 indicates the function is gone already. This can happen - * if the function was deconfigured/disabled suddenly and we have not - * received a new handle yet. - */ - if (cc && cc != 3) - return -EIO; - - dma_cleanup_tables(zdev->dma_table); - zdev->dma_table = NULL; - vfree(zdev->iommu_bitmap); - zdev->iommu_bitmap = NULL; - vfree(zdev->lazy_bitmap); - zdev->lazy_bitmap = NULL; - zdev->next_bit = 0; - return 0; -} - -static int __init dma_alloc_cpu_table_caches(void) -{ - dma_region_table_cache = kmem_cache_create("PCI_DMA_region_tables", - ZPCI_TABLE_SIZE, ZPCI_TABLE_ALIGN, - 0, NULL); - if (!dma_region_table_cache) - return -ENOMEM; - - dma_page_table_cache = kmem_cache_create("PCI_DMA_page_tables", - ZPCI_PT_SIZE, ZPCI_PT_ALIGN, - 0, NULL); - if (!dma_page_table_cache) { - kmem_cache_destroy(dma_region_table_cache); - return -ENOMEM; - } - return 0; -} - -int __init zpci_dma_init(void) -{ - s390_iommu_aperture = (u64)virt_to_phys(high_memory); - if (!s390_iommu_aperture_factor) - s390_iommu_aperture = ULONG_MAX; - else - s390_iommu_aperture *= s390_iommu_aperture_factor; - - return dma_alloc_cpu_table_caches(); -} - -void zpci_dma_exit(void) -{ - kmem_cache_destroy(dma_page_table_cache); - kmem_cache_destroy(dma_region_table_cache); -} - -const struct dma_map_ops s390_pci_dma_ops = { - .alloc = s390_dma_alloc, - .free = s390_dma_free, - .map_sg = s390_dma_map_sg, - .unmap_sg = s390_dma_unmap_sg, - .map_page = s390_dma_map_pages, - .unmap_page = s390_dma_unmap_pages, - .mmap = dma_common_mmap, - .get_sgtable = dma_common_get_sgtable, - .alloc_pages = dma_common_alloc_pages, - .free_pages = dma_common_free_pages, - /* dma_supported is unconditionally true without a callback */ -}; -EXPORT_SYMBOL_GPL(s390_pci_dma_ops); - -static int __init s390_iommu_setup(char *str) -{ - if (!strcmp(str, "strict")) - s390_iommu_strict = 1; - return 1; -} - -__setup("s390_iommu=", s390_iommu_setup); - -static int __init s390_iommu_aperture_setup(char *str) -{ - if (kstrtou32(str, 10, &s390_iommu_aperture_factor)) - s390_iommu_aperture_factor = 1; - return 1; -} - -__setup("s390_iommu_aperture=", s390_iommu_aperture_setup); diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index 4ef5a6a1d618..4d9773ef9e0a 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -313,8 +313,6 @@ static void zpci_event_hard_deconfigured(struct zpci_dev *zdev, u32 fh) /* Even though the device is already gone we still * need to free zPCI resources as part of the disable. */ - if (zdev->dma_table) - zpci_dma_exit_device(zdev); if (zdev_enabled(zdev)) zpci_disable_device(zdev); zdev->state = ZPCI_FN_STATE_STANDBY; diff --git a/arch/s390/pci/pci_sysfs.c b/arch/s390/pci/pci_sysfs.c index cae280e5c047..8a7abac51816 100644 --- a/arch/s390/pci/pci_sysfs.c +++ b/arch/s390/pci/pci_sysfs.c @@ -56,6 +56,7 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr, struct pci_dev *pdev = to_pci_dev(dev); struct zpci_dev *zdev = to_zpci(pdev); int ret = 0; + u8 status; /* Can't use device_remove_self() here as that would lead us to lock * the pci_rescan_remove_lock while holding the device' kernfs lock. @@ -82,12 +83,6 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr, pci_lock_rescan_remove(); if (pci_dev_is_added(pdev)) { pci_stop_and_remove_bus_device(pdev); - if (zdev->dma_table) { - ret = zpci_dma_exit_device(zdev); - if (ret) - goto out; - } - if (zdev_enabled(zdev)) { ret = zpci_disable_device(zdev); /* @@ -105,14 +100,16 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr, ret = zpci_enable_device(zdev); if (ret) goto out; - ret = zpci_dma_init_device(zdev); - if (ret) { - zpci_disable_device(zdev); - goto out; + + if (zdev->dma_table) { + ret = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, + virt_to_phys(zdev->dma_table), &status); + if (ret) + zpci_disable_device(zdev); } - pci_rescan_bus(zdev->zbus->bus); } out: + pci_rescan_bus(zdev->zbus->bus); pci_unlock_rescan_remove(); if (kn) sysfs_unbreak_active_protection(kn); diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index dc5f7a156ff5..804fb8f42d61 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -93,7 +93,7 @@ config IOMMU_DEBUGFS choice prompt "IOMMU default domain type" depends on IOMMU_API - default IOMMU_DEFAULT_DMA_LAZY if X86 || IA64 + default IOMMU_DEFAULT_DMA_LAZY if X86 || IA64 || S390 default IOMMU_DEFAULT_DMA_STRICT help Choose the type of IOMMU domain used to manage DMA API usage by @@ -412,6 +412,7 @@ config ARM_SMMU_V3_SVA config S390_IOMMU def_bool y if S390 && PCI depends on S390 && PCI + select IOMMU_DMA select IOMMU_API help Support for the IOMMU API for s390 PCI devices. diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index ed33c6cce083..9fd788d64ac8 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -14,16 +14,300 @@ #include #include +#include "dma-iommu.h" + static const struct iommu_ops s390_iommu_ops; +static struct kmem_cache *dma_region_table_cache; +static struct kmem_cache *dma_page_table_cache; + +static u64 s390_iommu_aperture; +static u32 s390_iommu_aperture_factor = 1; + struct s390_domain { struct iommu_domain domain; struct list_head devices; + struct zpci_iommu_ctrs ctrs; unsigned long *dma_table; spinlock_t list_lock; struct rcu_head rcu; }; +static inline unsigned int calc_rtx(dma_addr_t ptr) +{ + return ((unsigned long)ptr >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK; +} + +static inline unsigned int calc_sx(dma_addr_t ptr) +{ + return ((unsigned long)ptr >> ZPCI_ST_SHIFT) & ZPCI_INDEX_MASK; +} + +static inline unsigned int calc_px(dma_addr_t ptr) +{ + return ((unsigned long)ptr >> PAGE_SHIFT) & ZPCI_PT_MASK; +} + +static inline void set_pt_pfaa(unsigned long *entry, phys_addr_t pfaa) +{ + *entry &= ZPCI_PTE_FLAG_MASK; + *entry |= (pfaa & ZPCI_PTE_ADDR_MASK); +} + +static inline void set_rt_sto(unsigned long *entry, phys_addr_t sto) +{ + *entry &= ZPCI_RTE_FLAG_MASK; + *entry |= (sto & ZPCI_RTE_ADDR_MASK); + *entry |= ZPCI_TABLE_TYPE_RTX; +} + +static inline void set_st_pto(unsigned long *entry, phys_addr_t pto) +{ + *entry &= ZPCI_STE_FLAG_MASK; + *entry |= (pto & ZPCI_STE_ADDR_MASK); + *entry |= ZPCI_TABLE_TYPE_SX; +} + +static inline void validate_rt_entry(unsigned long *entry) +{ + *entry &= ~ZPCI_TABLE_VALID_MASK; + *entry &= ~ZPCI_TABLE_OFFSET_MASK; + *entry |= ZPCI_TABLE_VALID; + *entry |= ZPCI_TABLE_LEN_RTX; +} + +static inline void validate_st_entry(unsigned long *entry) +{ + *entry &= ~ZPCI_TABLE_VALID_MASK; + *entry |= ZPCI_TABLE_VALID; +} + +static inline void invalidate_pt_entry(unsigned long *entry) +{ + WARN_ON_ONCE((*entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_INVALID); + *entry &= ~ZPCI_PTE_VALID_MASK; + *entry |= ZPCI_PTE_INVALID; +} + +static inline void validate_pt_entry(unsigned long *entry) +{ + WARN_ON_ONCE((*entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID); + *entry &= ~ZPCI_PTE_VALID_MASK; + *entry |= ZPCI_PTE_VALID; +} + +static inline void entry_set_protected(unsigned long *entry) +{ + *entry &= ~ZPCI_TABLE_PROT_MASK; + *entry |= ZPCI_TABLE_PROTECTED; +} + +static inline void entry_clr_protected(unsigned long *entry) +{ + *entry &= ~ZPCI_TABLE_PROT_MASK; + *entry |= ZPCI_TABLE_UNPROTECTED; +} + +static inline int reg_entry_isvalid(unsigned long entry) +{ + return (entry & ZPCI_TABLE_VALID_MASK) == ZPCI_TABLE_VALID; +} + +static inline int pt_entry_isvalid(unsigned long entry) +{ + return (entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID; +} + +static inline unsigned long *get_rt_sto(unsigned long entry) +{ + if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RTX) + return phys_to_virt(entry & ZPCI_RTE_ADDR_MASK); + else + return NULL; +} + +static inline unsigned long *get_st_pto(unsigned long entry) +{ + if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_SX) + return phys_to_virt(entry & ZPCI_STE_ADDR_MASK); + else + return NULL; +} + +static int __init dma_alloc_cpu_table_caches(void) +{ + dma_region_table_cache = kmem_cache_create("PCI_DMA_region_tables", + ZPCI_TABLE_SIZE, + ZPCI_TABLE_ALIGN, + 0, NULL); + if (!dma_region_table_cache) + return -ENOMEM; + + dma_page_table_cache = kmem_cache_create("PCI_DMA_page_tables", + ZPCI_PT_SIZE, + ZPCI_PT_ALIGN, + 0, NULL); + if (!dma_page_table_cache) { + kmem_cache_destroy(dma_region_table_cache); + return -ENOMEM; + } + return 0; +} + +static unsigned long *dma_alloc_cpu_table(void) +{ + unsigned long *table, *entry; + + table = kmem_cache_alloc(dma_region_table_cache, GFP_ATOMIC); + if (!table) + return NULL; + + for (entry = table; entry < table + ZPCI_TABLE_ENTRIES; entry++) + *entry = ZPCI_TABLE_INVALID; + return table; +} + +static void dma_free_cpu_table(void *table) +{ + kmem_cache_free(dma_region_table_cache, table); +} + +static void dma_free_page_table(void *table) +{ + kmem_cache_free(dma_page_table_cache, table); +} + +static void dma_free_seg_table(unsigned long entry) +{ + unsigned long *sto = get_rt_sto(entry); + int sx; + + for (sx = 0; sx < ZPCI_TABLE_ENTRIES; sx++) + if (reg_entry_isvalid(sto[sx])) + dma_free_page_table(get_st_pto(sto[sx])); + + dma_free_cpu_table(sto); +} + +static void dma_cleanup_tables(unsigned long *table) +{ + int rtx; + + if (!table) + return; + + for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++) + if (reg_entry_isvalid(table[rtx])) + dma_free_seg_table(table[rtx]); + + dma_free_cpu_table(table); +} + +static unsigned long *dma_alloc_page_table(void) +{ + unsigned long *table, *entry; + + table = kmem_cache_alloc(dma_page_table_cache, GFP_ATOMIC); + if (!table) + return NULL; + + for (entry = table; entry < table + ZPCI_PT_ENTRIES; entry++) + *entry = ZPCI_PTE_INVALID; + return table; +} + +static unsigned long *dma_get_seg_table_origin(unsigned long *rtep) +{ + unsigned long old_rte, rte; + unsigned long *sto; + + rte = READ_ONCE(*rtep); + if (reg_entry_isvalid(rte)) { + sto = get_rt_sto(rte); + } else { + sto = dma_alloc_cpu_table(); + if (!sto) + return NULL; + + set_rt_sto(&rte, virt_to_phys(sto)); + validate_rt_entry(&rte); + entry_clr_protected(&rte); + + old_rte = cmpxchg(rtep, ZPCI_TABLE_INVALID, rte); + if (old_rte != ZPCI_TABLE_INVALID) { + /* Somone else was faster, use theirs */ + dma_free_cpu_table(sto); + sto = get_rt_sto(old_rte); + } + } + return sto; +} + +static unsigned long *dma_get_page_table_origin(unsigned long *step) +{ + unsigned long old_ste, ste; + unsigned long *pto; + + ste = READ_ONCE(*step); + if (reg_entry_isvalid(ste)) { + pto = get_st_pto(ste); + } else { + pto = dma_alloc_page_table(); + if (!pto) + return NULL; + set_st_pto(&ste, virt_to_phys(pto)); + validate_st_entry(&ste); + entry_clr_protected(&ste); + + old_ste = cmpxchg(step, ZPCI_TABLE_INVALID, ste); + if (old_ste != ZPCI_TABLE_INVALID) { + /* Somone else was faster, use theirs */ + dma_free_page_table(pto); + pto = get_st_pto(old_ste); + } + } + return pto; +} + +static unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr) +{ + unsigned long *sto, *pto; + unsigned int rtx, sx, px; + + rtx = calc_rtx(dma_addr); + sto = dma_get_seg_table_origin(&rto[rtx]); + if (!sto) + return NULL; + + sx = calc_sx(dma_addr); + pto = dma_get_page_table_origin(&sto[sx]); + if (!pto) + return NULL; + + px = calc_px(dma_addr); + return &pto[px]; +} + +static void dma_update_cpu_trans(unsigned long *ptep, phys_addr_t page_addr, int flags) +{ + unsigned long pte; + + pte = READ_ONCE(*ptep); + if (flags & ZPCI_PTE_INVALID) { + invalidate_pt_entry(&pte); + } else { + set_pt_pfaa(&pte, page_addr); + validate_pt_entry(&pte); + } + + if (flags & ZPCI_TABLE_PROTECTED) + entry_set_protected(&pte); + else + entry_clr_protected(&pte); + + xchg(ptep, pte); +} + static struct s390_domain *to_s390_domain(struct iommu_domain *dom) { return container_of(dom, struct s390_domain, domain); @@ -45,9 +329,14 @@ static struct iommu_domain *s390_domain_alloc(unsigned domain_type) { struct s390_domain *s390_domain; - if (domain_type != IOMMU_DOMAIN_UNMANAGED) + switch (domain_type) { + case IOMMU_DOMAIN_DMA: + case IOMMU_DOMAIN_DMA_FQ: + case IOMMU_DOMAIN_UNMANAGED: + break; + default: return NULL; - + } s390_domain = kzalloc(sizeof(*s390_domain), GFP_KERNEL); if (!s390_domain) return NULL; @@ -86,11 +375,14 @@ static void s390_domain_free(struct iommu_domain *domain) call_rcu(&s390_domain->rcu, s390_iommu_rcu_free_domain); } -static void __s390_iommu_detach_device(struct zpci_dev *zdev) +static void s390_iommu_detach_device(struct iommu_domain *domain, + struct device *dev) { - struct s390_domain *s390_domain = zdev->s390_domain; + struct s390_domain *s390_domain = to_s390_domain(domain); + struct zpci_dev *zdev = to_zpci_dev(dev); unsigned long flags; + WARN_ON(zdev->s390_domain != to_s390_domain(domain)); if (!s390_domain) return; @@ -120,9 +412,7 @@ static int s390_iommu_attach_device(struct iommu_domain *domain, return -EINVAL; if (zdev->s390_domain) - __s390_iommu_detach_device(zdev); - else if (zdev->dma_table) - zpci_dma_exit_device(zdev); + s390_iommu_detach_device(&zdev->s390_domain->domain, dev); cc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, virt_to_phys(s390_domain->dma_table), &status); @@ -144,17 +434,6 @@ static int s390_iommu_attach_device(struct iommu_domain *domain, return 0; } -static void s390_iommu_detach_device(struct iommu_domain *domain, - struct device *dev) -{ - struct zpci_dev *zdev = to_zpci_dev(dev); - - WARN_ON(zdev->s390_domain != to_s390_domain(domain)); - - __s390_iommu_detach_device(zdev); - zpci_dma_init_device(zdev); -} - static void s390_iommu_get_resv_regions(struct device *dev, struct list_head *list) { @@ -207,7 +486,7 @@ static void s390_iommu_release_device(struct device *dev) * to the device, but keep it attached to other devices in the group. */ if (zdev) - __s390_iommu_detach_device(zdev); + s390_iommu_detach_device(&zdev->s390_domain->domain, dev); } static void s390_iommu_flush_iotlb_all(struct iommu_domain *domain) @@ -217,6 +496,7 @@ static void s390_iommu_flush_iotlb_all(struct iommu_domain *domain) rcu_read_lock(); list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { + atomic64_inc(&s390_domain->ctrs.global_rpcits); zpci_refresh_trans((u64)zdev->fh << 32, zdev->start_dma, zdev->end_dma - zdev->start_dma + 1); } @@ -236,6 +516,7 @@ static void s390_iommu_iotlb_sync(struct iommu_domain *domain, rcu_read_lock(); list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { + atomic64_inc(&s390_domain->ctrs.sync_rpcits); zpci_refresh_trans((u64)zdev->fh << 32, gather->start, size); } @@ -252,6 +533,7 @@ static void s390_iommu_iotlb_sync_map(struct iommu_domain *domain, list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { if (!zdev->tlb_refresh) continue; + atomic64_inc(&s390_domain->ctrs.sync_map_rpcits); zpci_refresh_trans((u64)zdev->fh << 32, iova, size); } @@ -332,16 +614,15 @@ static int s390_iommu_map_pages(struct iommu_domain *domain, if (!IS_ALIGNED(iova | paddr, pgsize)) return -EINVAL; - if (!(prot & IOMMU_READ)) - return -EINVAL; - if (!(prot & IOMMU_WRITE)) flags |= ZPCI_TABLE_PROTECTED; rc = s390_iommu_validate_trans(s390_domain, paddr, iova, - pgcount, flags); - if (!rc) + pgcount, flags); + if (!rc) { *mapped = size; + atomic64_add(pgcount, &s390_domain->ctrs.mapped_pages); + } return rc; } @@ -397,12 +678,29 @@ static size_t s390_iommu_unmap_pages(struct iommu_domain *domain, return 0; iommu_iotlb_gather_add_range(gather, iova, size); + atomic64_add(pgcount, &s390_domain->ctrs.unmapped_pages); return size; } +static void s390_iommu_probe_finalize(struct device *dev) +{ + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + + iommu_dma_forcedac = true; + iommu_setup_dma_ops(dev, domain->geometry.aperture_start, domain->geometry.aperture_end); +} + +struct zpci_iommu_ctrs *zpci_get_iommu_ctrs(struct zpci_dev *zdev) +{ + if (!zdev && !zdev->s390_domain) + return NULL; + return &zdev->s390_domain->ctrs; +} + int zpci_init_iommu(struct zpci_dev *zdev) { + u64 aperture_size; int rc = 0; rc = iommu_device_sysfs_add(&zdev->iommu_dev, NULL, NULL, @@ -414,6 +712,12 @@ int zpci_init_iommu(struct zpci_dev *zdev) if (rc) goto out_sysfs; + zdev->start_dma = PAGE_ALIGN(zdev->start_dma); + aperture_size = min3(s390_iommu_aperture, + ZPCI_TABLE_SIZE_RT - zdev->start_dma, + zdev->end_dma - zdev->start_dma + 1); + zdev->end_dma = zdev->start_dma + aperture_size - 1; + return 0; out_sysfs: @@ -429,10 +733,49 @@ void zpci_destroy_iommu(struct zpci_dev *zdev) iommu_device_sysfs_remove(&zdev->iommu_dev); } +static int __init s390_iommu_setup(char *str) +{ + if (!strcmp(str, "strict")) { + pr_warn("s390_iommu=strict deprecated; use iommu.strict=1 instead\n"); + iommu_set_dma_strict(); + } + return 1; +} + +__setup("s390_iommu=", s390_iommu_setup); + +static int __init s390_iommu_aperture_setup(char *str) +{ + if (kstrtou32(str, 10, &s390_iommu_aperture_factor)) + s390_iommu_aperture_factor = 1; + return 1; +} + +__setup("s390_iommu_aperture=", s390_iommu_aperture_setup); + +static int __init s390_iommu_init(void) +{ + int rc; + + s390_iommu_aperture = (u64)virt_to_phys(high_memory); + if (!s390_iommu_aperture_factor) + s390_iommu_aperture = ULONG_MAX; + else + s390_iommu_aperture *= s390_iommu_aperture_factor; + + rc = dma_alloc_cpu_table_caches(); + if (rc) + return rc; + + return rc; +} +subsys_initcall(s390_iommu_init); + static const struct iommu_ops s390_iommu_ops = { .capable = s390_iommu_capable, .domain_alloc = s390_domain_alloc, .probe_device = s390_iommu_probe_device, + .probe_finalize = s390_iommu_probe_finalize, .release_device = s390_iommu_release_device, .device_group = generic_device_group, .pgsize_bitmap = SZ_4K, From patchwork Wed Nov 16 17:16:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21199 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp263618wru; Wed, 16 Nov 2022 09:27:36 -0800 (PST) X-Google-Smtp-Source: AA0mqf4jT0jW0psU5+cRtMqSO3kf3m6g4sm0HX51okr+MZ0cOC1TqTSpH9732Pkl5CzMltnyd/DA X-Received: by 2002:a63:1802:0:b0:476:d22c:bdf7 with SMTP id y2-20020a631802000000b00476d22cbdf7mr6100609pgl.279.1668619656586; Wed, 16 Nov 2022 09:27:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619656; cv=none; d=google.com; s=arc-20160816; b=gnN1FKbYzZPK1aaX5CvxiyGY1OnVCFWNxm7zPlGNOnGxpaN7vJ9LnSQetf3nK0P1uR X4BN8Nnu++3aJG9WCBgPMxbLz00GchWeHkxosUCu0akcZrbYPQKDGnvVSNx68WEugGmt MY0GV6Nma8XvkEnyiccvdB5clnsHt+xZiJUoCHJL1GuyvdHmgTC6d3wOT0G8Wwvdm+3M pcRp4hgtI5c4ZcbT6XFJEAxQtl548H1s71Q+p79GhKq35zFX4+drVxjpAHVb/2ljmTN1 TXeGpooiifaUTgjOD8N2gpgPbURvcX2eMqvwbFYT5F2wWx7jgKwQ5WX3wQ7COrZ8SnnN cMHg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DesZm6hVauGJ5jEKwYVVgM1CsFnIzR6FUV6dd8h3oPU=; b=BFmi6kFgKnW8kGeIckzRdWTteFxFgHK1fy7G3qzhhFZ7bfInRLdJE7iQ+Q78Tw62wB lb/wAehN2Ao1tPzwagCzt5Ccncdemnhvy+zMKqF70Kf1bIqbUnQbr6YwEQqy95Hp5iXd Feeq95WWcos8Ry7RqTgAGsLoG3B2ag8hkTDerd5Z+sLFXE0lJW7SUQwHZFmvI22JU1jT 1WapwQrkDjactMkrW99hCVaAjisHZeDoeWWW3/zOgygCW+IYKisc5LPfOkCbLJnkkOEZ 60OpJj7UtKnBskhKILGs0nuQ8aJv2SpOOcVod0YzlBko52Fx5pefNXubfTkj7BTsO2gS n05A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="Opmw8G/V"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q19-20020a056a00089300b00562331a3562si16917936pfj.130.2022.11.16.09.27.23; Wed, 16 Nov 2022 09:27:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="Opmw8G/V"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233605AbiKPRRU (ORCPT + 99 others); Wed, 16 Nov 2022 12:17:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33768 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230377AbiKPRRT (ORCPT ); Wed, 16 Nov 2022 12:17:19 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE50E4731C; Wed, 16 Nov 2022 09:17:18 -0800 (PST) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGHDJHH005207; Wed, 16 Nov 2022 17:17:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=DesZm6hVauGJ5jEKwYVVgM1CsFnIzR6FUV6dd8h3oPU=; b=Opmw8G/V3dRoyOOSGVdh+czxS74TKiVkIlbiAvymk1v5SGU2j7vxqY9dw7z8eA5cEo/l LFJXRolPGu46ib3tkLFCCAhkjCghXY7YryLjWADW4te3jHsOme0M9Y1hxEV7BGXtIQ2l N91Qs6PqvlYi2bLUud6/3HufTLz9crz+rJ9FMnePH5QIwI/N2VF0qeu5I79h8Fyd9Rnt JJTnGAjotrOcuCPcQ8IJ2n/1zc17QWHimsFEtqjigSVjzcu+Y/MXKOXQVrRfJ2rluM53 5to8DBwGv4zJXk7FoKRcFhG3eW5/DqbC4sk/YfyT+QUJUv9xPd6lTRmTmiYOP/kbw1XJ XQ== Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw3qe8say-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:04 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH5llf024480; Wed, 16 Nov 2022 17:17:02 GMT Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by ppma04ams.nl.ibm.com with ESMTP id 3kt348x9jy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:02 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHGxwi38928914 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:16:59 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 41D3F5204F; Wed, 16 Nov 2022 17:16:59 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id AC39452054; Wed, 16 Nov 2022 17:16:58 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 4/7] iommu: Let iommu.strict override ops->def_domain_type Date: Wed, 16 Nov 2022 18:16:53 +0100 Message-Id: <20221116171656.4128212-5-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Gop84IsW_vCOuB5QmJucy4dagLgXcyIS X-Proofpoint-GUID: Gop84IsW_vCOuB5QmJucy4dagLgXcyIS X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 spamscore=0 malwarescore=0 bulkscore=0 lowpriorityscore=0 adultscore=0 priorityscore=1501 impostorscore=0 phishscore=0 clxscore=1015 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674524708715786?= X-GMAIL-MSGID: =?utf-8?q?1749674524708715786?= When iommu.strict=1 is set or iommu_set_dma_strict() was called we should use IOMMU_DOMAIN_DMA irrespective of ops->def_domain_type. Signed-off-by: Niklas Schnelle --- drivers/iommu/iommu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 65a3b3d886dc..d9bf94d198df 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1562,6 +1562,9 @@ static int iommu_get_def_domain_type(struct device *dev) { const struct iommu_ops *ops = dev_iommu_ops(dev); + if (iommu_dma_strict) + return IOMMU_DOMAIN_DMA; + if (dev_is_pci(dev) && to_pci_dev(dev)->untrusted) return IOMMU_DOMAIN_DMA; From patchwork Wed Nov 16 17:16:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21198 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp263371wru; Wed, 16 Nov 2022 09:26:58 -0800 (PST) X-Google-Smtp-Source: AA0mqf5rHFXP8dQyJ+WnLbQ6b4i9u563YaAmWET1bXFK3Ykms7+EDHpOSdWxNqMU5wACxGyajMGR X-Received: by 2002:a17:906:2b10:b0:7ae:c1af:a034 with SMTP id a16-20020a1709062b1000b007aec1afa034mr18760708ejg.346.1668619618361; Wed, 16 Nov 2022 09:26:58 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619618; cv=none; d=google.com; s=arc-20160816; b=eeSSoLYnoRbobuTRvpRjxsK/aZn4H1FSU9EOte4qdmJPYOH8QffOrP01QV8J35Fzi0 0wDFoJ9uLT7yOlxoQZkCYZzZUTGGYt+JcMFiOW4paHIYQXsag1bJCFgG0kDrio0wgsl8 OmKv35fOg89tDDEdx7DPIsWAXdhylxvQ58KSLLKUhc3zXtilUKh4yWIApZTX3o6x9krw Uke0pzZXUWUEEFvB2059H3pI+3Vjd2dwJvmISHxlfuBfoGhlBPn8UFm1pfasTxgkSWPJ k08WwPiJMKXMGtVWc+2p/hV65nZlt2VjR0CF29F6djElGSO6HG9siUn4XnAVm/2MvMq1 py4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:content-transfer-encoding :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=dD5Ny4XwKb88esjInxMrKHy9NTEe+LmEr0Gd4GzZKQc=; b=m1RqH94dx03S1uKco3EdlbwyETFtf5HYNJ0fHCrjHHiHoOSI0xzCZ0sdbQuzYS4fVz YPwF5j7ba/4V/BBZy6+9bL/PcP3zGOqlFjShTzTKP7JFlI4MwRkdzGGR8T44IFEC3Tgx RMiHP4jfEi0a1EEyyPXhYKssd7SPFEPwjTOMVt6ruCx/0+jKNOeRl7v1s8WqDL+og3Rt +nTy8eTzWOcgteXhvj899nO3Wj6gO6AcvLEfleal7KzfrjcdEF0Fb2+rUuPegx9Gn6zU mB9JEOCIppzw07GbblGnFbRyxXLkRyP8d5E2Pf3ueHbkJh6Qbun/sEH65d3IvmB9pi9/ aeaw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="p/tZ4FpT"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id xj10-20020a170906db0a00b00792e39c31dcsi15384239ejb.988.2022.11.16.09.26.35; Wed, 16 Nov 2022 09:26:58 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="p/tZ4FpT"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234244AbiKPRR0 (ORCPT + 99 others); Wed, 16 Nov 2022 12:17:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234128AbiKPRRW (ORCPT ); Wed, 16 Nov 2022 12:17:22 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 00DB94731C; Wed, 16 Nov 2022 09:17:20 -0800 (PST) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGGn7W1003369; Wed, 16 Nov 2022 17:17:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : mime-version; s=pp1; bh=dD5Ny4XwKb88esjInxMrKHy9NTEe+LmEr0Gd4GzZKQc=; b=p/tZ4FpTjf7lZcmacjh/vb0S2ZIetEriMX6fI4Kmleb3bZK6wPFOvdEAu+iOwldsVzg4 FRFsgUH7ypgIT/8QLTKa8odIj07YhxGXvoRxYGrEgWBgdhr3zZpiUD2Odpq6mRoRqQt3 eW2DojME8YIemhftOC9e+l9RlhQDIOoDjdNBq23phGnPxXeEZNpk/O/vOu7840Z2ZCom l1nJxArSpd542nSGg50KSaC7GEPO3ky1rrFZpnrW7UaScFOaEaN1jAD2ADddY2ZEoi3C jN78dawliH2j3dkg0CjLrmlW7iT1ux9twPT9lKvfjj066B3hzLgHXQzkMEF60+U2Iqvp NA== Received: from ppma05fra.de.ibm.com (6c.4a.5195.ip4.static.sl-reverse.com [149.81.74.108]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw3qe8sb9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:05 +0000 Received: from pps.filterd (ppma05fra.de.ibm.com [127.0.0.1]) by ppma05fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH4kcL006969; Wed, 16 Nov 2022 17:17:03 GMT Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by ppma05fra.de.ibm.com with ESMTP id 3kt3494quk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:02 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHGxnw37290650 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:16:59 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C431D5204E; Wed, 16 Nov 2022 17:16:59 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 44A5A52051; Wed, 16 Nov 2022 17:16:59 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 5/7] iommu/dma: Allow a single FQ in addition to per-CPU FQs Date: Wed, 16 Nov 2022 18:16:54 +0100 Message-Id: <20221116171656.4128212-6-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: rFn_8yTMUSysU67cdqYTmcEyG_kO1JZJ X-Proofpoint-GUID: rFn_8yTMUSysU67cdqYTmcEyG_kO1JZJ X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 spamscore=0 malwarescore=0 bulkscore=0 lowpriorityscore=0 adultscore=0 priorityscore=1501 impostorscore=0 phishscore=0 clxscore=1015 mlxscore=0 mlxlogscore=901 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674485191271186?= X-GMAIL-MSGID: =?utf-8?q?1749674485191271186?= In some virtualized environments, including s390 paged memory guests, IOTLB flushes are used to update IOMMU shadow tables. Due to this, they are much more expensive than in typical bare metal environments or non-paged s390 guests. In addition they may parallelize more poorly in virtualized environments. This changes the trade off for flushing IOVAs such that minimizing the number of IOTLB flushes trumps any benefit of cheaper queuing operations or increased paralellism. In this scenario per-CPU flush queues pose several problems. Firstly per-CPU memory is often quite limited prohibiting larger queues. Secondly collecting IOVAs per-CPU but flushing via a global timeout reduces the number of IOVAs flushed for each timeout especially on s390 where PCI interrupts are not bound to a specific CPU. Thus let's introduce a single flush queue mode IOMMU_DOMAIN_DMA_SQ that reuses the same queue logic but only allocates a single global queue allowing larger batches of IOVAs to be freed at once and with larger timeouts. This is based on an idea by Robin Murphy to allow the common IOVA flushing code to more closely resemble the global flush behavior used on s390's previous internal DMA API implementation. Link: https://lore.kernel.org/linux-iommu/3e402947-61f9-b7e8-1414-fde006257b6f@arm.com/ Signed-off-by: Niklas Schnelle --- drivers/iommu/dma-iommu.c | 146 ++++++++++++++++++++++++++++--------- drivers/iommu/iommu.c | 15 +++- drivers/iommu/s390-iommu.c | 11 +++ include/linux/iommu.h | 7 ++ 4 files changed, 140 insertions(+), 39 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 9297b741f5e8..1cdbf8579946 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -48,8 +48,11 @@ struct iommu_dma_cookie { /* Full allocator for IOMMU_DMA_IOVA_COOKIE */ struct { struct iova_domain iovad; - - struct iova_fq __percpu *fq; /* Flush queue */ + /* Flush queue */ + union { + struct iova_fq *single_fq; + struct iova_fq __percpu *percpu_fq; + }; /* Number of TLB flushes that have been started */ atomic64_t fq_flush_start_cnt; /* Number of TLB flushes that have been finished */ @@ -151,25 +154,44 @@ static void fq_flush_iotlb(struct iommu_dma_cookie *cookie) atomic64_inc(&cookie->fq_flush_finish_cnt); } -static void fq_flush_timeout(struct timer_list *t) +static void fq_flush_percpu(struct iommu_dma_cookie *cookie) { - struct iommu_dma_cookie *cookie = from_timer(cookie, t, fq_timer); int cpu; - atomic_set(&cookie->fq_timer_on, 0); - fq_flush_iotlb(cookie); - for_each_possible_cpu(cpu) { unsigned long flags; struct iova_fq *fq; - fq = per_cpu_ptr(cookie->fq, cpu); + fq = per_cpu_ptr(cookie->percpu_fq, cpu); spin_lock_irqsave(&fq->lock, flags); fq_ring_free(cookie, fq); spin_unlock_irqrestore(&fq->lock, flags); } } +static void fq_flush_single(struct iommu_dma_cookie *cookie) +{ + struct iova_fq *fq = cookie->single_fq; + unsigned long flags; + + spin_lock_irqsave(&fq->lock, flags); + fq_ring_free(cookie, fq); + spin_unlock_irqrestore(&fq->lock, flags); +} + +static void fq_flush_timeout(struct timer_list *t) +{ + struct iommu_dma_cookie *cookie = from_timer(cookie, t, fq_timer); + + atomic_set(&cookie->fq_timer_on, 0); + fq_flush_iotlb(cookie); + + if (cookie->fq_domain->type == IOMMU_DOMAIN_DMA_FQ) + fq_flush_percpu(cookie); + else + fq_flush_single(cookie); +} + static void queue_iova(struct iommu_dma_cookie *cookie, unsigned long pfn, unsigned long pages, struct list_head *freelist) @@ -187,7 +209,11 @@ static void queue_iova(struct iommu_dma_cookie *cookie, */ smp_mb(); - fq = raw_cpu_ptr(cookie->fq); + if (cookie->fq_domain->type == IOMMU_DOMAIN_DMA_FQ) + fq = raw_cpu_ptr(cookie->percpu_fq); + else + fq = cookie->single_fq; + spin_lock_irqsave(&fq->lock, flags); /* @@ -218,31 +244,91 @@ static void queue_iova(struct iommu_dma_cookie *cookie, jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT)); } -static void iommu_dma_free_fq(struct iommu_dma_cookie *cookie) +static void iommu_dma_free_fq_single(struct iova_fq *fq) { - int cpu, idx; + int idx; - if (!cookie->fq) + if (!fq) return; + fq_ring_for_each(idx, fq) + put_pages_list(&fq->entries[idx].freelist); + vfree(fq); +} + +static void iommu_dma_free_fq_percpu(struct iova_fq __percpu *percpu_fq) +{ + int cpu, idx; - del_timer_sync(&cookie->fq_timer); /* The IOVAs will be torn down separately, so just free our queued pages */ for_each_possible_cpu(cpu) { - struct iova_fq *fq = per_cpu_ptr(cookie->fq, cpu); + struct iova_fq *fq = per_cpu_ptr(percpu_fq, cpu); fq_ring_for_each(idx, fq) put_pages_list(&fq->entries[idx].freelist); } - free_percpu(cookie->fq); + free_percpu(percpu_fq); +} + +static void iommu_dma_free_fq(struct iommu_dma_cookie *cookie) +{ + if (!cookie->fq_domain) + return; + + del_timer_sync(&cookie->fq_timer); + if (cookie->fq_domain->type == IOMMU_DOMAIN_DMA_FQ) + iommu_dma_free_fq_percpu(cookie->percpu_fq); + else + iommu_dma_free_fq_single(cookie->single_fq); +} + + +static void iommu_dma_init_one_fq(struct iova_fq *fq) +{ + int i; + + fq->head = 0; + fq->tail = 0; + + spin_lock_init(&fq->lock); + + for (i = 0; i < IOVA_FQ_SIZE; i++) + INIT_LIST_HEAD(&fq->entries[i].freelist); +} + +static int iommu_dma_init_fq_single(struct iommu_dma_cookie *cookie) +{ + struct iova_fq *queue; + + queue = vzalloc(sizeof(*queue)); + if (!queue) + return -ENOMEM; + iommu_dma_init_one_fq(queue); + cookie->single_fq = queue; + + return 0; +} + +static int iommu_dma_init_fq_percpu(struct iommu_dma_cookie *cookie) +{ + struct iova_fq __percpu *queue; + int cpu; + + queue = alloc_percpu(struct iova_fq); + if (!queue) + return -ENOMEM; + + for_each_possible_cpu(cpu) + iommu_dma_init_one_fq(per_cpu_ptr(queue, cpu)); + cookie->percpu_fq = queue; + return 0; } /* sysfs updates are serialised by the mutex of the group owning @domain */ int iommu_dma_init_fq(struct iommu_domain *domain) { struct iommu_dma_cookie *cookie = domain->iova_cookie; - struct iova_fq __percpu *queue; - int i, cpu; + int rc; if (cookie->fq_domain) return 0; @@ -250,26 +336,16 @@ int iommu_dma_init_fq(struct iommu_domain *domain) atomic64_set(&cookie->fq_flush_start_cnt, 0); atomic64_set(&cookie->fq_flush_finish_cnt, 0); - queue = alloc_percpu(struct iova_fq); - if (!queue) { - pr_warn("iova flush queue initialization failed\n"); - return -ENOMEM; - } - - for_each_possible_cpu(cpu) { - struct iova_fq *fq = per_cpu_ptr(queue, cpu); - - fq->head = 0; - fq->tail = 0; - - spin_lock_init(&fq->lock); + if (domain->type == IOMMU_DOMAIN_DMA_FQ) + rc = iommu_dma_init_fq_percpu(cookie); + else + rc = iommu_dma_init_fq_single(cookie); - for (i = 0; i < IOVA_FQ_SIZE; i++) - INIT_LIST_HEAD(&fq->entries[i].freelist); + if (rc) { + pr_warn("iova flush queue initialization failed\n"); + return rc; } - cookie->fq = queue; - timer_setup(&cookie->fq_timer, fq_flush_timeout, 0); atomic_set(&cookie->fq_timer_on, 0); /* @@ -583,7 +659,7 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, goto done_unlock; /* If the FQ fails we can simply fall back to strict mode */ - if (domain->type == IOMMU_DOMAIN_DMA_FQ && iommu_dma_init_fq(domain)) + if (!!(domain->type & __IOMMU_DOMAIN_DMA_FQ) && iommu_dma_init_fq(domain)) domain->type = IOMMU_DOMAIN_DMA; ret = iova_reserve_iommu_regions(dev, domain); diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index d9bf94d198df..8d57c4a4c394 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -140,6 +140,7 @@ static const char *iommu_domain_type_str(unsigned int t) return "Unmanaged"; case IOMMU_DOMAIN_DMA: case IOMMU_DOMAIN_DMA_FQ: + case IOMMU_DOMAIN_DMA_SQ: return "Translated"; default: return "Unknown"; @@ -437,7 +438,7 @@ early_param("iommu.strict", iommu_dma_setup); void iommu_set_dma_strict(void) { iommu_dma_strict = true; - if (iommu_def_domain_type == IOMMU_DOMAIN_DMA_FQ) + if (iommu_def_domain_type & __IOMMU_DOMAIN_DMA_FQ) iommu_def_domain_type = IOMMU_DOMAIN_DMA; } @@ -638,6 +639,9 @@ static ssize_t iommu_group_show_type(struct iommu_group *group, case IOMMU_DOMAIN_DMA_FQ: type = "DMA-FQ\n"; break; + case IOMMU_DOMAIN_DMA_SQ: + type = "DMA-SQ\n"; + break; } } mutex_unlock(&group->mutex); @@ -2912,10 +2916,10 @@ static int iommu_change_dev_def_domain(struct iommu_group *group, } /* We can bring up a flush queue without tearing down the domain */ - if (type == IOMMU_DOMAIN_DMA_FQ && prev_dom->type == IOMMU_DOMAIN_DMA) { + if (!!(type & __IOMMU_DOMAIN_DMA_FQ) && prev_dom->type == IOMMU_DOMAIN_DMA) { ret = iommu_dma_init_fq(prev_dom); if (!ret) - prev_dom->type = IOMMU_DOMAIN_DMA_FQ; + prev_dom->type = type; goto out; } @@ -2986,6 +2990,8 @@ static ssize_t iommu_group_store_type(struct iommu_group *group, req_type = IOMMU_DOMAIN_DMA; else if (sysfs_streq(buf, "DMA-FQ")) req_type = IOMMU_DOMAIN_DMA_FQ; + else if (sysfs_streq(buf, "DMA-SQ")) + req_type = IOMMU_DOMAIN_DMA_SQ; else if (sysfs_streq(buf, "auto")) req_type = 0; else @@ -3037,7 +3043,8 @@ static ssize_t iommu_group_store_type(struct iommu_group *group, /* Check if the device in the group still has a driver bound to it */ device_lock(dev); - if (device_is_bound(dev) && !(req_type == IOMMU_DOMAIN_DMA_FQ && + if (device_is_bound(dev) && !((req_type == IOMMU_DOMAIN_DMA_FQ || + req_type == IOMMU_DOMAIN_DMA_SQ) && group->default_domain->type == IOMMU_DOMAIN_DMA)) { pr_err_ratelimited("Device is still bound to driver\n"); ret = -EBUSY; diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index 9fd788d64ac8..087bb2acff30 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -325,6 +325,15 @@ static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap) } } +static int s390_iommu_def_domain_type(struct device *dev) +{ + struct zpci_dev *zdev = to_zpci_dev(dev); + + if (zdev->tlb_refresh) + return IOMMU_DOMAIN_DMA_SQ; + return IOMMU_DOMAIN_DMA_FQ; +} + static struct iommu_domain *s390_domain_alloc(unsigned domain_type) { struct s390_domain *s390_domain; @@ -332,6 +341,7 @@ static struct iommu_domain *s390_domain_alloc(unsigned domain_type) switch (domain_type) { case IOMMU_DOMAIN_DMA: case IOMMU_DOMAIN_DMA_FQ: + case IOMMU_DOMAIN_DMA_SQ: case IOMMU_DOMAIN_UNMANAGED: break; default: @@ -773,6 +783,7 @@ subsys_initcall(s390_iommu_init); static const struct iommu_ops s390_iommu_ops = { .capable = s390_iommu_capable, + .def_domain_type = s390_iommu_def_domain_type, .domain_alloc = s390_domain_alloc, .probe_device = s390_iommu_probe_device, .probe_finalize = s390_iommu_probe_finalize, diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 3c9da1f8979e..be4f461df5ab 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -63,6 +63,7 @@ struct iommu_domain_geometry { implementation */ #define __IOMMU_DOMAIN_PT (1U << 2) /* Domain is identity mapped */ #define __IOMMU_DOMAIN_DMA_FQ (1U << 3) /* DMA-API uses flush queue */ +#define __IOMMU_DOMAIN_DMA_SQ (1U << 4) /* DMA-API uses single queue */ /* * This are the possible domain-types @@ -77,6 +78,8 @@ struct iommu_domain_geometry { * certain optimizations for these domains * IOMMU_DOMAIN_DMA_FQ - As above, but definitely using batched TLB * invalidation. + * IOMMU_DOMAIN_DMA_SQ - As above, but batched TLB validations use + * single global queue */ #define IOMMU_DOMAIN_BLOCKED (0U) #define IOMMU_DOMAIN_IDENTITY (__IOMMU_DOMAIN_PT) @@ -86,6 +89,10 @@ struct iommu_domain_geometry { #define IOMMU_DOMAIN_DMA_FQ (__IOMMU_DOMAIN_PAGING | \ __IOMMU_DOMAIN_DMA_API | \ __IOMMU_DOMAIN_DMA_FQ) +#define IOMMU_DOMAIN_DMA_SQ (__IOMMU_DOMAIN_PAGING | \ + __IOMMU_DOMAIN_DMA_API | \ + __IOMMU_DOMAIN_DMA_FQ | \ + __IOMMU_DOMAIN_DMA_SQ) struct iommu_domain { unsigned type; From patchwork Wed Nov 16 17:16:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21201 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp264260wru; Wed, 16 Nov 2022 09:29:00 -0800 (PST) X-Google-Smtp-Source: AA0mqf7klHPgUOGTyFsZ/1Wst4/KKkIrUELZvls3StYbdRUxOWmr40eiU9eBxExJch1Uo/falZlU X-Received: by 2002:a17:907:cc9d:b0:7ac:ef6b:1ef4 with SMTP id up29-20020a170907cc9d00b007acef6b1ef4mr19554229ejc.104.1668619740304; Wed, 16 Nov 2022 09:29:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619740; cv=none; d=google.com; s=arc-20160816; b=SoiE2T4S2hchrP4TmGN893XNf7lW4vQDxHfIY+fsDi9gMelhTWlP2H7p6AbpdnDUzM FbvlrQwMHtf/nebLuDE3THRz1jQwM4cejbd1eQvR/v3ojAPa7bc+3IQXKsS4CUAD54a/ LCnUG7bAZd3DDT1wHYefWG1o42GYlsTud1bwO+aYyYQ0wjRjjFiGda6w385V7yok8IWo R0qFBSQQOB/O+qYDbrCSB5ffH2hCOn9Ye7ClevzeNyka31CNauODAW5nLhOV+UDJ0tf0 NRA+twCmTsh2LDX78rL8XGTfp5AQcQfj3LAD1pb3txIOGLFC976UVIuuu6UWNtST31E9 VDfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=6vz9aXSGWcK3/k/xUQb/56m+DKEONDep3n+Q9zAW/PM=; b=tGEt2txPbfsLhc5sRkOdAMpp+dex71P7J85jghvQyglxabsEznpIvBMitBAM/A4Wju /EXjydVT3+nngH9jzHD7gUZ217PlEoSq6oSuis2CAEyywhjkKW0SLcB8zpXxYqdNy6OA wKSjau7GBLT0Brt8slN+aVP59pPDxpel2VSrTOeM100aFgDPn+LGFRuyydp8isQXIlb5 bGFhW+OV//w/elzGTAgEzCk8JMDD26KwI+VlXnPfekK+WK8Fioz+9kyezwaQr7lQU3HZ G6kgS1MC5UMROgm2ux26u8ayJhvdJtZequ7XisQUOCDOT/n0Qc7CDzF2jGAgrcV2tC0z CSqg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=AcP2tB1o; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hs19-20020a1709073e9300b007ae72eabbcbsi16454157ejc.824.2022.11.16.09.28.36; Wed, 16 Nov 2022 09:29:00 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=AcP2tB1o; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237885AbiKPRRr (ORCPT + 99 others); Wed, 16 Nov 2022 12:17:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34182 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234286AbiKPRRj (ORCPT ); Wed, 16 Nov 2022 12:17:39 -0500 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E52535B58A; Wed, 16 Nov 2022 09:17:24 -0800 (PST) Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGHFK0T001521; Wed, 16 Nov 2022 17:17:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=6vz9aXSGWcK3/k/xUQb/56m+DKEONDep3n+Q9zAW/PM=; b=AcP2tB1oo7CjK2wXIpjpre1WVv8ZL5gYCs0wbOLPSo8dz72wv8Y8RlBdwVyvwl4toYpn W90Ysada1HNnoZeEUpiKI2WdxeywWbk237u9L5zYSJP0u1bvKI4ezy2qYcaHFJme4Aqr NqwrirWx15LRA1jZtyIWa21R1NwtnG5yFPezlKkA9qXx9ji6OVt0Z10c0LTyJIjZLjgV TMaOH3tS7aygc8+AXWl45Z2NUF5JP8izIIyTNrYbv5D37xqMlmPI1fbEDvzq41ddZVcz /bRc0XFIhtUzy4UQiDWD/1/gjechsDq8G3flg51nhSMKZAImpSg6e1ByqP5Ctocanolk nA== Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw43j81bs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:05 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH5mkC024486; Wed, 16 Nov 2022 17:17:03 GMT Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by ppma04ams.nl.ibm.com with ESMTP id 3kt348x9k1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:03 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHH0Ow3408402 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:17:00 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 526DB5204E; Wed, 16 Nov 2022 17:17:00 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id C63995204F; Wed, 16 Nov 2022 17:16:59 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 6/7] iommu/dma: Enable variable queue size and use larger single queue Date: Wed, 16 Nov 2022 18:16:55 +0100 Message-Id: <20221116171656.4128212-7-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: flf00TSU3yJi2O2JxEBUTpo5ztZoLGUE X-Proofpoint-ORIG-GUID: flf00TSU3yJi2O2JxEBUTpo5ztZoLGUE X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 adultscore=0 impostorscore=0 bulkscore=0 priorityscore=1501 suspectscore=0 spamscore=0 clxscore=1015 malwarescore=0 lowpriorityscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674613122784596?= X-GMAIL-MSGID: =?utf-8?q?1749674613122784596?= Flush queues currently use a fixed compile time size of 256 entries. This being a power of 2 allows the compiler to use shifts and mask instead of more expensive modulo operations. With per-CPU flush queues larger queue sizes would hit per-CPU allocation limits, with a single flush queue these limits do not apply however. As single flush queue mode is intended for environments with epensive IOTLB flushes it then makes sense to use a larger queue size and timeout. To this end re-order struct iova_fq so we can use a dynamic array and make the flush queue size and timeout variable. So as not to lose the shift and mask optimization, check that the variable length is a power of 2 and use explicit shift and mask instead of letting the compiler optimize this. For now use a large fixed queue size and timeout for single flush queues that brings its performance on s390 paged memory guests on par with the previous s390 specific DMA API implementation. In the future the flush queue size can then be turned into a config option or kernel parameter. Signed-off-by: Niklas Schnelle --- drivers/iommu/dma-iommu.c | 60 ++++++++++++++++++++++++++------------- 1 file changed, 41 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 1cdbf8579946..3801cdf11aa8 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -61,6 +61,8 @@ struct iommu_dma_cookie { struct timer_list fq_timer; /* 1 when timer is active, 0 when not */ atomic_t fq_timer_on; + /* timeout in ms */ + unsigned long fq_timer_timeout; }; /* Trivial linear page allocator for IOMMU_DMA_MSI_COOKIE */ dma_addr_t msi_iova; @@ -86,10 +88,16 @@ static int __init iommu_dma_forcedac_setup(char *str) early_param("iommu.forcedac", iommu_dma_forcedac_setup); /* Number of entries per flush queue */ -#define IOVA_FQ_SIZE 256 +#define IOVA_DEFAULT_FQ_SIZE 256 + +/* Number of entries for a single queue */ +#define IOVA_SINGLE_FQ_SIZE 32768 /* Timeout (in ms) after which entries are flushed from the queue */ -#define IOVA_FQ_TIMEOUT 10 +#define IOVA_DEFAULT_FQ_TIMEOUT 10 + +/* Timeout (in ms) for a single queue */ +#define IOVA_SINGLE_FQ_TIMEOUT 1000 /* Flush queue entry for deferred flushing */ struct iova_fq_entry { @@ -101,18 +109,19 @@ struct iova_fq_entry { /* Per-CPU flush queue structure */ struct iova_fq { - struct iova_fq_entry entries[IOVA_FQ_SIZE]; - unsigned int head, tail; spinlock_t lock; + unsigned int head, tail; + unsigned int mod_mask; + struct iova_fq_entry entries[]; }; #define fq_ring_for_each(i, fq) \ - for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE) + for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) & (fq)->mod_mask) static inline bool fq_full(struct iova_fq *fq) { assert_spin_locked(&fq->lock); - return (((fq->tail + 1) % IOVA_FQ_SIZE) == fq->head); + return (((fq->tail + 1) & fq->mod_mask) == fq->head); } static inline unsigned int fq_ring_add(struct iova_fq *fq) @@ -121,7 +130,7 @@ static inline unsigned int fq_ring_add(struct iova_fq *fq) assert_spin_locked(&fq->lock); - fq->tail = (idx + 1) % IOVA_FQ_SIZE; + fq->tail = (idx + 1) & fq->mod_mask; return idx; } @@ -143,7 +152,7 @@ static void fq_ring_free(struct iommu_dma_cookie *cookie, struct iova_fq *fq) fq->entries[idx].iova_pfn, fq->entries[idx].pages); - fq->head = (fq->head + 1) % IOVA_FQ_SIZE; + fq->head = (fq->head + 1) & fq->mod_mask; } } @@ -241,7 +250,7 @@ static void queue_iova(struct iommu_dma_cookie *cookie, if (!atomic_read(&cookie->fq_timer_on) && !atomic_xchg(&cookie->fq_timer_on, 1)) mod_timer(&cookie->fq_timer, - jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT)); + jiffies + msecs_to_jiffies(cookie->fq_timer_timeout)); } static void iommu_dma_free_fq_single(struct iova_fq *fq) @@ -283,43 +292,45 @@ static void iommu_dma_free_fq(struct iommu_dma_cookie *cookie) } -static void iommu_dma_init_one_fq(struct iova_fq *fq) +static void iommu_dma_init_one_fq(struct iova_fq *fq, unsigned int fq_size) { int i; fq->head = 0; fq->tail = 0; + fq->mod_mask = fq_size - 1; spin_lock_init(&fq->lock); - for (i = 0; i < IOVA_FQ_SIZE; i++) + for (i = 0; i < fq_size; i++) INIT_LIST_HEAD(&fq->entries[i].freelist); } -static int iommu_dma_init_fq_single(struct iommu_dma_cookie *cookie) +static int iommu_dma_init_fq_single(struct iommu_dma_cookie *cookie, unsigned int fq_size) { struct iova_fq *queue; - queue = vzalloc(sizeof(*queue)); + queue = vzalloc(struct_size(queue, entries, fq_size)); if (!queue) return -ENOMEM; - iommu_dma_init_one_fq(queue); + iommu_dma_init_one_fq(queue, fq_size); cookie->single_fq = queue; return 0; } -static int iommu_dma_init_fq_percpu(struct iommu_dma_cookie *cookie) +static int iommu_dma_init_fq_percpu(struct iommu_dma_cookie *cookie, unsigned int fq_size) { struct iova_fq __percpu *queue; int cpu; - queue = alloc_percpu(struct iova_fq); + queue = __alloc_percpu(struct_size(queue, entries, fq_size), + __alignof__(*queue)); if (!queue) return -ENOMEM; for_each_possible_cpu(cpu) - iommu_dma_init_one_fq(per_cpu_ptr(queue, cpu)); + iommu_dma_init_one_fq(per_cpu_ptr(queue, cpu), fq_size); cookie->percpu_fq = queue; return 0; } @@ -328,24 +339,35 @@ static int iommu_dma_init_fq_percpu(struct iommu_dma_cookie *cookie) int iommu_dma_init_fq(struct iommu_domain *domain) { struct iommu_dma_cookie *cookie = domain->iova_cookie; + unsigned int fq_size = IOVA_DEFAULT_FQ_SIZE; int rc; if (cookie->fq_domain) return 0; + if (domain->type == IOMMU_DOMAIN_DMA_SQ) + fq_size = IOVA_SINGLE_FQ_SIZE; + + if (!is_power_of_2(fq_size)) { + pr_err("FQ size must be a power of 2\n"); + return -EINVAL; + } + atomic64_set(&cookie->fq_flush_start_cnt, 0); atomic64_set(&cookie->fq_flush_finish_cnt, 0); if (domain->type == IOMMU_DOMAIN_DMA_FQ) - rc = iommu_dma_init_fq_percpu(cookie); + rc = iommu_dma_init_fq_percpu(cookie, fq_size); else - rc = iommu_dma_init_fq_single(cookie); + rc = iommu_dma_init_fq_single(cookie, fq_size); if (rc) { pr_warn("iova flush queue initialization failed\n"); return rc; } + cookie->fq_timer_timeout = (domain->type == IOMMU_DOMAIN_DMA_SQ) ? + IOVA_SINGLE_FQ_TIMEOUT : IOVA_DEFAULT_FQ_TIMEOUT; timer_setup(&cookie->fq_timer, fq_flush_timeout, 0); atomic_set(&cookie->fq_timer_on, 0); /* From patchwork Wed Nov 16 17:16:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niklas Schnelle X-Patchwork-Id: 21202 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp264340wru; Wed, 16 Nov 2022 09:29:11 -0800 (PST) X-Google-Smtp-Source: AA0mqf4bk5tglzq6ug5tTtHWooVnlZQhmbiD7u7CSDoEIyKEW2+ldj4BukVVxwovrzKCFjT7Gi1u X-Received: by 2002:a05:6402:611:b0:466:7500:b5df with SMTP id n17-20020a056402061100b004667500b5dfmr20129493edv.48.1668619751612; Wed, 16 Nov 2022 09:29:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668619751; cv=none; d=google.com; s=arc-20160816; b=Rg5RJz9fVR89yOjas7rN9+HbQiUKirMBZ0pU0ek6O0CbQw5L4OINe6gFr6xCpOxAuM 7VYAsVqEG6MpyCdHMWeOSTXFzXxfih067lUDKfv9bBC9AIStSM/XXGGay0AIzBxaY2cP diL+Wwi52Swonbgpm74N47KgrbLrZFbJycO5cK7CkYpfpauc2EyAn+2bHc472NoDzAKo mhCyz9vIyNphiWbYc0u0ZX88yTKul08+RJ1Avek9qyL6MivoIeY8Ow4mVHG10ahD9iuE tbmlgy8OtcY54/XCy70wEc6PY8RR5CbZCgO8z2RsSxLktpro7P2ZHO7BeaD4sToqNcgZ K6HQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=tvO5QRhnZQNwwKCc2yWc5VgfQdhcmKr/lkbJbaWChko=; b=CK/Rz3WVJmD2trZ2gg98ERcI3wrYLZvJC64RdQqpc95l2zPGQRG0sMuFwwWmckeVQH ofnl/6sX+lohVbqM3reOgfCi0UxYYQpyy/13YHlyUuhkoaPJoYYrNPRDiywP3+DleKYc EbSJu95tLvkYfiP2riAfqxH63EnqnSbTDdr51j4lETl0OuIK3hn3B4zj66P5NhU7k8Ue 9FQ/MJgqFRbBqGLOq4nWX5O5Uifk07QIFcnlxBIfb3G8hpSoIMctXorMZNOl2zxBk5d/ IYwhSlZ3QCYHY/9suQEiiupoa2VLqLSnfqIiPOu+H2nJ2knlRTHKXY33GVhBXmPHbBE/ aoIw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=G57pzugs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id oz12-20020a1709077d8c00b0079198b89adbsi15323840ejc.890.2022.11.16.09.28.48; Wed, 16 Nov 2022 09:29:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=G57pzugs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238529AbiKPRRv (ORCPT + 99 others); Wed, 16 Nov 2022 12:17:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234304AbiKPRRk (ORCPT ); Wed, 16 Nov 2022 12:17:40 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A1625B591; Wed, 16 Nov 2022 09:17:25 -0800 (PST) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AGGg24V031453; Wed, 16 Nov 2022 17:17:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=tvO5QRhnZQNwwKCc2yWc5VgfQdhcmKr/lkbJbaWChko=; b=G57pzugsozEV35YcVL6LNsPFWKSHx4/WhveKRFpulayy74ZY2OriAJ3LPTbipmuf0uG/ Nz4YpAxRS8X8IhFbO1yK1ZIWusLa3fzdqjyPfbd/RMjyooUKbdknjRPeZS1v1PCTgBq8 q+ouHeR419B2EpmfO9UyRPt76p5lYQir0zBXZhpFvBaNCNgjVvZ6Mwi42G08/9jRpwEk VrV6spQfX97ET6QmaF5NjJQtl6lMTWdP+fM1kdW7bw+HizZQfQBjo7MaRDZOmDIan56M XfbGLSY7BoTpFrxrPLWOMx3ny4b/mW34x5O0/nfbwJjKp4GFQ1PNyntj52z5NiGB5Rtl WQ== Received: from ppma05fra.de.ibm.com (6c.4a.5195.ip4.static.sl-reverse.com [149.81.74.108]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3kw3m50w66-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:05 +0000 Received: from pps.filterd (ppma05fra.de.ibm.com [127.0.0.1]) by ppma05fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AGH4kwp006968; Wed, 16 Nov 2022 17:17:04 GMT Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by ppma05fra.de.ibm.com with ESMTP id 3kt3494qum-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 16 Nov 2022 17:17:03 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2AGHH02G60490164 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 16 Nov 2022 17:17:00 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id D40D552054; Wed, 16 Nov 2022 17:17:00 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 54A0D52050; Wed, 16 Nov 2022 17:17:00 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , Gerd Bayer , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Wenjia Zhang Cc: Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org, Julian Ruess Subject: [PATCH v2 7/7] iommu/s390: flush queued IOVAs on RPCIT out of resource indication Date: Wed, 16 Nov 2022 18:16:56 +0100 Message-Id: <20221116171656.4128212-8-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221116171656.4128212-1-schnelle@linux.ibm.com> References: <20221116171656.4128212-1-schnelle@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 3urLWCGZdldKrrByunqB7bPZQSfeAO4S X-Proofpoint-ORIG-GUID: 3urLWCGZdldKrrByunqB7bPZQSfeAO4S X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-16_03,2022-11-16_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 priorityscore=1501 suspectscore=0 mlxlogscore=999 phishscore=0 malwarescore=0 spamscore=0 adultscore=0 bulkscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211160119 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749674624854384940?= X-GMAIL-MSGID: =?utf-8?q?1749674624854384940?= When RPCIT indicates that the underlying hypervisor has run out of resources it often means that its IOVA space is exhausted and IOVAs need to be freed before new ones can be created. By triggering a flush of the IOVA queue we can get the queued IOVAs freed and also get the new mapping established during the global flush. Signed-off-by: Niklas Schnelle --- drivers/iommu/dma-iommu.c | 14 +++++++++----- drivers/iommu/dma-iommu.h | 1 + drivers/iommu/s390-iommu.c | 7 +++++-- 3 files changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 3801cdf11aa8..54e7f63fd0d9 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -188,19 +188,23 @@ static void fq_flush_single(struct iommu_dma_cookie *cookie) spin_unlock_irqrestore(&fq->lock, flags); } -static void fq_flush_timeout(struct timer_list *t) +void iommu_dma_flush_fq(struct iommu_dma_cookie *cookie) { - struct iommu_dma_cookie *cookie = from_timer(cookie, t, fq_timer); - - atomic_set(&cookie->fq_timer_on, 0); fq_flush_iotlb(cookie); - if (cookie->fq_domain->type == IOMMU_DOMAIN_DMA_FQ) fq_flush_percpu(cookie); else fq_flush_single(cookie); } +static void fq_flush_timeout(struct timer_list *t) +{ + struct iommu_dma_cookie *cookie = from_timer(cookie, t, fq_timer); + + atomic_set(&cookie->fq_timer_on, 0); + iommu_dma_flush_fq(cookie); +} + static void queue_iova(struct iommu_dma_cookie *cookie, unsigned long pfn, unsigned long pages, struct list_head *freelist) diff --git a/drivers/iommu/dma-iommu.h b/drivers/iommu/dma-iommu.h index 942790009292..cac06030aa26 100644 --- a/drivers/iommu/dma-iommu.h +++ b/drivers/iommu/dma-iommu.h @@ -13,6 +13,7 @@ int iommu_get_dma_cookie(struct iommu_domain *domain); void iommu_put_dma_cookie(struct iommu_domain *domain); int iommu_dma_init_fq(struct iommu_domain *domain); +void iommu_dma_flush_fq(struct iommu_dma_cookie *cookie); void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list); diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index 087bb2acff30..9c2782c4043e 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -538,14 +538,17 @@ static void s390_iommu_iotlb_sync_map(struct iommu_domain *domain, { struct s390_domain *s390_domain = to_s390_domain(domain); struct zpci_dev *zdev; + int rc; rcu_read_lock(); list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { if (!zdev->tlb_refresh) continue; atomic64_inc(&s390_domain->ctrs.sync_map_rpcits); - zpci_refresh_trans((u64)zdev->fh << 32, - iova, size); + rc = zpci_refresh_trans((u64)zdev->fh << 32, + iova, size); + if (rc == -ENOMEM) + iommu_dma_flush_fq(domain->iova_cookie); } rcu_read_unlock(); }