Message ID | 20230228023341.973671-1-baolu.lu@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp2774355wrd; Mon, 27 Feb 2023 18:53:57 -0800 (PST) X-Google-Smtp-Source: AK7set+oQTMuUuIYLacHP08tbFnINPd5rUlYLaCHKBUXvojQyplmsg/vquKWqKLM+leJYEnCLwNg X-Received: by 2002:a17:906:80da:b0:8ae:11ca:81de with SMTP id a26-20020a17090680da00b008ae11ca81demr773653ejx.34.1677552837523; Mon, 27 Feb 2023 18:53:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677552837; cv=none; d=google.com; s=arc-20160816; b=bFkON+hfRmoSXEPnKkT0JfbvHhNoPpCz+yEq72ns2zEgN73dUZe9D6K3FnqwX60lHs p0mWXGJMI32KwARh2vhwgEqM6qbQgdCDNB5nhNBcPNgvPJSi0W5FrbNBa9c+zONrixc0 TQPTPEOql0iRoD/Jid3teEXoiER0AQYyZ79ITnGfAoGHF8UbJPyy+laDUuwC5QUqkB07 vQDDuPoHUIY1oXrt6L/P+m1ortSW06pYONZ5tpMXTwAJA3H2FG65cTguDI4yER/Im/dh CBLC644Kg/SHYwXNPp+omrK/fHBGWcJCIJroyLxi48FZ1yubY/rz5285MqeyNmFvIP/Q yNPw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=fNlFi2f+ki3hEnV8bhIK+Xh8Jx9SsqBwwRSMO8vKruU=; b=mQpi50rqv4Xa+Saz7gYv3Ctoxcz2A+1meD94suswoBrPhTY3j2NqUOcCon8cWlU2F4 P/Ey9XA3o/JBSepi069NzlicCYf7Tan+JWhMUVbp/+kcORz0tKU0BclXt0dmcZOaclOh 3UjCryYJ7s2XJHCqPMeVlVhnucTE+yKrovLwcgzw6ptl7PBGCsp1tls5hEK8scuJRes0 EV9dWNJYxQqY6o/JaGIC3NolKi5B+D4TDrQEa5SU8dF6+7uKxwBzVACW/GTZzMMLRW6Q yF5RoU+G9TMn+72+pjy+h4DoR7Bo7nNRamSYqPX3xgs0fe9kD3TBLoDLbB5DZVoIVmf+ Dj5A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=T631mDim; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id sa23-20020a170906edb700b008d0d38769dcsi10418302ejb.324.2023.02.27.18.53.34; Mon, 27 Feb 2023 18:53:57 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=T631mDim; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229525AbjB1Cm1 (ORCPT <rfc822;liqunnana@gmail.com> + 99 others); Mon, 27 Feb 2023 21:42:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229471AbjB1CmZ (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 27 Feb 2023 21:42:25 -0500 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89D7028D3E for <linux-kernel@vger.kernel.org>; Mon, 27 Feb 2023 18:42:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677552141; x=1709088141; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=A0CRrdLyu+AkROlOaV6sxjn4YAoa5LBpAtnl3pAoGX8=; b=T631mDimijDQTmeZ0aRqjVg6B2lLQWh99jZtCugdALht7WpMLRuZ+vwn g3ubAJV2Ys8ZztbzlU3m2Gf/OnEaKOvBA7apMQZYnZG7xOCc1o+6Wpotu F7M22bP2sC2SKZEYFO1Sx+w0a3alpLzO7T1HSQyQF4fCRLaB+xP2YOb+c aQ8oDylzQVKgq40wpxQ9Lias3ENGrhnqZrWJttcFgNIdYhIbwYa/VvKid 2Y92YYzqFzjeaEuP2SvaTJgGc5Cog2mp93QGKhlg7AsqdqlvJLzAWnvJY QdOVCtdyz9msSTrrzpkruo5VOs6Hbrae0dmEpL1HsTH3s04byJFG9qHlh Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="322278097" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="322278097" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Feb 2023 18:42:20 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="673963809" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="673963809" Received: from allen-box.sh.intel.com ([10.239.159.48]) by orsmga002.jf.intel.com with ESMTP; 27 Feb 2023 18:42:18 -0800 From: Lu Baolu <baolu.lu@linux.intel.com> To: iommu@lists.linux.dev Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>, Robin Murphy <robin.murphy@arm.com>, Kevin Tian <kevin.tian@intel.com>, Jason Gunthorpe <jgg@nvidia.com>, linux-kernel@vger.kernel.org, Lu Baolu <baolu.lu@linux.intel.com> Subject: [PATCH 1/1] iommu/vt-d: Add opt-in for ATS support on discrete devices Date: Tue, 28 Feb 2023 10:33:41 +0800 Message-Id: <20230228023341.973671-1-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759041644188360424?= X-GMAIL-MSGID: =?utf-8?q?1759041644188360424?= |
Series |
[1/1] iommu/vt-d: Add opt-in for ATS support on discrete devices
|
|
Commit Message
Baolu Lu
Feb. 28, 2023, 2:33 a.m. UTC
In normal processing of PCIe ATS requests, the IOMMU performs address
translation and returns the device a physical memory address which
will be stored in that device's IOTLB. The device may subsequently
issue Translated DMA request containing physical memory address. The
IOMMU only checks that the device was allowed to issue such requests
and does not attempt to validate the physical address.
The Intel IOMMU implementation only allows PCIe ATS on several SOC-
integrated devices which are opt-in’ed through the ACPI tables to
prevent any compromised device from accessing arbitrary physical
memory.
Add a kernel option intel_iommu=relax_ats to allow users to have an
opt-in to allow turning on ATS at as wish, especially for CSP-owned
vertical devices. In any case, risky devices are not allowed to use
ATS.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
Documentation/admin-guide/kernel-parameters.txt | 6 ++++++
drivers/iommu/intel/iommu.c | 7 +++++++
2 files changed, 13 insertions(+)
Comments
On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: > In normal processing of PCIe ATS requests, the IOMMU performs address > translation and returns the device a physical memory address which > will be stored in that device's IOTLB. The device may subsequently > issue Translated DMA request containing physical memory address. The > IOMMU only checks that the device was allowed to issue such requests > and does not attempt to validate the physical address. > > The Intel IOMMU implementation only allows PCIe ATS on several SOC- > integrated devices which are opt-in’ed through the ACPI tables to > prevent any compromised device from accessing arbitrary physical > memory. > > Add a kernel option intel_iommu=relax_ats to allow users to have an > opt-in to allow turning on ATS at as wish, especially for CSP-owned > vertical devices. In any case, risky devices are not allowed to use > ATS. Why is this an intel specific option? all it does is effectively disable untrusted? Why not a global option? All iommu with ATS will need this? Also, why doesn't a "CSP" set their ACPI to make the devices they want to use ATS with trusted instead of this? Jason
On 2/28/23 8:23 PM, Jason Gunthorpe wrote: > On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: >> In normal processing of PCIe ATS requests, the IOMMU performs address >> translation and returns the device a physical memory address which >> will be stored in that device's IOTLB. The device may subsequently >> issue Translated DMA request containing physical memory address. The >> IOMMU only checks that the device was allowed to issue such requests >> and does not attempt to validate the physical address. >> >> The Intel IOMMU implementation only allows PCIe ATS on several SOC- >> integrated devices which are opt-in’ed through the ACPI tables to >> prevent any compromised device from accessing arbitrary physical >> memory. >> >> Add a kernel option intel_iommu=relax_ats to allow users to have an >> opt-in to allow turning on ATS at as wish, especially for CSP-owned >> vertical devices. In any case, risky devices are not allowed to use >> ATS. > Why is this an intel specific option? I only see similar situation on ARM SMMUv3 platforms. The device ATS is only allowed when the ATS bit is set in RC node of the ACPI/IORT table. > all it does is effectively > disable untrusted? It's irrelevant to untrusted devices. Untrusted devices, with pci_dev->untrusted set, means device connects to the system through some kind of untrusted external port, e.g. thunderbolt ports. For those devices, ATS shouldn't be enabled for those devices. Here we are talking about soc-integrated devices vs. discrete PCI devices (connected to the system through internal PCI slot). The problem is that the DMAR ACPI table only defines ATS attribute for Soc- integrated devices, which causes ATS (and its ancillary features) on the discrete PCI devices not to work. > Why not a global option? All iommu with ATS will > need this? > > Also, why doesn't a "CSP" set their ACPI to make the devices they want > to use ATS with trusted instead of this? For example, users might purchase general servers and use their own or third-party PCIe adapters on them. They have no means to customize the BIOS. Best regards, baolu
On Wed, Mar 01, 2023 at 12:22:23PM +0800, Baolu Lu wrote: > On 2/28/23 8:23 PM, Jason Gunthorpe wrote: > > On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: > > > In normal processing of PCIe ATS requests, the IOMMU performs address > > > translation and returns the device a physical memory address which > > > will be stored in that device's IOTLB. The device may subsequently > > > issue Translated DMA request containing physical memory address. The > > > IOMMU only checks that the device was allowed to issue such requests > > > and does not attempt to validate the physical address. > > > > > > The Intel IOMMU implementation only allows PCIe ATS on several SOC- > > > integrated devices which are opt-in’ed through the ACPI tables to > > > prevent any compromised device from accessing arbitrary physical > > > memory. > > > > > > Add a kernel option intel_iommu=relax_ats to allow users to have an > > > opt-in to allow turning on ATS at as wish, especially for CSP-owned > > > vertical devices. In any case, risky devices are not allowed to use > > > ATS. > > Why is this an intel specific option? > > I only see similar situation on ARM SMMUv3 platforms. The device ATS is > only allowed when the ATS bit is set in RC node of the ACPI/IORT table. It should be common, all iommus using ATS need this logic. > > all it does is effectively > > disable untrusted? > > It's irrelevant to untrusted devices. > > Untrusted devices, with pci_dev->untrusted set, means device connects to > the system through some kind of untrusted external port, e.g. > thunderbolt ports. For those devices, ATS shouldn't be enabled for those > devices. Yes > Here we are talking about soc-integrated devices vs. discrete PCI > devices (connected to the system through internal PCI slot). The problem > is that the DMAR ACPI table only defines ATS attribute for Soc- > integrated devices, which causes ATS (and its ancillary features) on the > discrete PCI devices not to work. So, IMHO, it is a bug to set what Linux calls as untrusted for discrete slots. We also definately don't want internal slots forced to non-identity mode either, for example. This should be addressed at the PCI layer. Untrusted should always mean that the iommu layer should fully secure the device. It means identity translation should be blocked, it means the HW should reject translated requests, and ATS thus is not supported. Every single iommu driver should implement this consistently across the whole subsystem. If you don't want devices to be secured then that is a PCI/bios layer problem to get the correct data into the untrusted flag. Not an iommu problem to ignore the flag. Jason
On 2023-03-01 14:04, Jason Gunthorpe wrote: > On Wed, Mar 01, 2023 at 12:22:23PM +0800, Baolu Lu wrote: >> On 2/28/23 8:23 PM, Jason Gunthorpe wrote: >>> On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: >>>> In normal processing of PCIe ATS requests, the IOMMU performs address >>>> translation and returns the device a physical memory address which >>>> will be stored in that device's IOTLB. The device may subsequently >>>> issue Translated DMA request containing physical memory address. The >>>> IOMMU only checks that the device was allowed to issue such requests >>>> and does not attempt to validate the physical address. >>>> >>>> The Intel IOMMU implementation only allows PCIe ATS on several SOC- >>>> integrated devices which are opt-in’ed through the ACPI tables to >>>> prevent any compromised device from accessing arbitrary physical >>>> memory. >>>> >>>> Add a kernel option intel_iommu=relax_ats to allow users to have an >>>> opt-in to allow turning on ATS at as wish, especially for CSP-owned >>>> vertical devices. In any case, risky devices are not allowed to use >>>> ATS. >>> Why is this an intel specific option? >> >> I only see similar situation on ARM SMMUv3 platforms. The device ATS is >> only allowed when the ATS bit is set in RC node of the ACPI/IORT table. > > It should be common, all iommus using ATS need this logic. The IORT flags are not this kind of policy, they are a necessary description of the hardware. The mix-and-match nature of licensable IP means that just because an SMMU supports the ATS-relevant features defined by the SMMU architecture, that doesn't say that whatever PCIe IP the customer has chosen to pair it with also supports ATS. Even when both ends nominally support it, it's still possible to integrate them together in ways where ATS wouldn't be functional. In general, if a feature is marked as unsupported in IORT, the only way to "relax" that would be if you have a silicon fab handy. If any system vendor *was* to misuse IORT to impose arbitrary and unwelcome usage policy on their customers, then those customers should demand a firmware update (or at least use their own patched IORT, which is pretty trivial with the kernel's existing ACPI table override mechanism). Thanks, Robin. >>> all it does is effectively >>> disable untrusted? >> >> It's irrelevant to untrusted devices. >> >> Untrusted devices, with pci_dev->untrusted set, means device connects to >> the system through some kind of untrusted external port, e.g. >> thunderbolt ports. For those devices, ATS shouldn't be enabled for those >> devices. > > Yes > >> Here we are talking about soc-integrated devices vs. discrete PCI >> devices (connected to the system through internal PCI slot). The problem >> is that the DMAR ACPI table only defines ATS attribute for Soc- >> integrated devices, which causes ATS (and its ancillary features) on the >> discrete PCI devices not to work. > > So, IMHO, it is a bug to set what Linux calls as untrusted for > discrete slots. We also definately don't want internal slots forced to > non-identity mode either, for example. > > This should be addressed at the PCI layer. Untrusted should always > mean that the iommu layer should fully secure the device. It means > identity translation should be blocked, it means the HW should reject > translated requests, and ATS thus is not supported. > > Every single iommu driver should implement this consistently across > the whole subsystem. > > If you don't want devices to be secured then that is a PCI/bios layer > problem to get the correct data into the untrusted flag. > > Not an iommu problem to ignore the flag. > > Jason
On Wed, Mar 01, 2023 at 05:15:33PM +0000, Robin Murphy wrote: > On 2023-03-01 14:04, Jason Gunthorpe wrote: > > On Wed, Mar 01, 2023 at 12:22:23PM +0800, Baolu Lu wrote: > > > On 2/28/23 8:23 PM, Jason Gunthorpe wrote: > > > > On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: > > > > > In normal processing of PCIe ATS requests, the IOMMU performs address > > > > > translation and returns the device a physical memory address which > > > > > will be stored in that device's IOTLB. The device may subsequently > > > > > issue Translated DMA request containing physical memory address. The > > > > > IOMMU only checks that the device was allowed to issue such requests > > > > > and does not attempt to validate the physical address. > > > > > > > > > > The Intel IOMMU implementation only allows PCIe ATS on several SOC- > > > > > integrated devices which are opt-in’ed through the ACPI tables to > > > > > prevent any compromised device from accessing arbitrary physical > > > > > memory. > > > > > > > > > > Add a kernel option intel_iommu=relax_ats to allow users to have an > > > > > opt-in to allow turning on ATS at as wish, especially for CSP-owned > > > > > vertical devices. In any case, risky devices are not allowed to use > > > > > ATS. > > > > Why is this an intel specific option? > > > > > > I only see similar situation on ARM SMMUv3 platforms. The device ATS is > > > only allowed when the ATS bit is set in RC node of the ACPI/IORT table. > > > > It should be common, all iommus using ATS need this logic. > > The IORT flags are not this kind of policy, they are a necessary description > of the hardware. The mix-and-match nature of licensable IP means that just > because an SMMU supports the ATS-relevant features defined by the SMMU > architecture, that doesn't say that whatever PCIe IP the customer has chosen > to pair it with also supports ATS. Even when both ends nominally support it, > it's still possible to integrate them together in ways where ATS wouldn't be > functional. > > In general, if a feature is marked as unsupported in IORT, the only way to > "relax" that would be if you have a silicon fab handy. If any system vendor > *was* to misuse IORT to impose arbitrary and unwelcome usage policy on their > customers, then those customers should demand a firmware update (or at least > use their own patched IORT, which is pretty trivial with the kernel's > existing ACPI table override mechanism). This makes sense. I think Intel has confused their version of the IORT. The ACPI tables read by the iommu driver should be strictly about IOMMU HW capability, like Robin describes for ARM. Security policy flows through the ExternalFacingPort ACPI via pci_acpi_set_external_facing() and triggers pdev->untrusted. When an iommu driver sees pdev->untrusted it is supposed to ensure that translated TLPs are blocked. Since nothing does this explicitly it is presumably happening because ATS being disabled also blocks translated TLPs and we check untrusted as part of pci_enable_ats() If Intel BIOS's have populated the "satcu" to say that ATS is not supported by the HW when the HW supports ATS perfectly fine, then get the BIOS fixed or patch the ACPI until it is fixed. The BIOS should not be saying that the HW does not support ATS when it does, it is a simple BIOS bug. Alternatively if you have some definitive way to know that the HW supports ATS then you should route the satcu information to pdev->untrusted and ignore it at the iommu driver level. Jason
On 2023-03-01 17:42, Jason Gunthorpe wrote: > On Wed, Mar 01, 2023 at 05:15:33PM +0000, Robin Murphy wrote: >> On 2023-03-01 14:04, Jason Gunthorpe wrote: >>> On Wed, Mar 01, 2023 at 12:22:23PM +0800, Baolu Lu wrote: >>>> On 2/28/23 8:23 PM, Jason Gunthorpe wrote: >>>>> On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: >>>>>> In normal processing of PCIe ATS requests, the IOMMU performs address >>>>>> translation and returns the device a physical memory address which >>>>>> will be stored in that device's IOTLB. The device may subsequently >>>>>> issue Translated DMA request containing physical memory address. The >>>>>> IOMMU only checks that the device was allowed to issue such requests >>>>>> and does not attempt to validate the physical address. >>>>>> >>>>>> The Intel IOMMU implementation only allows PCIe ATS on several SOC- >>>>>> integrated devices which are opt-in’ed through the ACPI tables to >>>>>> prevent any compromised device from accessing arbitrary physical >>>>>> memory. >>>>>> >>>>>> Add a kernel option intel_iommu=relax_ats to allow users to have an >>>>>> opt-in to allow turning on ATS at as wish, especially for CSP-owned >>>>>> vertical devices. In any case, risky devices are not allowed to use >>>>>> ATS. >>>>> Why is this an intel specific option? >>>> >>>> I only see similar situation on ARM SMMUv3 platforms. The device ATS is >>>> only allowed when the ATS bit is set in RC node of the ACPI/IORT table. >>> >>> It should be common, all iommus using ATS need this logic. >> >> The IORT flags are not this kind of policy, they are a necessary description >> of the hardware. The mix-and-match nature of licensable IP means that just >> because an SMMU supports the ATS-relevant features defined by the SMMU >> architecture, that doesn't say that whatever PCIe IP the customer has chosen >> to pair it with also supports ATS. Even when both ends nominally support it, >> it's still possible to integrate them together in ways where ATS wouldn't be >> functional. >> >> In general, if a feature is marked as unsupported in IORT, the only way to >> "relax" that would be if you have a silicon fab handy. If any system vendor >> *was* to misuse IORT to impose arbitrary and unwelcome usage policy on their >> customers, then those customers should demand a firmware update (or at least >> use their own patched IORT, which is pretty trivial with the kernel's >> existing ACPI table override mechanism). > > This makes sense. > > I think Intel has confused their version of the IORT. > > The ACPI tables read by the iommu driver should be strictly about > IOMMU HW capability, like Robin describes for ARM. > > Security policy flows through the ExternalFacingPort ACPI via > pci_acpi_set_external_facing() and triggers pdev->untrusted. > > When an iommu driver sees pdev->untrusted it is supposed to ensure > that translated TLPs are blocked. Since nothing does this explicitly > it is presumably happening because ATS being disabled also blocks > translated TLPs and we check untrusted as part of pci_enable_ats() At least for SMMU, we seem to be relying on pci_ats_supported() including pdev->untrusted in its decision - that will propagate back to master->ats_enabled = false inside the driver, which in turn will lead to arm_smmu_write_strtab_ent() leaving STE.EATS at the default setting which aborts all translation requests and translated transactions. > If Intel BIOS's have populated the "satcu" to say that ATS is not > supported by the HW when the HW supports ATS perfectly fine, then get > the BIOS fixed or patch the ACPI until it is fixed. The BIOS should > not be saying that the HW does not support ATS when it does, it is a > simple BIOS bug. > > Alternatively if you have some definitive way to know that the HW > supports ATS then you should route the satcu information to > pdev->untrusted and ignore it at the iommu driver level. From a quick look at the VT-d spec, it sounds like the ATSR structure is intended to be functionally equivalent to IORT's Root Complex "ATS Attribute", while the SATC is a slightly specialised version for RCiEPs. The spec even says "Software must enable ATS on endpoint devices behind a Root Port only if the Root Port is reported as supporting ATS transactions". It also seems to be implied that this should be based on what Intel themselves have validated, so an option for the user to say "sure, ATS works everywhere, I know better" and simply bypass all the existing checks doesn't really seem safe to me :/ I'd be inclined to hold the same opinion as for IORT here - if a user ever really does need to engage expert mode to safely work around a bad BIOS with known-good information, they should already have the tools to override the whole DMAR table as they see fit. Thanks, Robin.
On 3/1/23 10:04 PM, Jason Gunthorpe wrote: >> Here we are talking about soc-integrated devices vs. discrete PCI >> devices (connected to the system through internal PCI slot). The problem >> is that the DMAR ACPI table only defines ATS attribute for Soc- >> integrated devices, which causes ATS (and its ancillary features) on the >> discrete PCI devices not to work. > So, IMHO, it is a bug to set what Linux calls as untrusted for > discrete slots. We also definately don't want internal slots forced to > non-identity mode either, for example. Linux doesn't set untrusted for PCI discrete slots. It reads port's ExternalFacingPort property and set pdev->untrusted for all devices underneath it. For ACPI compatible platforms, this property is only set for thunderbolt ports as far as I have seen. drivers/pci/pci-acpi.c: 1373 static void pci_acpi_set_external_facing(struct pci_dev *dev) 1374 { 1375 u8 val; 1376 1377 if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) 1378 return; 1379 if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val)) 1380 return; 1381 1382 /* 1383 * These root ports expose PCIe (including DMA) outside of the 1384 * system. Everything downstream from them is external. 1385 */ 1386 if (val) 1387 dev->external_facing = 1; 1388 } and drivers/pci/probe.c: 1597 static void set_pcie_untrusted(struct pci_dev *dev) 1598 { 1599 struct pci_dev *parent; 1600 1601 /* 1602 * If the upstream bridge is untrusted we treat this device 1603 * untrusted as well. 1604 */ 1605 parent = pci_upstream_bridge(dev); 1606 if (parent && (parent->untrusted || parent->external_facing)) 1607 dev->untrusted = true; 1608 } > > This should be addressed at the PCI layer. Untrusted should always > mean that the iommu layer should fully secure the device. It means > identity translation should be blocked, it means the HW should reject > translated requests, and ATS thus is not supported. > > Every single iommu driver should implement this consistently across > the whole subsystem. > > If you don't want devices to be secured then that is a PCI/bios layer > problem to get the correct data into the untrusted flag. > > Not an iommu problem to ignore the flag. The problem here is that, even for PCI *trusted* devices, the IOMMU drivers (Intel and SMMUv3) still have a allowed list for ATS. Only devices in this list are allowed to use ATS. This cause some PCI discrete devices not able to use ATS even its pdev->untrust is *not* set. The purpose of this patch is to allow privileged users to choose to skip the allowed list according to their wishes. So that, as long as the PCI layer treats the device as trusted, it can use ATS in the IOMMU layer. At present, this patch is only for VT-d driver, but I have no objection to bring it up to the core as a common mechanism. Best regards, baolu
On 3/2/23 2:19 AM, Robin Murphy wrote: > On 2023-03-01 17:42, Jason Gunthorpe wrote: >> On Wed, Mar 01, 2023 at 05:15:33PM +0000, Robin Murphy wrote: >>> On 2023-03-01 14:04, Jason Gunthorpe wrote: >>>> On Wed, Mar 01, 2023 at 12:22:23PM +0800, Baolu Lu wrote: >>>>> On 2/28/23 8:23 PM, Jason Gunthorpe wrote: >>>>>> On Tue, Feb 28, 2023 at 10:33:41AM +0800, Lu Baolu wrote: >>>>>>> In normal processing of PCIe ATS requests, the IOMMU performs >>>>>>> address >>>>>>> translation and returns the device a physical memory address which >>>>>>> will be stored in that device's IOTLB. The device may subsequently >>>>>>> issue Translated DMA request containing physical memory address. The >>>>>>> IOMMU only checks that the device was allowed to issue such requests >>>>>>> and does not attempt to validate the physical address. >>>>>>> >>>>>>> The Intel IOMMU implementation only allows PCIe ATS on several SOC- >>>>>>> integrated devices which are opt-in’ed through the ACPI tables to >>>>>>> prevent any compromised device from accessing arbitrary physical >>>>>>> memory. >>>>>>> >>>>>>> Add a kernel option intel_iommu=relax_ats to allow users to have an >>>>>>> opt-in to allow turning on ATS at as wish, especially for CSP-owned >>>>>>> vertical devices. In any case, risky devices are not allowed to use >>>>>>> ATS. >>>>>> Why is this an intel specific option? >>>>> >>>>> I only see similar situation on ARM SMMUv3 platforms. The device >>>>> ATS is >>>>> only allowed when the ATS bit is set in RC node of the ACPI/IORT >>>>> table. >>>> >>>> It should be common, all iommus using ATS need this logic. >>> >>> The IORT flags are not this kind of policy, they are a necessary >>> description >>> of the hardware. The mix-and-match nature of licensable IP means that >>> just >>> because an SMMU supports the ATS-relevant features defined by the SMMU >>> architecture, that doesn't say that whatever PCIe IP the customer has >>> chosen >>> to pair it with also supports ATS. Even when both ends nominally >>> support it, >>> it's still possible to integrate them together in ways where ATS >>> wouldn't be >>> functional. >>> >>> In general, if a feature is marked as unsupported in IORT, the only >>> way to >>> "relax" that would be if you have a silicon fab handy. If any system >>> vendor >>> *was* to misuse IORT to impose arbitrary and unwelcome usage policy >>> on their >>> customers, then those customers should demand a firmware update (or >>> at least >>> use their own patched IORT, which is pretty trivial with the kernel's >>> existing ACPI table override mechanism). >> >> This makes sense. >> >> I think Intel has confused their version of the IORT. >> >> The ACPI tables read by the iommu driver should be strictly about >> IOMMU HW capability, like Robin describes for ARM. >> >> Security policy flows through the ExternalFacingPort ACPI via >> pci_acpi_set_external_facing() and triggers pdev->untrusted. >> >> When an iommu driver sees pdev->untrusted it is supposed to ensure >> that translated TLPs are blocked. Since nothing does this explicitly >> it is presumably happening because ATS being disabled also blocks >> translated TLPs and we check untrusted as part of pci_enable_ats() > > At least for SMMU, we seem to be relying on pci_ats_supported() > including pdev->untrusted in its decision - that will propagate back to > master->ats_enabled = false inside the driver, which in turn will lead > to arm_smmu_write_strtab_ent() leaving STE.EATS at the default setting > which aborts all translation requests and translated transactions. Intel VT-d does the same thing. > >> If Intel BIOS's have populated the "satcu" to say that ATS is not >> supported by the HW when the HW supports ATS perfectly fine, then get >> the BIOS fixed or patch the ACPI until it is fixed. The BIOS should >> not be saying that the HW does not support ATS when it does, it is a >> simple BIOS bug. >> >> Alternatively if you have some definitive way to know that the HW >> supports ATS then you should route the satcu information to >> pdev->untrusted and ignore it at the iommu driver level. > > From a quick look at the VT-d spec, it sounds like the ATSR structure > is intended to be functionally equivalent to IORT's Root Complex "ATS > Attribute", while the SATC is a slightly specialised version for RCiEPs. > The spec even says "Software must enable ATS on endpoint devices behind > a Root Port only if the Root Port is reported as supporting ATS > transactions". It also seems to be implied that this should be based on > what Intel themselves have validated, so an option for the user to say > "sure, ATS works everywhere, I know better" and simply bypass all the > existing checks doesn't really seem safe to me :/ > > I'd be inclined to hold the same opinion as for IORT here - if a user > ever really does need to engage expert mode to safely work around a bad > BIOS with known-good information, they should already have the tools to > override the whole DMAR table as they see fit. Make sense to me. BIOS upgrading or ACPI table overriding should help in such cases. I will stop this patch unless there're any other special reasons. > Thanks, > Robin. Best regards, baolu
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Thursday, March 2, 2023 1:43 AM > > If Intel BIOS's have populated the "satcu" to say that ATS is not > supported by the HW when the HW supports ATS perfectly fine, then get > the BIOS fixed or patch the ACPI until it is fixed. The BIOS should > not be saying that the HW does not support ATS when it does, it is a > simple BIOS bug. > That is not the purpose of SATC. The ATS support in VT-d side is reported in two interfaces: 1) "Device-TLB support" in Extended Capability Register; 2) Root port ATS capability in ACPI ATSR structure; A device gets ATS enabled if 1/2 are true and !pdev->untrusted. Same as SMMU does. The main purpose of SATC is to describe which ATS-capable integrated device meets the requirements of securely using ATS as stated in VT-d spec 4.4. Kind of a way of building trust on the device which is fully validated to not allow misusing translated transaction to attack memory outside of its attached domain. System software which cares about it could use the information in a more restrictive ATS model (beyond !pdev->untrusted). But this is moot in current intel-iommu driver which still makes decision based on !pdev->untrusted as other IOMMU drivers do. SATC is used kind of as a quick path to bypass ATSR check given no root port for integrated devices (plus a small trick on HW managed ATS). So I agree this patch is not required. We should not have a relax option to bypass ATSR check anyway. But Baolu, seems there is a small bug on handling satcu->atc_required. This indicates that ATS must be enabled as a functional requirement. Then we should handle the failure of pci_enable_ats() on such device. Thanks Kevin
On 2023/3/3 16:19, Tian, Kevin wrote: > But Baolu, seems there is a small bug on handling satcu->atc_required. > This indicates that ATS must be enabled as a functional requirement. > Then we should handle the failure of pci_enable_ats() on such device. Yes. We have people working on this. For example, when pci=noats is opt- in, pci_enable_ats() definitely will return failure. Best regards, baolu
On Fri, Mar 03, 2023 at 08:19:29AM +0000, Tian, Kevin wrote: > > From: Jason Gunthorpe <jgg@nvidia.com> > > Sent: Thursday, March 2, 2023 1:43 AM > > > > If Intel BIOS's have populated the "satcu" to say that ATS is not > > supported by the HW when the HW supports ATS perfectly fine, then get > > the BIOS fixed or patch the ACPI until it is fixed. The BIOS should > > not be saying that the HW does not support ATS when it does, it is a > > simple BIOS bug. > > > > That is not the purpose of SATC. > > The ATS support in VT-d side is reported in two interfaces: > > 1) "Device-TLB support" in Extended Capability Register; > 2) Root port ATS capability in ACPI ATSR structure; > > A device gets ATS enabled if 1/2 are true and !pdev->untrusted. Same > as SMMU does. > > The main purpose of SATC is to describe which ATS-capable integrated > device meets the requirements of securely using ATS as stated in VT-d > spec 4.4. Then it should be mapped to pdev->untrusted and possibly pdev->untrusted to be enhanced to be more descriptive. iommu driver and BIOS should have no role in security policy beyond feeding in data to a common policy engine. Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Friday, March 3, 2023 9:18 PM > > On Fri, Mar 03, 2023 at 08:19:29AM +0000, Tian, Kevin wrote: > > > From: Jason Gunthorpe <jgg@nvidia.com> > > > Sent: Thursday, March 2, 2023 1:43 AM > > > > > > If Intel BIOS's have populated the "satcu" to say that ATS is not > > > supported by the HW when the HW supports ATS perfectly fine, then get > > > the BIOS fixed or patch the ACPI until it is fixed. The BIOS should > > > not be saying that the HW does not support ATS when it does, it is a > > > simple BIOS bug. > > > > > > > That is not the purpose of SATC. > > > > The ATS support in VT-d side is reported in two interfaces: > > > > 1) "Device-TLB support" in Extended Capability Register; > > 2) Root port ATS capability in ACPI ATSR structure; > > > > A device gets ATS enabled if 1/2 are true and !pdev->untrusted. Same > > as SMMU does. > > > > The main purpose of SATC is to describe which ATS-capable integrated > > device meets the requirements of securely using ATS as stated in VT-d > > spec 4.4. > > Then it should be mapped to pdev->untrusted and possibly > pdev->untrusted to be enhanced to be more descriptive. > > iommu driver and BIOS should have no role in security policy beyond > feeding in data to a common policy engine. > That makes sense.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 6221a1d057dd..490fae585f73 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2075,6 +2075,12 @@ Note that using this option lowers the security provided by tboot because it makes the system vulnerable to DMA attacks. + relax_ats + By default, the Intel IOMMU implementation only allows + ATS to be enabled on certain devices. The platform + advertises its allowed devices in ACPI tables like SATC + and ATSR. With this option, this ATS requirement is + relaxed so that discrete PCI devices can also use ATS. intel_idle.max_cstate= [KNL,HW,ACPI,X86] 0 disables intel_idle and fall back on acpi_idle. diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 7c2f4bd33582..4f6c6d8716bd 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -290,6 +290,7 @@ static int dmar_map_gfx = 1; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; static int iommu_skip_te_disable; +static int iommu_relaxed_ats; #define IDENTMAP_GFX 2 #define IDENTMAP_AZALIA 4 @@ -349,6 +350,9 @@ static int __init intel_iommu_setup(char *str) } else if (!strncmp(str, "tboot_noforce", 13)) { pr_info("Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n"); intel_iommu_tboot_noforce = 1; + } else if (!strncmp(str, "relax_ats", 9)) { + pr_info("ATS reqirement is relaxed\n"); + iommu_relaxed_ats = 1; } else { pr_notice("Unknown option - '%s'\n", str); } @@ -3557,6 +3561,9 @@ static int dmar_ats_supported(struct pci_dev *dev, struct intel_iommu *iommu) struct dmar_atsr_unit *atsru; struct dmar_satc_unit *satcu; + if (iommu_relaxed_ats && !dev->untrusted) + return 1; + dev = pci_physfn(dev); satcu = dmar_find_matched_satc_unit(dev); if (satcu)