From patchwork Fri Mar 10 09:23:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrzej Hajda
X-Patchwork-Id: 67307
From: Andrzej Hajda
Date: Fri, 10 Mar 2023 10:23:50 +0100
Subject: [PATCH v6 2/2] drm/i915: add guard page to ggtt->error_capture
Message-Id: <20230308-guard_error_capture-v6-2-1b5f31422563@intel.com>
References: <20230308-guard_error_capture-v6-0-1b5f31422563@intel.com>
In-Reply-To: <20230308-guard_error_capture-v6-0-1b5f31422563@intel.com>
To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, Andi Shyti, Chris Wilson, Nirmoy Das,
 Andrzej Hajda
X-Mailer: b4 0.11.1
X-Mailing-List: linux-kernel@vger.kernel.org

Write-combining memory allows speculative reads by the CPU.
ggtt->error_capture is WC mapped to the CPU, so the CPU/MMU can try to
prefetch memory beyond error_capture, i.e. it tries to read the memory
pointed to by the next PTE in the GGTT. If this PTE points to an invalid
address, DMAR errors will occur. This behaviour was observed on ADL and
RPL platforms.
To avoid this, a guard scratch page should be added after error_capture.
The patch fixes the most annoying issue with error capture, but since WC
reads are also used in other places, there is a risk that a similar
problem can affect them as well.

v2:
  - modified commit message (I hope the diagnosis is correct),
  - added bug checks to ensure scratch is initialized on gen3 platforms.
    CI produces strange stacktrace for it suggesting scratch[0] is NULL,
    to be removed after resolving the issue with gen3 platforms.
v3:
  - removed bug checks, replaced with gen check.
v4:
  - change code for scratch page insertion to support all platforms,
  - add info in commit message there could be more similar issues
v5:
  - check for nop_clear_range instead of gen8 (Tvrtko),
  - re-insert scratch pages on resume (Tvrtko)
v6:
  - use scratch_range callback to set scratch pages (Chris)

Signed-off-by: Andrzej Hajda
Reviewed-by: Andi Shyti
Acked-by: Nirmoy Das
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 38e6f0b207fe0c..5ef7e03b11c8e6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -572,8 +572,12 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 	 * paths, and we trust that 0 will remain reserved. However,
 	 * the only likely reason for failure to insert is a driver
 	 * bug, which we expect to cause other failures...
+	 *
+	 * Since CPU can perform speculative reads on error capture
+	 * (write-combining allows it) add scratch page after error
+	 * capture to avoid DMAR errors.
 	 */
-	ggtt->error_capture.size = I915_GTT_PAGE_SIZE;
+	ggtt->error_capture.size = 2 * I915_GTT_PAGE_SIZE;
 	ggtt->error_capture.color = I915_COLOR_UNEVICTABLE;
 	if (drm_mm_reserve_node(&ggtt->vm.mm, &ggtt->error_capture))
 		drm_mm_insert_node_in_range(&ggtt->vm.mm,
@@ -583,11 +587,15 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 					    0, ggtt->mappable_end,
 					    DRM_MM_INSERT_LOW);
 	}
-	if (drm_mm_node_allocated(&ggtt->error_capture))
+	if (drm_mm_node_allocated(&ggtt->error_capture)) {
+		u64 start = ggtt->error_capture.start;
+		u64 size = ggtt->error_capture.size;
+
+		ggtt->vm.scratch_range(&ggtt->vm, start, size);
 		drm_dbg(&ggtt->vm.i915->drm,
 			"Reserved GGTT:[%llx, %llx] for use by error capture\n",
-			ggtt->error_capture.start,
-			ggtt->error_capture.start + ggtt->error_capture.size);
+			start, start + size);
+	}
 
 	/*
 	 * The upper portion of the GuC address space has a sizeable hole
@@ -1280,6 +1288,10 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)
 
 	flush = i915_ggtt_resume_vm(&ggtt->vm);
 
+	if (drm_mm_node_allocated(&ggtt->error_capture))
+		ggtt->vm.scratch_range(&ggtt->vm, ggtt->error_capture.start,
+				       ggtt->error_capture.size);
+
 	ggtt->invalidate(ggtt);
 
 	if (flush)
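
The guard-page idea from the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the toy_* names, the eight-entry page table, and the constants are hypothetical and are not i915 structures or APIs. It assumes that a PTE value of 0 stands for an invalid address, and that a speculative WC read may touch the page immediately after the one bound for capture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE   4096u
#define TOY_NUM_PTES    8u
#define TOY_PTE_INVALID 0u      /* assumption: 0 models a PTE with no valid target */
#define TOY_SCRATCH_DMA 0x1000u /* assumption: models the scratch page address */

/* Hypothetical stand-in for a GGTT page table; not the i915 layout. */
struct toy_ggtt {
	uint64_t pte[TOY_NUM_PTES];
};

/* Point every PTE covering [start, start + size) at the scratch page. */
static void toy_scratch_range(struct toy_ggtt *ggtt, uint64_t start, uint64_t size)
{
	assert((start + size) / TOY_PAGE_SIZE <= TOY_NUM_PTES);
	for (uint64_t i = start / TOY_PAGE_SIZE; i < (start + size) / TOY_PAGE_SIZE; i++)
		ggtt->pte[i] = TOY_SCRATCH_DMA;
}

/* Bind one page of the object being captured at the start of the window. */
static void toy_capture_bind(struct toy_ggtt *ggtt, uint64_t start, uint64_t dma)
{
	ggtt->pte[start / TOY_PAGE_SIZE] = dma;
}

/* True if a read of the page following 'start' would hit a valid PTE. */
static bool toy_next_pte_valid(const struct toy_ggtt *ggtt, uint64_t start)
{
	size_t next = (size_t)(start / TOY_PAGE_SIZE) + 1;

	return next < TOY_NUM_PTES && ggtt->pte[next] != TOY_PTE_INVALID;
}

/*
 * Reserve a capture window of 'window_pages' pages, fill it with scratch,
 * bind one page for capture, and report whether a prefetch into the
 * following page would land on a valid PTE.
 */
bool toy_demo(uint64_t window_pages)
{
	struct toy_ggtt ggtt = { { 0 } };
	uint64_t start = 2 * TOY_PAGE_SIZE;

	toy_scratch_range(&ggtt, start, window_pages * TOY_PAGE_SIZE);
	toy_capture_bind(&ggtt, start, 0xabc000u);
	return toy_next_pte_valid(&ggtt, start);
}
```

In this model, toy_demo(1) corresponds to the old single-page window, where the next PTE is left invalid, and toy_demo(2) corresponds to the two-page window from the patch, where the second page keeps pointing at scratch after the capture page is bound.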