From patchwork Mon Jul 10 01:32:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qiuxu Zhuo
X-Patchwork-Id: 117542
From: Qiuxu Zhuo
To: Tony Luck
Cc: Qiuxu Zhuo, Borislav Petkov, Aristeu Rozanski,
 Mauro Carvalho Chehab, Kai-Heng Feng, Koba Ko,
 linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/1] EDAC/i10nm: Skip the absent memory controllers
Date: Mon, 10 Jul 2023 09:32:32 +0800
Message-Id: <20230710013232.59712-1-qiuxu.zhuo@intel.com>
X-Mailer: git-send-email 2.17.1
X-Mailing-List: linux-kernel@vger.kernel.org

Some Sapphire Rapids workstations' absent memory controllers
still appear as PCIe devices that fool the i10nm_edac driver
and result in "shift exponent -66 is negative" call traces
from skx_get_dimm_info().

Skip the absent memory controllers to avoid the call traces.

Reported-by: Kai-Heng Feng
Closes: https://lore.kernel.org/linux-edac/CAAd53p41Ku1m1rapeqb1xtD+kKuk+BaUW=dumuoF0ZO3GhFjFA@mail.gmail.com/T/#m5de16dce60a8c836ec235868c7c16e3fefad0cc2
Tested-by: Kai-Heng Feng
Reported-by: Koba Ko
Closes: https://lore.kernel.org/linux-edac/SA1PR11MB71305B71CCCC3D9305835202892AA@SA1PR11MB7130.namprd11.prod.outlook.com/T/#t
Tested-by: Koba Ko
Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Signed-off-by: Qiuxu Zhuo
---
v1->v2:
 - No functional changes.
 - s/exponet/exponent/ in the commit message.
 - Add two "Tested-by" tags.

 drivers/edac/i10nm_base.c | 54 +++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index a897b6aff368..349ff6cfb379 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -658,13 +658,49 @@ static struct pci_dev *get_ddr_munit(struct skx_dev *d, int i, u32 *offset, unsi
 	return mdev;
 }
 
+/**
+ * i10nm_imc_absent() - Check whether the memory controller @imc is absent
+ *
+ * @imc    : The pointer to the structure of memory controller EDAC device.
+ *
+ * RETURNS : true if the memory controller EDAC device is absent, false otherwise.
+ */
+static bool i10nm_imc_absent(struct skx_imc *imc)
+{
+	u32 mcmtr;
+	int i;
+
+	switch (res_cfg->type) {
+	case SPR:
+		for (i = 0; i < res_cfg->ddr_chan_num; i++) {
+			mcmtr = I10NM_GET_MCMTR(imc, i);
+			edac_dbg(1, "ch%d mcmtr reg %x\n", i, mcmtr);
+			if (mcmtr != ~0)
+				return false;
+		}
+
+		/*
+		 * Some workstations' absent memory controllers still
+		 * appear as PCIe devices, misleading the EDAC driver.
+		 * The MMIO registers of these absent memory controllers
+		 * have been observed to consistently hold the value ~0.
+		 *
+		 * We identify a memory controller as absent by checking
+		 * if its MMIO register "mcmtr" == ~0 in all its channels.
+		 */
+		return true;
+	default:
+		return false;
+	}
+}
+
 static int i10nm_get_ddr_munits(void)
 {
 	struct pci_dev *mdev;
 	void __iomem *mbase;
 	unsigned long size;
 	struct skx_dev *d;
-	int i, j = 0;
+	int i, lmc, j = 0;
 	u32 reg, off;
 	u64 base;
 
@@ -690,7 +726,7 @@ static int i10nm_get_ddr_munits(void)
 		edac_dbg(2, "socket%d mmio base 0x%llx (reg 0x%x)\n",
 			 j++, base, reg);
 
-		for (i = 0; i < res_cfg->ddr_imc_num; i++) {
+		for (lmc = 0, i = 0; i < res_cfg->ddr_imc_num; i++) {
 			mdev = get_ddr_munit(d, i, &off, &size);
 
 			if (i == 0 && !mdev) {
@@ -700,8 +736,6 @@ static int i10nm_get_ddr_munits(void)
 			if (!mdev)
 				continue;
 
-			d->imc[i].mdev = mdev;
-
 			edac_dbg(2, "mc%d mmio base 0x%llx size 0x%lx (reg 0x%x)\n",
 				 i, base + off, size, reg);
 
@@ -712,7 +746,17 @@ static int i10nm_get_ddr_munits(void)
 				return -ENODEV;
 			}
 
-			d->imc[i].mbase = mbase;
+			d->imc[lmc].mbase = mbase;
+			if (i10nm_imc_absent(&d->imc[lmc])) {
+				pci_dev_put(mdev);
+				iounmap(mbase);
+				d->imc[lmc].mbase = NULL;
+				edac_dbg(2, "Skip absent mc%d\n", i);
+				continue;
+			} else {
+				d->imc[lmc].mdev = mdev;
+				lmc++;
+			}
 		}
 	}
 
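
Illustration only, not part of the patch: the absence check and the
physical-to-logical index compaction above can be modelled in user space
roughly as follows. This is a minimal sketch under stated assumptions.
struct fake_imc, NUM_CHAN, imc_absent() and compact_present() are
hypothetical names invented for this example; the real driver reads the
register through I10NM_GET_MCMTR() on mapped MMIO and stores the result
in d->imc[lmc].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CHAN 2	/* hypothetical number of channels per controller */

/* Hypothetical stand-in for the per-channel MCMTR values of one controller. */
struct fake_imc {
	uint32_t mcmtr[NUM_CHAN];
};

/* A controller counts as absent only if every channel reads all ones. */
static bool imc_absent(const struct fake_imc *imc)
{
	for (int i = 0; i < NUM_CHAN; i++) {
		if (imc->mcmtr[i] != UINT32_MAX)
			return false;
	}
	return true;
}

/*
 * Record the physical slot numbers of the present controllers at the
 * front of out[] and return how many were kept. "i" walks the physical
 * slots while "lmc" only advances for controllers that are kept, the
 * same split the patch introduces in i10nm_get_ddr_munits().
 */
static int compact_present(const struct fake_imc *in, int n, int *out)
{
	int lmc = 0;

	assert(in != NULL && out != NULL);

	for (int i = 0; i < n; i++) {
		if (imc_absent(&in[i]))
			continue;	/* skip, do not consume a logical index */
		out[lmc++] = i;
	}
	return lmc;
}
```

With four slots where slots 1 and 3 read ~0 on every channel,
compact_present() keeps slots 0 and 2 as logical controllers 0 and 1.
A slot with at least one channel that does not read ~0 is treated as
present, matching the early "return false" in i10nm_imc_absent().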