Message ID | 20231128142049.GTZWX3QQTSaQk/+u53@fat_crate.local |
---|---|
State | New |
Headers | Date: Tue, 28 Nov 2023 15:20:49 +0100 From: Borislav Petkov <bp@alien8.de> To: Tony Luck <tony.luck@intel.com>, Yazen Ghannam <yazen.ghannam@amd.com> Cc: Muralidhara M K <muralimk@amd.com>, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org, Muralidhara M K <muralidhara.mk@amd.com>, linux-doc@vger.kernel.org Subject: [PATCH] Documentation: Begin a RAS section Message-ID: <20231128142049.GTZWX3QQTSaQk/+u53@fat_crate.local> References: <20231102114225.2006878-1-muralimk@amd.com> <20231102114225.2006878-2-muralimk@amd.com> In-Reply-To: <20231102114225.2006878-2-muralimk@amd.com> |
Series | Documentation: Begin a RAS section |
Commit Message
Borislav Petkov
Nov. 28, 2023, 2:20 p.m. UTC
On Thu, Nov 02, 2023 at 11:42:22AM +0000, Muralidhara M K wrote:
> From: Muralidhara M K <muralidhara.mk@amd.com>
>
> AMD systems with Scalable MCA, each machine check error of a SMCA bank
> type has an associated bit position in the bank's control (CTL) register.

On top of this. It is long overdue:

---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Tue, 28 Nov 2023 14:37:56 +0100

Add some initial RAS documentation. The expectation is for this to
collect all the user-visible features for interacting with the RAS
features of the kernel.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
 Documentation/RAS/ras.rst | 26 ++++++++++++++++++++++++++
 Documentation/index.rst   |  1 +
 2 files changed, 27 insertions(+)
 create mode 100644 Documentation/RAS/ras.rst
Comments
Borislav Petkov <bp@alien8.de> writes:

> On Thu, Nov 02, 2023 at 11:42:22AM +0000, Muralidhara M K wrote:
>> From: Muralidhara M K <muralidhara.mk@amd.com>
>>
>> AMD systems with Scalable MCA, each machine check error of a SMCA bank
>> type has an associated bit position in the bank's control (CTL) register.
>
> On top of this. It is long overdue:
>
> ---
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> Date: Tue, 28 Nov 2023 14:37:56 +0100
>
> Add some initial RAS documentation. The expectation is for this to
> collect all the user-visible features for interacting with the RAS
> features of the kernel.
>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> ---
>  Documentation/RAS/ras.rst | 26 ++++++++++++++++++++++++++
>  Documentation/index.rst   |  1 +
>  2 files changed, 27 insertions(+)
>  create mode 100644 Documentation/RAS/ras.rst

I wish I'd been copied on this ... I've been working to get a handle on
the top-level Documentation/ directories for a while, and would rather
not see a new one added for this. Offhand, based on this first
document, it looks like material that belongs under
Documentation/admin-guide; can we move it there, please?

Thanks,

jon
On Tue, Jan 09, 2024 at 10:47:29AM -0700, Jonathan Corbet wrote:
> I wish I'd been copied on this ...

linux-doc was CCed:

https://lore.kernel.org/all/20231128142049.GTZWX3QQTSaQk%2F+u53@fat_crate.local/

Or would you have preferred to be CCed directly?

> I've been working to get a handle on
> the top-level Documentation/ directories for a while, and would rather
> not see a new one added for this. Offhand, based on this first
> document, it looks like material that belongs under
> Documentation/admin-guide; can we move it there, please?

Not really an admin guide thing - yes, based on the current content but
actually, the aim for this is to document all things RAS, so it is more
likely a subsystem thing. And all the subsystems are directories under
Documentation/.

So where do you want me to put it?

Thx.
Borislav Petkov <bp@alien8.de> writes:

> On Tue, Jan 09, 2024 at 10:47:29AM -0700, Jonathan Corbet wrote:
>> I wish I'd been copied on this ...
>
> linux-doc was CCed:
>
> https://lore.kernel.org/all/20231128142049.GTZWX3QQTSaQk%2F+u53@fat_crate.local/
>
> Or would you have preferred to be CCed directly?

Lots of stuff goes to linux-doc, I can miss things. Of course, I miss
things in my own email too...you know the drill...

>> I've been working to get a handle on
>> the top-level Documentation/ directories for a while, and would rather
>> not see a new one added for this. Offhand, based on this first
>> document, it looks like material that belongs under
>> Documentation/admin-guide; can we move it there, please?
>
> Not really an admin guide thing - yes, based on the current content but
> actually, the aim for this is to document all things RAS, so it is more
> likely a subsystem thing. And all the subsystems are directories under
> Documentation/.
>
> So where do you want me to put it?

The hope with all of this documentation thrashing has been to organize
our docs with the *reader* in mind. "All things RAS" is convenient for
RAS developers, but not for (say) a sysadmin trying to figure out how to
make use of it. So I would really rather see RAS documentation placed
under admin-guide or userspace-api as appropriate.

Yes, there is a lot of existing documentation that still doesn't live up
to this idea, but we can try to follow it for new stuff while the rest
is (slowly) fixed up.

Make sense?

Thanks,

jon
On Tue, Jan 09, 2024 at 12:44:41PM -0700, Jonathan Corbet wrote:
> Of course, I miss things in my own email too...you know the drill...

Yeah, tell me about it.

My train of thought with CCing maintainers in such cases usually is: I'd
CC the mailing list as I don't want to bother the maintainer - she/he
gets too much email anyway and this is an FYI thing anyway so she/he'll
find it in the archives eventually.

> Yes, there is a lot of existing documentation that still doesn't live up
> to this idea, but we can try to follow it for new stuff while the rest
> is (slowly) fixed up.

The problem I see here is that not all of the RAS stuff will be
"admin-guide" stuff; some of it will describe design decisions we've
made. I mean, for a really curious admin it'll be right up her/his
alley, but it won't purely be descriptions of administrative tasks.

At the end of the day, I don't really care where it is as long as it is
in one place and we can point people to it and say, here, that's why we
did it the way we did it and what you can do about it.

So I'm fine with admin-guide too - just pointing out a potential issue
I see.

Thx.
On Tue, Jan 09, 2024 at 09:04:34PM +0100, Borislav Petkov wrote:
> So I'm fine with admin-guide too - just pointing out a potential issue
> I see.

Ok, how does this look?

I've merged it with ras.rst which we had there already and with some
more new documentation that is coming from:

https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-amd-atl

Thx.

---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Wed, 24 Jan 2024 13:37:52 +0100
Subject: [PATCH] Documentation: Move RAS section to admin-guide

This is where this stuff should be.

Requested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
 Documentation/RAS/index.rst                       | 14 --------------
 .../{ => admin-guide}/RAS/address-translation.rst |  0
 .../{ => admin-guide}/RAS/error-decoding.rst      |  0
 Documentation/admin-guide/RAS/index.rst           |  7 +++++++
 .../admin-guide/{ras.rst => RAS/main.rst}         | 10 +++++++---
 Documentation/admin-guide/index.rst               |  2 +-
 Documentation/index.rst                           |  1 -
 7 files changed, 15 insertions(+), 19 deletions(-)
 delete mode 100644 Documentation/RAS/index.rst
 rename Documentation/{ => admin-guide}/RAS/address-translation.rst (100%)
 rename Documentation/{ => admin-guide}/RAS/error-decoding.rst (100%)
 create mode 100644 Documentation/admin-guide/RAS/index.rst
 rename Documentation/admin-guide/{ras.rst => RAS/main.rst} (99%)

diff --git a/Documentation/RAS/index.rst b/Documentation/RAS/index.rst
deleted file mode 100644
index 2794c1816e90..000000000000
--- a/Documentation/RAS/index.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===========================================================
-Reliability, Availability and Serviceability (RAS) features
-===========================================================
-
-This documents different aspects of the RAS functionality present in the
-kernel.
-
-.. toctree::
-   :maxdepth: 2
-
-   error-decoding
-   address-translation
diff --git a/Documentation/RAS/address-translation.rst b/Documentation/admin-guide/RAS/address-translation.rst
similarity index 100%
rename from Documentation/RAS/address-translation.rst
rename to Documentation/admin-guide/RAS/address-translation.rst
diff --git a/Documentation/RAS/error-decoding.rst b/Documentation/admin-guide/RAS/error-decoding.rst
similarity index 100%
rename from Documentation/RAS/error-decoding.rst
rename to Documentation/admin-guide/RAS/error-decoding.rst
diff --git a/Documentation/admin-guide/RAS/index.rst b/Documentation/admin-guide/RAS/index.rst
new file mode 100644
index 000000000000..f4087040a7c0
--- /dev/null
+++ b/Documentation/admin-guide/RAS/index.rst
@@ -0,0 +1,7 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. toctree::
+   :maxdepth: 2
+
+   main
+   error-decoding
+   address-translation
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/RAS/main.rst
similarity index 99%
rename from Documentation/admin-guide/ras.rst
rename to Documentation/admin-guide/RAS/main.rst
index 8e03751d126d..7ac1d4ccc509 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/RAS/main.rst
@@ -1,8 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
 .. include:: <isonum.txt>
 
-============================================
-Reliability, Availability and Serviceability
-============================================
+==================================================
+Reliability, Availability and Serviceability (RAS)
+==================================================
+
+This documents different aspects of the RAS functionality present in the
+kernel.
 
 RAS concepts
 ************
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index fb40a1f6f79e..dfc06fab9432 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -122,7 +122,7 @@ configure specific aspects of kernel behavior to your liking.
    pmf
    pnp
    rapidio
-   ras
+   RAS/index
    rtc
    serial-console
    svga
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 07f2aa07f0fa..9dfdc826618c 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -113,7 +113,6 @@ to ReStructured Text format, or are simply too old.
    :maxdepth: 1
 
    staging/index
-   RAS/index
 
 
 Translations
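
A move like this only works if every toctree entry still resolves to a file
relative to the index that names it (RAS/index under admin-guide, then main,
error-decoding and address-translation next to that index). A minimal sketch
of such a check follows. The helper names toctree_entries and
missing_documents are hypothetical and not part of the kernel tree, and the
parser is a simplification that assumes plain relative entries without
"Title <target>" forms, globs or absolute paths.

```python
from pathlib import Path
from typing import List


def toctree_entries(index_text: str) -> List[str]:
    """Return the document names listed under every toctree directive."""
    entries: List[str] = []
    in_toctree = False
    for line in index_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".. toctree::"):
            in_toctree = True
            continue
        if in_toctree:
            if line and not line[0].isspace():
                # A non-indented line ends the directive body.
                in_toctree = False
            elif stripped and not stripped.startswith(":"):
                # Skip options such as :maxdepth: and keep the entries.
                entries.append(stripped)
    return entries


def missing_documents(index_path: Path) -> List[str]:
    """List toctree entries with no matching .rst file next to the index."""
    base = index_path.parent
    return [entry for entry in toctree_entries(index_path.read_text())
            if not (base / f"{entry}.rst").is_file()]
```

Run against Documentation/admin-guide/RAS/index.rst in a checked-out tree,
an empty result from missing_documents would suggest that the renames and
the new index agree. It is no substitute for an actual docs build.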
On Wed, Jan 24, 2024 at 01:40:30PM +0100, Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> Date: Wed, 24 Jan 2024 13:37:52 +0100
> Subject: [PATCH] Documentation: Move RAS section to admin-guide
>
> This is where this stuff should be.
>
> Requested-by: Jonathan Corbet <corbet@lwn.net>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> ---
>  Documentation/RAS/index.rst                       | 14 --------------
>  .../{ => admin-guide}/RAS/address-translation.rst |  0
>  .../{ => admin-guide}/RAS/error-decoding.rst      |  0
>  Documentation/admin-guide/RAS/index.rst           |  7 +++++++
>  .../admin-guide/{ras.rst => RAS/main.rst}         | 10 +++++++---
>  Documentation/admin-guide/index.rst               |  2 +-
>  Documentation/index.rst                           |  1 -
>  7 files changed, 15 insertions(+), 19 deletions(-)
>  delete mode 100644 Documentation/RAS/index.rst
>  rename Documentation/{ => admin-guide}/RAS/address-translation.rst (100%)
>  rename Documentation/{ => admin-guide}/RAS/error-decoding.rst (100%)
>  create mode 100644 Documentation/admin-guide/RAS/index.rst
>  rename Documentation/admin-guide/{ras.rst => RAS/main.rst} (99%)

Now queued.

Thx.
diff --git a/Documentation/RAS/ras.rst b/Documentation/RAS/ras.rst
new file mode 100644
index 000000000000..2556b397cd27
--- /dev/null
+++ b/Documentation/RAS/ras.rst
@@ -0,0 +1,26 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Reliability, Availability and Serviceability features
+=====================================================
+
+This documents different aspects of the RAS functionality present in the
+kernel.
+
+Error decoding
+---------------
+
+* x86
+
+Error decoding on AMD systems should be done using the rasdaemon tool:
+https://github.com/mchehab/rasdaemon/
+
+While the daemon is running, it would automatically log and decode
+errors. If not, one can still decode such errors by supplying the
+hardware information from the error::
+
+        $ rasdaemon -p --status <STATUS> --ipid <IPID> --smca
+
+Also, the user can pass particular family and model to decode the error
+string::
+
+        $ rasdaemon -p --status <STATUS> --ipid <IPID> --smca --family <CPU Family> --model <CPU Model> --bank <BANK_NUM>
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 9dfdc826618c..36e61783437c 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -113,6 +113,7 @@ to ReStructured Text format, or are simply too old.
    :maxdepth: 1
 
    staging/index
+   RAS/ras
 
 
 Translations
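
The two rasdaemon invocations documented in the patch differ only in the
optional family, model and bank arguments, so they could be assembled by one
small helper. The sketch below is illustrative only: build_decode_cmd is a
hypothetical name, the flags are copied from the patch text above, and
treating --family, --model and --bank as independently optional is an
assumption (the patch only shows them used together). The values in the
demo are placeholders, not data from a real error record.

```python
import shlex
from typing import List, Optional


def build_decode_cmd(status: str, ipid: str,
                     family: Optional[str] = None,
                     model: Optional[str] = None,
                     bank: Optional[str] = None) -> List[str]:
    """Assemble the rasdaemon decode invocation quoted in the patch."""
    # First form from the patch: status and IPID with SMCA decoding.
    cmd = ["rasdaemon", "-p", "--status", status, "--ipid", ipid, "--smca"]
    # Second form: append family, model and bank when they are supplied.
    # Handling each one separately is an assumption of this sketch.
    for flag, value in (("--family", family),
                        ("--model", model),
                        ("--bank", bank)):
        if value is not None:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    # Placeholder values only; substitute the fields of a logged error.
    print(shlex.join(build_decode_cmd("<STATUS>", "<IPID>")))
```

The resulting list can be handed to subprocess.run() on a machine where
rasdaemon is installed. Building an argument list rather than a shell string
avoids quoting problems with the register values.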