Message ID | 20221110064723.8882-2-mario.limonciello@amd.com |
---|---|
State | New |
Headers |
From: Mario Limonciello <mario.limonciello@amd.com>
To: Sven van Ashbrook <svenva@chromium.org>, Rafael J Wysocki <rafael@kernel.org>, <linux-pm@vger.kernel.org>, <platform-driver-x86@vger.kernel.org>, Pavel Machek <pavel@ucw.cz>, Len Brown <len.brown@intel.com>, John Stultz <jstultz@google.com>, Thomas Gleixner <tglx@linutronix.de>, Stephen Boyd <sboyd@kernel.org>
CC: Rajneesh Bhardwaj <irenic.rajneesh@gmail.com>, S-k Shyam-sundar <Shyam-sundar.S-k@amd.com>, <rrangel@chromium.org>, Rajat Jain <rajatja@google.com>, David E Box <david.e.box@intel.com>, Hans de Goede <hdegoede@redhat.com>, <linux-kernel@vger.kernel.org>, Mario Limonciello <mario.limonciello@amd.com>
Subject: [RFC v2 1/3] PM: Add a sysfs files to represent sleep duration
Date: Thu, 10 Nov 2022 00:47:21 -0600
Message-ID: <20221110064723.8882-2-mario.limonciello@amd.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221110064723.8882-1-mario.limonciello@amd.com>
References: <20221110064723.8882-1-mario.limonciello@amd.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series |
Introduce infrastructure to report time in hardware sleep state
|
|
Commit Message
Mario Limonciello
Nov. 10, 2022, 6:47 a.m. UTC
Both AMD and Intel SoCs have a concept of reporting whether the hardware
reached a hardware sleep state over s2idle as well as how much
time was spent in such a state.
This information is valuable to both chip designers and system designers
as it helps to identify when there are problems with power consumption
over an s2idle cycle.
To make the information discoverable, create a new sysfs file and a symbol
that drivers from supported manufacturers can use to advertise this
information. This file will only be exported when the system supports low
power idle in the ACPI table.
In order to effectively use this information you will ideally want to
compare against the total duration of sleep, so export a second sysfs file
that will show total time. This file will be exported on all systems and
used both for s2idle and s3.
Suggested-by: David E Box <david.e.box@intel.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
Documentation/ABI/testing/sysfs-power | 17 +++++++++++
include/linux/suspend.h | 4 +++
kernel/power/main.c | 42 +++++++++++++++++++++++++++
kernel/power/suspend.c | 2 ++
kernel/time/timekeeping.c | 2 ++
5 files changed, 67 insertions(+)
Comments
On Thu, Nov 10 2022 at 00:47, Mario Limonciello wrote:

'Add a sysfs files'?

Can you please decide whether that's 'a file' or 'multiple files'?

> Both AMD and Intel SoCs have a concept of reporting whether the hardware
> reached a hardware sleep state over s2idle as well as how much
> time was spent in such a state.

Nice, but ...

> This information is valuable to both chip designers and system designers
> as it helps to identify when there are problems with power consumption
> over an s2idle cycle.
>
> To make the information discoverable, create a new sysfs file and a symbol
> that drivers from supported manufacturers can use to advertise this
> information. This file will only be exported when the system supports low
> power idle in the ACPI table.
>
> In order to effectively use this information you will ideally want to
> compare against the total duration of sleep, so export a second sysfs file
> that will show total time. This file will be exported on all systems and
> used both for s2idle and s3.

The above is incomprehensible word salad. Can you come up with some
coherent explanation of what you are trying to achieve please?

> +void pm_set_hw_state_residency(u64 duration)
> +{
> +	suspend_stats.last_hw_state_residency = duration;
> +}
> +EXPORT_SYMBOL_GPL(pm_set_hw_state_residency);
> +
> +void pm_account_suspend_type(const struct timespec64 *t)
> +{
> +	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> +						 t->tv_nsec / NSEC_PER_USEC;

Conversion functions for timespecs to scalar nanoseconds exist for a
reason. Why does this need special treatment and open code it?

> +}
> +EXPORT_SYMBOL_GPL(pm_account_suspend_type);

So none of these functions has any kind of documentation. kernel-doc
exists for a reason especially for exported functions.

That said, what's the justification to export any of these functions at
all? AFAICT pm_account_suspend_type() is only used by builtin code...

> +static umode_t suspend_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
> +{
> +	if (attr != &last_hw_state_residency.attr)
> +		return 0444;
> +#ifdef CONFIG_ACPI
> +	if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)
> +		return 0444;
> +#endif
> +	return 0;
> +}
> +
>  static const struct attribute_group suspend_attr_group = {
>  	.name = "suspend_stats",
>  	.attrs = suspend_attrs,
> +	.is_visible = suspend_attr_is_visible,

How is this change related to the changelog above? We are not hiding
subtle changes to the existing code in some conglomerate patch. See
Documentation/process/...

> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -24,6 +24,7 @@
>  #include <linux/compiler.h>
>  #include <linux/audit.h>
>  #include <linux/random.h>
> +#include <linux/suspend.h>
>
>  #include "tick-internal.h"
>  #include "ntp_internal.h"
> @@ -1698,6 +1699,7 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
>  	tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *delta));
>  	tk_update_sleep_time(tk, timespec64_to_ktime(*delta));
>  	tk_debug_account_sleep_time(delta);
> +	pm_account_suspend_type(delta);

That function name is really self explaining - NOT !

	pm_account_suspend_type(delta);

So this will account a suspend type depending on the time spent in
suspend, right?

It's totally obvious that the suspend type (whatever it is) depends on
the time delta argument... especially when the function at hand has
absolutely nothing to do with a type:

> +void pm_account_suspend_type(const struct timespec64 *t)
> +{
> +	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> +						 t->tv_nsec / NSEC_PER_USEC;
> +}

Sigh....

Thanks,

        tglx
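To illustrate the point about conversion helpers raised in the review above, here is a minimal userspace sketch. It is not kernel code: struct ts64, ts64_to_ns() and ts64_to_us() are hypothetical stand-ins for struct timespec64 and the kernel's timespec64_to_ns(), and the sketch omits any overflow handling the real helper may perform. It only shows that routing the conversion through one helper gives the same microsecond value as the open-coded arithmetic in the patch.

```c
#include <stdint.h>

#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_USEC 1000LL
#define USEC_PER_SEC  1000000LL

/* Hypothetical userspace stand-in for the kernel's struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long tv_nsec;
};

/*
 * Stand-in for timespec64_to_ns(): one place that knows how to turn a
 * timespec into scalar nanoseconds. No overflow handling in this sketch.
 */
static int64_t ts64_to_ns(const struct ts64 *ts)
{
	return ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/* Microseconds derived from the shared helper. */
static int64_t ts64_to_us(const struct ts64 *ts)
{
	return ts64_to_ns(ts) / NSEC_PER_USEC;
}

/* The open-coded form from the patch, kept only for comparison. */
static int64_t ts64_to_us_open_coded(const struct ts64 *ts)
{
	return ts->tv_sec * USEC_PER_SEC + ts->tv_nsec / NSEC_PER_USEC;
}
```

For a delta of 2 s and 500000 ns both functions yield 2000500 us; the helper-based version simply avoids repeating the unit arithmetic at every call site.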
On Thu, Nov 10, 2022 at 12:47:21AM -0600, Mario Limonciello wrote:
> +static ssize_t last_hw_state_residency_show(struct kobject *kobj,
> +		struct kobj_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%llu\n", suspend_stats.last_hw_state_residency);

sysfs_emit() please for sysfs files, not a "raw" sprintf().

checkpatch.pl should have caught that for you, but sometimes it doesn't.

thanks,

greg k-h
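The request above is about bounding what a show() callback writes into the page-sized sysfs buffer. The following is a userspace model only, assuming a 4096-byte page: emit_bounded() and last_hw_state_residency_show_model() are hypothetical names invented for this sketch, not the kernel's sysfs_emit() implementation. In the kernel, the fix would presumably be to call sysfs_emit(buf, "%llu\n", ...) in place of sprintf().

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed page size for this model; real sysfs buffers are PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical model of a bounded emit helper: format into a page-sized
 * buffer and never report more bytes than actually fit.
 */
static int emit_bounded(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, SKETCH_PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if (len >= SKETCH_PAGE_SIZE)
		len = SKETCH_PAGE_SIZE - 1;	/* output was truncated */
	return len;
}

/* Model of the show() callback from the patch, using the bounded helper. */
static int last_hw_state_residency_show_model(char *buf,
					      unsigned long long residency_us)
{
	return emit_bounded(buf, "%llu\n", residency_us);
}
```

Unlike a raw sprintf(), the bounded helper cannot write past the end of the buffer regardless of the format string.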
[Public]

Thanks! Appreciate the comments.
At least conceptually is there agreement to this idea for the two sysfs files
and userspace can use them to do this comparison?

A few nested replies below, but I'll clean it up for
RFC v3 or submit as PATCH v1 if there is conceptual alignment before then.

> On Thu, Nov 10 2022 at 00:47, Mario Limonciello wrote:
>
> 'Add a sysfs files'?
>
> Can you please decide whether that's 'a file' or 'multiple files'?

Yup thanks; bad find and replace in the commit message when I added
the second file.

> > Both AMD and Intel SoCs have a concept of reporting whether the hardware
> > reached a hardware sleep state over s2idle as well as how much
> > time was spent in such a state.
>
> Nice, but ...
>
> > This information is valuable to both chip designers and system designers
> > as it helps to identify when there are problems with power consumption
> > over an s2idle cycle.
> >
> > To make the information discoverable, create a new sysfs file and a symbol
> > that drivers from supported manufacturers can use to advertise this
> > information. This file will only be exported when the system supports low
> > power idle in the ACPI table.
> >
> > In order to effectively use this information you will ideally want to
> > compare against the total duration of sleep, so export a second sysfs file
> > that will show total time. This file will be exported on all systems and
> > used both for s2idle and s3.
>
> The above is incomprehensible word salad. Can you come up with some
> coherent explanation of what you are trying to achieve please?
>
> > +void pm_set_hw_state_residency(u64 duration)
> > +{
> > +	suspend_stats.last_hw_state_residency = duration;
> > +}
> > +EXPORT_SYMBOL_GPL(pm_set_hw_state_residency);
> > +
> > +void pm_account_suspend_type(const struct timespec64 *t)
> > +{
> > +	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> > +						 t->tv_nsec / NSEC_PER_USEC;
>
> Conversion functions for timespecs to scalar nanoseconds exist for a
> reason. Why does this need special treatment and open code it?

Will fixup to use conversion functions.

> > +}
> > +EXPORT_SYMBOL_GPL(pm_account_suspend_type);
>
> So none of these functions has any kind of documentation. kernel-doc
> exists for a reason especially for exported functions.
>
> That said, what's the justification to export any of these functions at
> all? AFAICT pm_account_suspend_type() is only used by builtin code...

I think you're right; they shouldn't export; will fix.

> > +static umode_t suspend_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
> > +{
> > +	if (attr != &last_hw_state_residency.attr)
> > +		return 0444;
> > +#ifdef CONFIG_ACPI
> > +	if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)
> > +		return 0444;
> > +#endif
> > +	return 0;
> > +}
> > +
> >  static const struct attribute_group suspend_attr_group = {
> >  	.name = "suspend_stats",
> >  	.attrs = suspend_attrs,
> > +	.is_visible = suspend_attr_is_visible,
>
> How is this change related to the changelog above? We are not hiding
> subtle changes to the existing code in some conglomerate patch. See
> Documentation/process/...

It was from feedback from RFC v1 from David Box that this file should only
be visible when s2idle is supported on the hardware. Will adjust commit
message to make it clearer.

> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -24,6 +24,7 @@
> >  #include <linux/compiler.h>
> >  #include <linux/audit.h>
> >  #include <linux/random.h>
> > +#include <linux/suspend.h>
> >
> >  #include "tick-internal.h"
> >  #include "ntp_internal.h"
> > @@ -1698,6 +1699,7 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
> >  	tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *delta));
> >  	tk_update_sleep_time(tk, timespec64_to_ktime(*delta));
> >  	tk_debug_account_sleep_time(delta);
> > +	pm_account_suspend_type(delta);
>
> That function name is really self explaining - NOT !
>
> 	pm_account_suspend_type(delta);
>
> So this will account a suspend type depending on the time spent in
> suspend, right?
>
> It's totally obvious that the suspend type (whatever it is) depends on
> the time delta argument... especially when the function at hand has
> absolutely nothing to do with a type:

I fat fingered this. In my mind I thought I wrote pm_account_suspend_time()
Will fix.

> > +void pm_account_suspend_type(const struct timespec64 *t)
> > +{
> > +	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> > +						 t->tv_nsec / NSEC_PER_USEC;
> > +}
>
> Sigh....
>
> Thanks,
>
>         tglx
Hi Mario,

On 11/14/22 20:12, Limonciello, Mario wrote:
> [Public]
>
> Thanks! Appreciate the comments.
> At least conceptually is there agreement to this idea for the two sysfs files
> and userspace can use them to do this comparison?

First of all let me say that I think that having some generic mechanism
which allows userspace to check if a deep enough sleep-state was reached
is a good idea. And thank you for working on this!

I wonder though if it would not be better to have some mechanism
where a list of sleep states + time spent in each state is printed ?

E.g. I know that on Intel Bay Trail and Cherry Trail devices (just an
example I'm familiar with) there are S0i0 - S0i3 and we really want
to reach S0i3 during suspend.

Sometimes only S0i1 or S0i2 is reached due to some part of the hw
not getting suspended properly.

So then we have reached "a hardware sleep state over s2idle"
but not the one we want.

OTOH I can imagine that if we start adding support for functionality
like standby-connect under Linux that then we may not always
reach the deepest hw sleep-state.

So I'm a bit worried that having just a single number for
last_hw_state_residency is not enough.

I think that it might be better to have a mechanism to set
a set of names for hw-states (once) and then set the residency
per state (*) after resume and have the sysfs file print
the entire list.

This list could then also always include the total suspend time,
also avoiding the need for a second sysfs file and we could also
use the same format for non s2idle suspend having it print
only the total suspend time when no hw-state names are set.

Regards,

Hans


*) Using an array, so up to MAX_HW_RESIDENCY_STATES
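The per-state list proposed above could be sketched roughly as follows. This is a userspace illustration under stated assumptions, not an implementation from the thread: the value 8 for MAX_HW_RESIDENCY_STATES, the struct layout, the function names and the "name value" output format are all hypothetical. It registers state names once, stores one residency per state, and prints only the total when no names are set.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed limit; the thread names the constant but gives no value. */
#define MAX_HW_RESIDENCY_STATES 8

/* Hypothetical table: state names set once, residencies set per resume. */
struct hw_residency_table {
	const char *names[MAX_HW_RESIDENCY_STATES];
	uint64_t residency_us[MAX_HW_RESIDENCY_STATES];
	size_t nr_states;
	uint64_t total_us;
};

/* Register the state names; returns -1 if there are too many. */
static int hw_residency_set_names(struct hw_residency_table *t,
				  const char *const *names, size_t n)
{
	size_t i;

	if (n > MAX_HW_RESIDENCY_STATES)
		return -1;
	for (i = 0; i < n; i++) {
		t->names[i] = names[i];
		t->residency_us[i] = 0;
	}
	t->nr_states = n;
	return 0;
}

/*
 * Print the total first, then one line per named state. With no names
 * registered only the total is printed. Returns the bytes written,
 * never more than size - 1.
 */
static size_t hw_residency_format(const struct hw_residency_table *t,
				  char *buf, size_t size)
{
	size_t len, i;
	int n;

	if (size == 0)
		return 0;

	n = snprintf(buf, size, "total %llu\n",
		     (unsigned long long)t->total_us);
	if (n < 0)
		return 0;
	len = (size_t)n < size ? (size_t)n : size - 1;

	for (i = 0; i < t->nr_states && len < size - 1; i++) {
		n = snprintf(buf + len, size - len, "%s %llu\n", t->names[i],
			     (unsigned long long)t->residency_us[i]);
		if (n < 0)
			break;
		len += (size_t)n < size - len ? (size_t)n : size - len - 1;
	}
	return len;
}
```

With names "s0i2" and "s0i3" and residencies of 400000 and 100000 us out of a 500000 us total, the formatted text has three lines; on a platform that registers no names it collapses to the single "total" line.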
On 11/15/2022 04:32, Hans de Goede wrote: > Hi Mario, > > On 11/14/22 20:12, Limonciello, Mario wrote: >> [Public] >> >> Thanks! Appreciate the comments. >> At least conceptually is there agreement to this idea for the two sysfs files >> and userspace can use them to do this comparison? > > First of all let me say that I think that having some generic mechanism > which allows userspace to check if deep enough sleep-state were reached > is a good idea. And thank you for working on this! > Sure! > I wonder though if it would not be better to have some mechanism > where a list of sleep states + time spend in each time is printed ? > > E.g. I know that on Intel Bay Trail and Cherry Trail devices (just an > example I'm familiar with) there are S0i0 - S0i3 and we really want > to reach S0i3 during suspend. > > Sometimes on S0i1 or S0i2 is reached due to some part of the hw > not getting suspended properly. > > So then we have reached "a hardware sleep state over s2idle" > but no the one we want. At least the way it's built right now it's tracking the s0ix counter for Intel and the s0i3 counter for AMD. BTW - when I did all the cleanups suggested in RFC v2 I notice I was taking the raw number for Intel, and I have that fixed for the next version. I don't know if other counters exist for Intel for various hardware states. On the current AMD silicon this is the interesting metric. > > OTOH I can image that if we start adding support for functionality > like standby-connect under Linux that then we may not always > reach the deepest hw sleep-state. Can you elaborate what you mean by standby connect? WoWLAN? At least on the current AMD platforms WoWLAN can happen while the silicon is in the deepest hardware sleep state. > > So I'm a bit worried that having just a single number for > last_hw_state_residency is not enough. 
>
> I think that it might be better to have a mechanism to set
> a set of names for hw-states (once) and then set the residency
> per state (*) after resume and have the sysfs file print
> the entire list.
>
> This list could then also always include the total suspend time,
> also avoiding the need for a second sysfs file and we could also
> use the same format for non s2idle suspend having it print
> only the total suspend time when no hw-state names are set.

So is your thought to have a single sysfs file, something like
/sys/power/suspend_stats/s2idle_stats, that would show this?

state \t % \t duration (us)
s0i3 \t 99.5% \t 1000

For AMD that would be a single line and I don't think it's worth the
extra code. I would like to know if it actually makes sense for Intel
though.

We also need to think about what will be actionable with this information
by consumers of it, because I'm certain it will lead to bug reports.

Let's think about a hypothetical bug report:
"Intel System only spent 20% of time in deepest hardware state".

They attach to the bug report s2idle_stats that looks like this:

state \t % \t duration (us)
s0i2 \t 80.0% \t 400000
s0i3 \t 20.0% \t 100000

Is that any more actionable than
/sys/power/suspend_stats/last_hw_state_residency showing 100000
and
/sys/power/suspend_stats/last_suspend_total showing 500000?

I think in either case the next action is more debugging, such as
turning on dynamic debug or setting some module parameters.

"Practically" I expect software like systemd or powerd to be reading
these sysfs files.

>
> Regards,
>
> Hans
>
>
> *) Using an array, so up to MAX_HW_RESIDENCY_STATES
>
>
>>
>> A few nested replies below, but I'll clean it up for
>> RFC v3 or submit as PATCH v1 if there is conceptual alignment before then.
>>
>>> On Thu, Nov 10 2022 at 00:47, Mario Limonciello wrote:
>>>
>>> 'Add a sysfs files'?
>>>
>>> Can you please decide whether that's 'a file' or 'multiple files'?
>>
>> Yup thanks; bad find and replace in the commit message when I added
>> the second file.
>>
>>>
>>>> Both AMD and Intel SoCs have a concept of reporting whether the hardware
>>>> reached a hardware sleep state over s2idle as well as how much
>>>> time was spent in such a state.
>>>
>>> Nice, but ...
>>>
>>>> This information is valuable to both chip designers and system designers
>>>> as it helps to identify when there are problems with power consumption
>>>> over an s2idle cycle.
>>>>
>>>> To make the information discoverable, create a new sysfs file and a symbol
>>>> that drivers from supported manufacturers can use to advertise this
>>>> information. This file will only be exported when the system supports low
>>>> power idle in the ACPI table.
>>>>
>>>> In order to effectively use this information you will ideally want to
>>>> compare against the total duration of sleep, so export a second sysfs file
>>>> that will show total time. This file will be exported on all systems and
>>>> used both for s2idle and s3.
>>>
>>> The above is incomprehensible word salad. Can you come up with some
>>> coherent explanation of what you are trying to achieve please?
>>>
>>>> +void pm_set_hw_state_residency(u64 duration)
>>>> +{
>>>> +        suspend_stats.last_hw_state_residency = duration;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(pm_set_hw_state_residency);
>>>> +
>>>> +void pm_account_suspend_type(const struct timespec64 *t)
>>>> +{
>>>> +        suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
>>>> +                                            t->tv_nsec / NSEC_PER_USEC;
>>>
>>> Conversion functions for timespecs to scalar nanoseconds exist for a
>>> reason. Why does this need special treatment and open code it?
>>
>> Will fixup to use conversion functions.
>>
>>>
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(pm_account_suspend_type);
>>>
>>> So none of these functions has any kind of documentation. kernel-doc
>>> exists for a reason especially for exported functions.
>>>
>>> That said, what's the justification to export any of these functions at
>>> all? AFAICT pm_account_suspend_type() is only used by builtin code...
>>
>> I think you're right; they shouldn't be exported; will fix.
>>
>>>
>>>> +static umode_t suspend_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
>>>> +{
>>>> +        if (attr != &last_hw_state_residency.attr)
>>>> +                return 0444;
>>>> +#ifdef CONFIG_ACPI
>>>> +        if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)
>>>> +                return 0444;
>>>> +#endif
>>>> +        return 0;
>>>> +}
>>>> +
>>>>  static const struct attribute_group suspend_attr_group = {
>>>>          .name = "suspend_stats",
>>>>          .attrs = suspend_attrs,
>>>> +        .is_visible = suspend_attr_is_visible,
>>>
>>> How is this change related to the changelog above? We are not hiding
>>> subtle changes to the existing code in some conglomerate patch. See
>>> Documentation/process/...
>>
>> It was from feedback from RFC v1 from David Box that this file should only
>> be visible when s2idle is supported on the hardware. Will adjust commit
>> message to make it clearer.
>>
>>>
>>>> --- a/kernel/time/timekeeping.c
>>>> +++ b/kernel/time/timekeeping.c
>>>> @@ -24,6 +24,7 @@
>>>>  #include <linux/compiler.h>
>>>>  #include <linux/audit.h>
>>>>  #include <linux/random.h>
>>>> +#include <linux/suspend.h>
>>>>
>>>>  #include "tick-internal.h"
>>>>  #include "ntp_internal.h"
>>>> @@ -1698,6 +1699,7 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
>>>>          tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *delta));
>>>>          tk_update_sleep_time(tk, timespec64_to_ktime(*delta));
>>>>          tk_debug_account_sleep_time(delta);
>>>> +        pm_account_suspend_type(delta);
>>>
>>> That function name is really self explaining - NOT !
>>>
>>>     pm_account_suspend_type(delta);
>>>
>>> So this will account a suspend type depending on the time spent in
>>> suspend, right?
>>>
>>> It's totally obvious that the suspend type (whatever it is) depends on
>>> the time delta argument... especially when the function at hand has
>>> absolutely nothing to do with a type:
>>>
>>
>> I fat fingered this. In my mind I thought I wrote pm_account_suspend_time().
>> Will fix.
>>
>>>> +void pm_account_suspend_type(const struct timespec64 *t)
>>>> +{
>>>> +        suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
>>>> +                                            t->tv_nsec / NSEC_PER_USEC;
>>>> +}
>>>
>>> Sigh....
>>>
>>> Thanks,
>>>
>>>         tglx
>>
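For readers following the conversion comment above: the kernel already has timespec64_to_ns() for turning a timespec64 into scalar nanoseconds, so the microsecond total can be derived with one division instead of open-coding both terms. The following is only a userspace sketch of that idea, not the eventual patch. The names ts_to_ns() and ts_to_us() are invented for illustration, it uses struct timespec instead of timespec64, and it does no overflow saturation.

```c
#include <stdint.h>
#include <time.h>

/* Illustrative constants; named to avoid clashing with system macros. */
static const int64_t NS_PER_SEC = 1000000000LL;
static const int64_t NS_PER_US = 1000LL;

/* Hypothetical helper: collapse a timespec into scalar nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * NS_PER_SEC + ts->tv_nsec;
}

/* Hypothetical helper: microseconds, derived from the nanosecond scalar. */
static int64_t ts_to_us(const struct timespec *ts)
{
	return ts_to_ns(ts) / NS_PER_US;
}
```

Going through a single scalar keeps the unit handling in one place, which appears to be the point of the review comment.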
On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> Both AMD and Intel SoCs have a concept of reporting whether the hardware
> reached a hardware sleep state over s2idle as well as how much
> time was spent in such a state.
>
> This information is valuable to both chip designers and system designers
> as it helps to identify when there are problems with power consumption
> over an s2idle cycle.
>
> To make the information discoverable, create a new sysfs file and a symbol
> that drivers from supported manufacturers can use to advertise this
> information. This file will only be exported when the system supports low
> power idle in the ACPI table.
>
> In order to effectively use this information you will ideally want to
> compare against the total duration of sleep, so export a second sysfs file
> that will show total time. This file will be exported on all systems and
> used both for s2idle and s3.

Well, my first question would be how this is related to

/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us

and

/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us

> Suggested-by: David E Box <david.e.box@intel.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  Documentation/ABI/testing/sysfs-power | 17 +++++++++++
>  include/linux/suspend.h               |  4 +++
>  kernel/power/main.c                   | 42 +++++++++++++++++++++++++++
>  kernel/power/suspend.c                |  2 ++
>  kernel/time/timekeeping.c             |  2 ++
>  5 files changed, 67 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
> index f99d433ff311..5b47cbb4dc9e 100644
> --- a/Documentation/ABI/testing/sysfs-power
> +++ b/Documentation/ABI/testing/sysfs-power
> @@ -413,6 +413,23 @@ Description:
>                  The /sys/power/suspend_stats/last_failed_step file contains
>                  the last failed step in the suspend/resume path.
>
> +What:           /sys/power/suspend_stats/last_hw_state_residency
> +Date:           December 2022
> +Contact:        Mario Limonciello <mario.limonciello@amd.com>
> +Description:
> +                The /sys/power/suspend_stats/last_hw_state_residency file contains
> +                the amount of time spent in a hardware sleep state.
> +                This attribute is only available if the system supports
> +                low power idle. This is measured in microseconds.
> +
> +What:           /sys/power/suspend_stats/last_suspend_total
> +Date:           December 2022
> +Contact:        Mario Limonciello <mario.limonciello@amd.com>
> +Description:
> +                The /sys/power/suspend_stats/last_suspend_total file contains
> +                the total duration of the sleep cycle.
> +                This is measured in microseconds.
> +
>  What:           /sys/power/sync_on_suspend
>  Date:           October 2019
>  Contact:        Jonas Meurer <jonas@freesources.org>
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index cfe19a028918..af343c3f8198 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -68,6 +68,8 @@ struct suspend_stats {
>          int last_failed_errno;
>          int errno[REC_FAILED_NUM];
>          int last_failed_step;
> +        u64 last_hw_state_residency;
> +        u64 last_suspend_total;
>          enum suspend_stat_step failed_steps[REC_FAILED_NUM];
>  };
>
> @@ -489,6 +491,8 @@ void restore_processor_state(void);
>  extern int register_pm_notifier(struct notifier_block *nb);
>  extern int unregister_pm_notifier(struct notifier_block *nb);
>  extern void ksys_sync_helper(void);
> +extern void pm_set_hw_state_residency(u64 duration);
> +extern void pm_account_suspend_type(const struct timespec64 *t);
>
>  #define pm_notifier(fn, pri) { \
>          static struct notifier_block fn##_nb = \
> diff --git a/kernel/power/main.c b/kernel/power/main.c
> index 31ec4a9b9d70..11bd658583b0 100644
> --- a/kernel/power/main.c
> +++ b/kernel/power/main.c
> @@ -6,6 +6,7 @@
>   * Copyright (c) 2003 Open Source Development Lab
>   */
>
> +#include <linux/acpi.h>
>  #include <linux/export.h>
>  #include <linux/kobject.h>
>  #include <linux/string.h>
> @@ -54,6 +55,19 @@ void unlock_system_sleep(unsigned int flags)
>  }
>  EXPORT_SYMBOL_GPL(unlock_system_sleep);
>
> +void pm_set_hw_state_residency(u64 duration)
> +{
> +        suspend_stats.last_hw_state_residency = duration;
> +}
> +EXPORT_SYMBOL_GPL(pm_set_hw_state_residency);
> +
> +void pm_account_suspend_type(const struct timespec64 *t)
> +{
> +        suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> +                                            t->tv_nsec / NSEC_PER_USEC;
> +}
> +EXPORT_SYMBOL_GPL(pm_account_suspend_type);
> +
>  void ksys_sync_helper(void)
>  {
>          ktime_t start;
> @@ -377,6 +391,20 @@ static ssize_t last_failed_step_show(struct kobject *kobj,
>  }
>  static struct kobj_attribute last_failed_step = __ATTR_RO(last_failed_step);
>
> +static ssize_t last_hw_state_residency_show(struct kobject *kobj,
> +                struct kobj_attribute *attr, char *buf)
> +{
> +        return sprintf(buf, "%llu\n", suspend_stats.last_hw_state_residency);
> +}
> +static struct kobj_attribute last_hw_state_residency = __ATTR_RO(last_hw_state_residency);
> +
> +static ssize_t last_suspend_total_show(struct kobject *kobj,
> +                struct kobj_attribute *attr, char *buf)
> +{
> +        return sprintf(buf, "%llu\n", suspend_stats.last_suspend_total);
> +}
> +static struct kobj_attribute last_suspend_total = __ATTR_RO(last_suspend_total);
> +
>  static struct attribute *suspend_attrs[] = {
>          &success.attr,
>          &fail.attr,
> @@ -391,12 +419,26 @@ static struct attribute *suspend_attrs[] = {
>          &last_failed_dev.attr,
>          &last_failed_errno.attr,
>          &last_failed_step.attr,
> +        &last_hw_state_residency.attr,
> +        &last_suspend_total.attr,
>          NULL,
>  };
>
> +static umode_t suspend_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
> +{
> +        if (attr != &last_hw_state_residency.attr)
> +                return 0444;
> +#ifdef CONFIG_ACPI
> +        if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)
> +                return 0444;
> +#endif
> +        return 0;
> +}
> +
>  static const struct attribute_group suspend_attr_group = {
>          .name = "suspend_stats",
>          .attrs = suspend_attrs,
> +        .is_visible = suspend_attr_is_visible,
>  };
>
>  #ifdef CONFIG_DEBUG_FS
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index fa3bf161d13f..b6c4a3733212 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -423,6 +423,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
>          if (suspend_test(TEST_PLATFORM))
>                  goto Platform_wake;
>
> +        suspend_stats.last_suspend_total = 0;
> +
>          if (state == PM_SUSPEND_TO_IDLE) {
>                  s2idle_loop();
>                  goto Platform_wake;
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index f72b9f1de178..e1b356787e53 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -24,6 +24,7 @@
>  #include <linux/compiler.h>
>  #include <linux/audit.h>
>  #include <linux/random.h>
> +#include <linux/suspend.h>
>
>  #include "tick-internal.h"
>  #include "ntp_internal.h"
> @@ -1698,6 +1699,7 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
>          tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *delta));
>          tk_update_sleep_time(tk, timespec64_to_ktime(*delta));
>          tk_debug_account_sleep_time(delta);
> +        pm_account_suspend_type(delta);
>  }
>
>  #if defined(CONFIG_PM_SLEEP) && defined(CONFIG_RTC_HCTOSYS_DEVICE)
> --
> 2.34.1
>
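The ABI text in the quoted patch describes two microsecond counters, and the commit message says they are meant to be compared. A consumer would presumably reduce them to a ratio. The following is a minimal userspace sketch of that comparison, assuming both values have already been read from the proposed attributes. residency_permille() is a name invented here and is not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: hardware sleep state residency as a share of the
 * total suspend time, in tenths of a percent (0..1000).
 * Returns -1 when total_us is 0, since no ratio can be formed.
 * Assumption: values are small enough that residency_us * 1000 does not
 * overflow 64 bits (true for any realistic suspend duration).
 */
static int residency_permille(uint64_t residency_us, uint64_t total_us)
{
	if (total_us == 0)
		return -1;
	/* Clamp in case the two counters were sampled slightly apart. */
	if (residency_us > total_us)
		residency_us = total_us;
	return (int)(residency_us * 1000 / total_us);
}
```

Integer tenths of a percent avoid floating point, which also makes the same arithmetic usable if the ratio were ever computed on the kernel side.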
On 11/15/2022 08:45, Rafael J. Wysocki wrote:
> On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
>>
>> Both AMD and Intel SoCs have a concept of reporting whether the hardware
>> reached a hardware sleep state over s2idle as well as how much
>> time was spent in such a state.
>>
>> This information is valuable to both chip designers and system designers
>> as it helps to identify when there are problems with power consumption
>> over an s2idle cycle.
>>
>> To make the information discoverable, create a new sysfs file and a symbol
>> that drivers from supported manufacturers can use to advertise this
>> information. This file will only be exported when the system supports low
>> power idle in the ACPI table.
>>
>> In order to effectively use this information you will ideally want to
>> compare against the total duration of sleep, so export a second sysfs file
>> that will show total time. This file will be exported on all systems and
>> used both for s2idle and s3.
>
> Well, my first question would be how this is related to
>
> /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
>

This has a dependency on the platform firmware offering an ACPI LPIT
table. I don't know how common that is. As this series started from the
needs on ChromeOS, I would ask: is that typically populated by coreboot?

I would hope it's the same number that is populated in that file on
supported systems though.

> and
>
> /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
>

No relation to this one for what's in the series.
On Tue, Nov 15, 2022 at 4:17 PM Limonciello, Mario
<mario.limonciello@amd.com> wrote:
>
> On 11/15/2022 08:45, Rafael J. Wysocki wrote:
> > On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
> > <mario.limonciello@amd.com> wrote:
> >>
> >> Both AMD and Intel SoCs have a concept of reporting whether the hardware
> >> reached a hardware sleep state over s2idle as well as how much
> >> time was spent in such a state.
> >>
> >> This information is valuable to both chip designers and system designers
> >> as it helps to identify when there are problems with power consumption
> >> over an s2idle cycle.
> >>
> >> To make the information discoverable, create a new sysfs file and a symbol
> >> that drivers from supported manufacturers can use to advertise this
> >> information. This file will only be exported when the system supports low
> >> power idle in the ACPI table.
> >>
> >> In order to effectively use this information you will ideally want to
> >> compare against the total duration of sleep, so export a second sysfs file
> >> that will show total time. This file will be exported on all systems and
> >> used both for s2idle and s3.
> >
> > Well, my first question would be how this is related to
> >
> > /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
> >
>
> This has a dependency on the platform firmware offering an ACPI LPIT
> table. I don't know how common that is.

Required for running Windows with Modern Standby AFAICS.

> As this series started from the needs on ChromeOS, I would ask: is that
> typically populated by coreboot?

It should be, but I'd need to ask for confirmation.

> I would hope it's the same number that is populated in that file on
> supported systems though.

Well, which is exactly where I'm going.

Since there is one sysfs file for exposing this value already and it
is used (for example, by sleepgraph), perhaps the way to go would be
to extend this interface to systems that don't have LPIT instead of
introducing a new one possibly exposing the same value?
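As background for the suggestion to reuse the existing attribute: from userspace, consuming it is a single-value read of a decimal number. The following is a minimal sketch, assuming the file holds one unsigned decimal value, as the existing _us attributes do. read_u64_attr() is a name made up for this illustration.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one unsigned decimal value from a sysfs-style
 * file. Returns 0 on success or a negative errno value on failure.
 */
static int read_u64_attr(const char *path, uint64_t *val)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (!f)
		return -errno;	/* e.g. -ENOENT when the attribute is absent */
	if (fscanf(f, "%" SCNu64, val) != 1)
		ret = -EINVAL;	/* file exists but does not hold a number */
	fclose(f);
	return ret;
}
```

A caller would pass /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us as the path. A failed open would suggest the attribute is not exposed on that system, which per the discussion above is the case this thread is trying to address for platforms without LPIT.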
On 11/15/2022 11:20, Raul Rangel wrote:
>
>
> On Tue, Nov 15, 2022 at 9:35 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
>     On Tue, Nov 15, 2022 at 4:17 PM Limonciello, Mario
>     <mario.limonciello@amd.com> wrote:
>      >
>      > On 11/15/2022 08:45, Rafael J. Wysocki wrote:
>      > > On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
>      > > <mario.limonciello@amd.com> wrote:
>      > >>
>      > >> Both AMD and Intel SoCs have a concept of reporting whether the hardware
>      > >> reached a hardware sleep state over s2idle as well as how much
>      > >> time was spent in such a state.
>      > >>
>      > >> This information is valuable to both chip designers and system designers
>      > >> as it helps to identify when there are problems with power consumption
>      > >> over an s2idle cycle.
>      > >>
>      > >> To make the information discoverable, create a new sysfs file and a symbol
>      > >> that drivers from supported manufacturers can use to advertise this
>      > >> information. This file will only be exported when the system supports low
>      > >> power idle in the ACPI table.
>      > >>
>      > >> In order to effectively use this information you will ideally want to
>      > >> compare against the total duration of sleep, so export a second sysfs file
>      > >> that will show total time. This file will be exported on all systems and
>      > >> used both for s2idle and s3.
>      > >
>      > > Well, my first question would be how this is related to
>      > >
>      > > /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
>      > >
>      >
>      > This has a dependency on the platform firmware offering an ACPI LPIT
>      > table. I don't know how common that is.
>
>     Required for running Windows with Modern Standby AFAICS.
>
>      > As this series started from the needs on ChromeOS I would ask is
>      > that typically populated by coreboot?
>
>     It should be, but I'd need to ask for confirmation.
>
>
> It looks like Intel platforms have support for the LPIT table:
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/coreboot/src/soc/intel/common/block/acpi/lpit.c?q=f:LPIT%20f:coreboot&ss=chromiumos
>
> For AMD, we had some patches to add _LPI:
> https://review.coreboot.org/c/coreboot/+/52381/1
> They never got merged though. We could add an LPIT table to coreboot for
> AMD platforms if necessary.

I don't think _LPI makes a lot of sense on x86 today, which is why this
was sent up:

eb087f305919e ("ACPI: processor idle: Check for architectural support for LPI")

As for LPIT - I've never seen LPIT on AMD UEFI systems either. I guess
it's an Intel specific table?

>
>      > I would hope it's the same number that is populated in that file on
>      > supported systems though.
>
>     Well, which is exactly where I'm going.
>
>     Since there is one sysfs file for exposing this value already and it
>     is used (for example, by sleepgraph), perhaps the way to go would be
>     to extend this interface to systems that don't have LPIT instead of
>     introducing a new one possibly exposing the same value?
>

Ah; so since Raul confirmed that coreboot on Chrome exports that, maybe we
just need to add another way to populate that sysfs file for systems
without LPIT (i.e. AMD).

I think that's a very good idea; thanks.

I think we still probably want to have a way to get the total suspend
time out programmatically though, to compare against. So perhaps the other
sysfs file I had in the RFC v2 still makes sense.
> > >> tk_debug_account_sleep_time(delta); > > >> + pm_account_suspend_type(delta); > > >> } > > >> > > >> #if defined(CONFIG_PM_SLEEP) && > defined(CONFIG_RTC_HCTOSYS_DEVICE) > > >> -- > > >> 2.34.1 > > >> > > >
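The accounting helper in the quoted patch folds a `timespec64` sleep delta into microseconds. As a minimal userspace sketch of that arithmetic (not kernel code; `timespec_to_usec` and the `SKETCH_*` constants are hypothetical stand-ins for the kernel's `USEC_PER_SEC` and `NSEC_PER_USEC`), the conversion could look like this:

```c
#include <stdint.h>
#include <time.h>

/* Local stand-ins for the kernel's USEC_PER_SEC and NSEC_PER_USEC. */
#define SKETCH_USEC_PER_SEC  1000000LL
#define SKETCH_NSEC_PER_USEC 1000L

/*
 * Convert a sleep-time delta to whole microseconds, mirroring the
 * expression in the quoted pm_account_suspend_type(): seconds are
 * scaled up, nanoseconds are scaled down with truncation.
 */
static int64_t timespec_to_usec(const struct timespec *t)
{
	return (int64_t)t->tv_sec * SKETCH_USEC_PER_SEC +
	       t->tv_nsec / SKETCH_NSEC_PER_USEC;
}
```

For example, a delta of 2 s and 500,000 ns yields 2,000,500 microseconds; any sub-microsecond remainder is dropped by the integer division.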
On Tue, Nov 15, 2022 at 6:27 PM Limonciello, Mario
<mario.limonciello@amd.com> wrote:
>
> On 11/15/2022 11:20, Raul Rangel wrote:
> >
> >
> > On Tue, Nov 15, 2022 at 9:35 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> >     On Tue, Nov 15, 2022 at 4:17 PM Limonciello, Mario
> >     <mario.limonciello@amd.com> wrote:
> >     >
> >     > On 11/15/2022 08:45, Rafael J. Wysocki wrote:
> >     > > On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
> >     > > <mario.limonciello@amd.com> wrote:
> >     > >>
> >     > >> Both AMD and Intel SoCs have a concept of reporting whether the hardware
> >     > >> reached a hardware sleep state over s2idle as well as how much
> >     > >> time was spent in such a state.
> >     > >>
> >     > >> This information is valuable to both chip designers and system designers
> >     > >> as it helps to identify when there are problems with power consumption
> >     > >> over an s2idle cycle.
> >     > >>
> >     > >> To make the information discoverable, create a new sysfs file and a symbol
> >     > >> that drivers from supported manufacturers can use to advertise this
> >     > >> information. This file will only be exported when the system supports low
> >     > >> power idle in the ACPI table.
> >     > >>
> >     > >> In order to effectively use this information you will ideally want to
> >     > >> compare against the total duration of sleep, so export a second sysfs file
> >     > >> that will show total time. This file will be exported on all systems and
> >     > >> used both for s2idle and s3.
> >     > >
> >     > > Well, my first question would be how this is related to
> >     > >
> >     > > /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
> >     > >
> >     >
> >     > This has a dependency on the platform firmware offering an ACPI LPIT
> >     > table. I don't know how common that is.
> >
> >     Required for running Windows with Modern Standby AFAICS.
> >
> >     > As this series started from the needs on ChromeOS I would ask is that typically populated by coreboot?
> >
> >     It should be, but I'd need to ask for confirmation.
> >
> >
> > It looks like Intel platforms have support for the LPIT table:
> > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/coreboot/src/soc/intel/common/block/acpi/lpit.c?q=f:LPIT%20f:coreboot&ss=chromiumos
> >
> > For AMD, we had some patches to add _LPIL
> > https://review.coreboot.org/c/coreboot/+/52381/1
> > They never got merged though. We could add an LPIT table to coreboot for
> > AMD platforms if necessary.
>
> _LPI I don't think makes a lot of sense on X86 today, which is why this
> was sent up:
> eb087f305919e ("ACPI: processor idle: Check for architectural support
> for LPI")

Well, LPI has nothing to do with LPIT. [I guess this could not be
even more confusing, but that's what you get in the world of 4-letter
TLAs.]

> As for LPIT - I've never seen LPIT on AMD UEFI systems either. I guess
> it's an Intel specific table?

It used to be. The spec is UEFI-hosted now.

> >
> >     > I would hope it's the same number that is populated in that file on
> >     > supported systems though.
> >
> >     Well, which is exactly where I'm going.
> >
> >     Since there is one sysfs file for exposing this value already and it
> >     is used (for example, by sleepgraph), perhaps the way to go would be
> >     to extend this interface to systems that don't have LPIT instead of
> >     introducing a new one possibly exposing the same value?
> >
>
> Ah; so since Raul confirmed coreboot on Chrome exports that maybe we
> just need to add another way to populate that sysfs file for systems
> without LPIT (IE AMD). I think that's a very good idea; thanks.
>
> I think we still probably want to have a way to get the total suspend
> time out programmatically though to compare to. So perhaps the other
> sysfs file I had in the RFC v2 makes sense still.

Well there are trace points to get that (sleepgraph uses these too),
see Documentation/trace/events-power.rst (and you can git grep for
"machine_suspend" to find where this comes from).

I guess there could be a sysfs file in addition to them, but I'm not
sure if the extra overhead would be worth the benefit.

> >     > > and
> >     > >
> >     > > /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
> >     > >
> >     >
> >     > No relation to this one for what's in the series.
On 11/15/2022 11:52, Rafael J. Wysocki wrote:
> On Tue, Nov 15, 2022 at 6:27 PM Limonciello, Mario
> <mario.limonciello@amd.com> wrote:
>>
>> On 11/15/2022 11:20, Raul Rangel wrote:
>>>
>>>
>>> On Tue, Nov 15, 2022 at 9:35 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>
>>>     On Tue, Nov 15, 2022 at 4:17 PM Limonciello, Mario
>>>     <mario.limonciello@amd.com> wrote:
>>>     >
>>>     > On 11/15/2022 08:45, Rafael J. Wysocki wrote:
>>>     > > On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
>>>     > > <mario.limonciello@amd.com> wrote:
>>>     > >>
>>>     > >> Both AMD and Intel SoCs have a concept of reporting whether the hardware
>>>     > >> reached a hardware sleep state over s2idle as well as how much
>>>     > >> time was spent in such a state.
>>>     > >>
>>>     > >> This information is valuable to both chip designers and system designers
>>>     > >> as it helps to identify when there are problems with power consumption
>>>     > >> over an s2idle cycle.
>>>     > >>
>>>     > >> To make the information discoverable, create a new sysfs file and a symbol
>>>     > >> that drivers from supported manufacturers can use to advertise this
>>>     > >> information. This file will only be exported when the system supports low
>>>     > >> power idle in the ACPI table.
>>>     > >>
>>>     > >> In order to effectively use this information you will ideally want to
>>>     > >> compare against the total duration of sleep, so export a second sysfs file
>>>     > >> that will show total time. This file will be exported on all systems and
>>>     > >> used both for s2idle and s3.
>>>     > >
>>>     > > Well, my first question would be how this is related to
>>>     > >
>>>     > > /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
>>>     > >
>>>     >
>>>     > This has a dependency on the platform firmware offering an ACPI LPIT
>>>     > table. I don't know how common that is.
>>>
>>>     Required for running Windows with Modern Standby AFAICS.
>>>
>>>     > As this series started from the needs on ChromeOS I would ask is that typically populated by coreboot?
>>>
>>>     It should be, but I'd need to ask for confirmation.
>>>
>>>
>>> It looks like Intel platforms have support for the LPIT table:
>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/coreboot/src/soc/intel/common/block/acpi/lpit.c?q=f:LPIT%20f:coreboot&ss=chromiumos
>>>
>>> For AMD, we had some patches to add _LPIL
>>> https://review.coreboot.org/c/coreboot/+/52381/1
>>> They never got merged though. We could add an LPIT table to coreboot for
>>> AMD platforms if necessary.
>>
>> _LPI I don't think makes a lot of sense on X86 today, which is why this
>> was sent up:
>> eb087f305919e ("ACPI: processor idle: Check for architectural support
>> for LPI")
>
> Well, LPI has nothing to do with LPIT. [I guess this could not be
> even more confusing, but that's what you get in the world of 4-letter
> TLAs.]
>
>> As for LPIT - I've never seen LPIT on AMD UEFI systems either. I guess
>> it's an Intel specific table?
>
> It used to be. The spec is UEFI-hosted now.
>

Got it.

>>>
>>>     > I would hope it's the same number that is populated in that file on
>>>     > supported systems though.
>>>
>>>     Well, which is exactly where I'm going.
>>>
>>>     Since there is one sysfs file for exposing this value already and it
>>>     is used (for example, by sleepgraph), perhaps the way to go would be
>>>     to extend this interface to systems that don't have LPIT instead of
>>>     introducing a new one possibly exposing the same value?
>>>
>>
>> Ah; so since Raul confirmed coreboot on Chrome exports that maybe we
>> just need to add another way to populate that sysfs file for systems
>> without LPIT (IE AMD). I think that's a very good idea; thanks.
>>
>> I think we still probably want to have a way to get the total suspend
>> time out programmatically though to compare to. So perhaps the other
>> sysfs file I had in the RFC v2 makes sense still.
>
> Well there are trace points to get that (sleepgraph uses these too),
> see Documentation/trace/events-power.rst (and you can git grep for
> "machine_suspend" to find where this comes from).
>
> I guess there could be a sysfs file in addition to them, but I'm not
> sure if the extra overhead would be worth the benefit.

At least the way that I envisioned this all working was that userspace
software that wanted to could query some sysfs files and figure out a
percentage of time spent. If it was below a threshold users could be
notified, or logs can be sent up to a server for analysis etc. Trace
points would mean that userspace software like systemd and powerd would
need to turn on the tracing every time to get the raw total numbers to
do such a comparison.

>
>>>     > > and
>>>     > >
>>>     > > /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
>>>     > >
>>>     >
>>>     > No relation to this one for what's in the series.
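The userspace check described above can be sketched roughly as follows. This is only an illustration under assumptions: the two `suspend_stats` files are the ones proposed in this patch and may not exist on any given kernel, and `read_sysfs_u64`, `hw_residency_percent`, and the caller-chosen threshold are hypothetical names and policy, not part of the series.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Paths proposed in the quoted patch; they may not exist on a given kernel. */
#define HW_RESIDENCY_PATH  "/sys/power/suspend_stats/last_hw_state_residency"
#define SUSPEND_TOTAL_PATH "/sys/power/suspend_stats/last_suspend_total"

/* Read one unsigned decimal value from a sysfs-style file.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
static int read_sysfs_u64(const char *path, uint64_t *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%" SCNu64, out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Share of the last suspend cycle spent in the hardware sleep state,
 * in percent. Returns -1.0 when the total is zero (nothing to compare). */
static double hw_residency_percent(uint64_t residency_us, uint64_t total_us)
{
	if (total_us == 0)
		return -1.0;
	return 100.0 * (double)residency_us / (double)total_us;
}
```

A caller would read both files with `read_sysfs_u64()`, pass the two values to `hw_residency_percent()`, and compare the result against whatever threshold it considers acceptable before notifying the user or uploading logs.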
On Tue, Nov 15, 2022 at 6:58 PM Limonciello, Mario <mario.limonciello@amd.com> wrote:
>
> On 11/15/2022 11:52, Rafael J. Wysocki wrote:
> > On Tue, Nov 15, 2022 at 6:27 PM Limonciello, Mario
> > <mario.limonciello@amd.com> wrote:
> >>
> >> On 11/15/2022 11:20, Raul Rangel wrote:
> >>>
> >>>
> >>> On Tue, Nov 15, 2022 at 9:35 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>>
> >>> On Tue, Nov 15, 2022 at 4:17 PM Limonciello, Mario
> >>> <mario.limonciello@amd.com> wrote:
> >>> >
> >>> > On 11/15/2022 08:45, Rafael J. Wysocki wrote:
> >>> > > On Thu, Nov 10, 2022 at 7:49 AM Mario Limonciello
> >>> > > <mario.limonciello@amd.com> wrote:
> >>> > >>
> >>> > >> Both AMD and Intel SoCs have a concept of reporting whether the hardware
> >>> > >> reached a hardware sleep state over s2idle as well as how much
> >>> > >> time was spent in such a state.
> >>> > >>
> >>> > >> This information is valuable to both chip designers and system designers
> >>> > >> as it helps to identify when there are problems with power consumption
> >>> > >> over an s2idle cycle.
> >>> > >>
> >>> > >> To make the information discoverable, create a new sysfs file and a symbol
> >>> > >> that drivers from supported manufacturers can use to advertise this
> >>> > >> information. This file will only be exported when the system supports low
> >>> > >> power idle in the ACPI table.
> >>> > >>
> >>> > >> In order to effectively use this information you will ideally want to
> >>> > >> compare against the total duration of sleep, so export a second sysfs file
> >>> > >> that will show total time. This file will be exported on all systems and
> >>> > >> used both for s2idle and s3.
> >>> > >
> >>> > > Well, my first question would be how this is related to
> >>> > >
> >>> > > /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
> >>> > >
> >>> >
> >>> > This has a dependency on the platform firmware offering an ACPI LPIT
> >>> > table. I don't know how common that is.
> >>>
> >>> Required for running Windows with Modern Standby AFAICS.
> >>>
> >>> > As this series started from the needs on ChromeOS I would ask is
> >>> > that typically populated by coreboot?
> >>>
> >>> It should be, but I'd need to ask for confirmation.
> >>>
> >>>
> >>> It looks like Intel platforms have support for the LPIT table:
> >>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/coreboot/src/soc/intel/common/block/acpi/lpit.c?q=f:LPIT%20f:coreboot&ss=chromiumos
> >>>
> >>> For AMD, we had some patches to add _LPIL
> >>> https://review.coreboot.org/c/coreboot/+/52381/1
> >>> They never got merged though. We could add an LPIT table to coreboot for
> >>> AMD platforms if necessary.
> >>
> >> _LPI I don't think makes a lot of sense on X86 today, which is why this
> >> was sent up:
> >> eb087f305919e ("ACPI: processor idle: Check for architectural support
> >> for LPI")
> >
> > Well, LPI has nothing to do with LPIT. [I guess this could not be
> > even more confusing, but that's what you get in the world of 4-letter
> > TLAs.]
> >
> >> As for LPIT - I've never seen LPIT on AMD UEFI systems either. I guess
> >> it's an Intel specific table?
> >
> > It used to be. The spec is UEFI-hosted now.
> >
>
> Got it.
>
> >>>
> >>> > I would hope it's the same number that is populated in that file on
> >>> > supported systems though.
> >>>
> >>> Well, which is exactly where I'm going.
> >>>
> >>> Since there is one sysfs file for exposing this value already and it
> >>> is used (for example, by sleepgraph), perhaps the way to go would be
> >>> to extend this interface to systems that don't have LPIT instead of
> >>> introducing a new one possibly exposing the same value?
> >>>
> >>
> >> Ah; so since Raul confirmed coreboot on Chrome exports that maybe we
> >> just need to add another way to populate that sysfs file for systems
> >> without LPIT (IE AMD). I think that's a very good idea; thanks.
> >>
> >> I think we still probably want to have a way to get the total suspend
> >> time out programmatically though to compare to. So perhaps the other
> >> sysfs file I had in the RFC v2 makes sense still.
> >
> > Well there are trace points to get that (sleepgraph uses these too),
> > see Documentation/trace/events-power.rst (and you can git grep for
> > "machine_suspend" to find where this comes from).
> >
> > I guess there could be a sysfs file in addition to them, but I'm not
> > sure if the extra overhead would be worth the benefit.
>
> At least the way that I envisioned this all working was that userspace
> software that wanted to could query some sysfs files and figure out a
> percentage of time spent. If it was below a threshold users could be
> notified, or logs can be sent up to a server for analysis etc.
>
> Trace points would mean that userspace software like systemd and powerd
> would need to turn on the tracing every time to get the raw total
> numbers to do such a comparison.

Fair enough, but there are quite some considerations to be made here
regarding what exactly is included in the "total sleep time" and how
to compare that with the residency value (note: this needs to work
cross-platform).
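For illustration only, the userspace comparison described in this exchange (read a residency value and a total, compute a percentage, flag it when it falls below a threshold) might look roughly like the sketch below. It is not part of the posted series. The helper names and the threshold policy are hypothetical, and the two files it would be pointed at (last_hw_state_residency and last_suspend_total under /sys/power/suspend_stats/) are only proposed in this RFC. It also assumes both values are in microseconds and cover the same interval, which is the open question raised above.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style file.
 * Returns 0 on success, -1 if the file is missing or malformed.
 */
int read_u64_file(const char *path, uint64_t *out)
{
	FILE *f = fopen(path, "r");
	unsigned long long v;
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%llu", &v) == 1);
	fclose(f);
	if (!ok)
		return -1;
	*out = v;
	return 0;
}

/*
 * Percentage of the suspend cycle spent in the hardware sleep state.
 * Returns -1.0 when the total is zero so callers never divide by zero.
 */
double residency_percent(uint64_t residency_us, uint64_t total_us)
{
	if (total_us == 0)
		return -1.0;
	return 100.0 * (double)residency_us / (double)total_us;
}

/* Hypothetical policy: nonzero when residency is below min_percent. */
int residency_below_threshold(uint64_t residency_us, uint64_t total_us,
			      double min_percent)
{
	double pct = residency_percent(residency_us, total_us);

	return pct >= 0.0 && pct < min_percent;
}
```

A caller would read both files with read_u64_file() after resume and pass the two values to residency_below_threshold(); what threshold is sensible is a policy decision left to the consumer.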
On Tue, 2022-11-15 at 08:13 -0600, Limonciello, Mario wrote:
> On 11/15/2022 04:32, Hans de Goede wrote:
> > Hi Mario,
> >
> > On 11/14/22 20:12, Limonciello, Mario wrote:
> > > [Public]
> > >
> > > Thanks! Appreciate the comments.
> > > At least conceptually is there agreement to this idea for the two sysfs files
> > > and userspace can use them to do this comparison?
> >
> > First of all let me say that I think that having some generic mechanism
> > which allows userspace to check if deep enough sleep-state were reached
> > is a good idea. And thank you for working on this!
> >
>
> Sure!
>
> > I wonder though if it would not be better to have some mechanism
> > where a list of sleep states + time spend in each time is printed ?
> >
> > E.g. I know that on Intel Bay Trail and Cherry Trail devices (just an
> > example I'm familiar with) there are S0i0 - S0i3 and we really want
> > to reach S0i3 during suspend.
> >
> > Sometimes on S0i1 or S0i2 is reached due to some part of the hw
> > not getting suspended properly.
> >
> > So then we have reached "a hardware sleep state over s2idle"
> > but no the one we want.
>
> At least the way it's built right now it's tracking the s0ix counter for
> Intel and the s0i3 counter for AMD.
>
> BTW - when I did all the cleanups suggested in RFC v2 I notice I was
> taking the raw number for Intel, and I have that fixed for the next version.
>
> I don't know if other counters exist for Intel for various hardware states.

They do, but the implementation is highly platform specific.

> On the current AMD silicon this is the interesting metric.
>
> >
> > OTOH I can image that if we start adding support for functionality
> > like standby-connect under Linux that then we may not always
> > reach the deepest hw sleep-state.
>
> Can you elaborate what you mean by standby connect? WoWLAN?
> At least on the current AMD platforms WoWLAN can happen while the
> silicon is in the deepest hardware sleep state.
>
> >
> > So I'm a bit worried that having just a single number for
> > last_hw_state_residency is not enough.
> >
> > I think that it might be better to have a mechanism to set
> > a set of names for hw-states (once) and then set the residency
> > per state (*) after resume and have the sysfs file print
> > the entire list.
>
> > This list could then also always include the total suspend time,
> > also avoiding the need for a second sysfs file and we could also
> > use the same format for non s2idle suspend having it print
> > only the total suspend time when no hw-state names are set.
>
> So is your thought is to have a single sysfs file something like
> /sys/power/suspend_stats/s2idle_stats that would show this?
>
> state \t % \t duration (us)
> s0i3 \t 99.5% \t 1000
>
> For AMD that would be a single line and I don't think it's worth the
> extra code. I would like to know if it actually makes sense for Intel
> though.

Not here. Engineers care, but the pmc driver already provides this.
Most users are only concerned about whether their systems reach low
power idle, whichever S0ix state it is.

>
> We also need to think about what will be actionable with this
> information by consumers of it because I'm certain it will be leading to
> bug reports.

I agree. While Intel SoCs may support multiple states, it is not always
the case (particularly for Tiger Lake and newer) that you need to reach
the deepest state in order to achieve very good power savings.

David

>
> Let's think about a hypothetical bug report:
> "Intel System only spent 20% of time in deepest hardware state".
> They attach to the bug report s2idle_stats that looks like this:
>
> state \t % \t duration (us)
> s0i2 \t 80.0% \t 1000000
> s0i3 \t 20.0% \t 100000
>
> Is that any more actionable than
> /sys/power/last_hw_state_residency showing 100000
> and
> /sys/power/suspend_total showing 500000
>
> I think in either case the next action is more debugging will be needed,
> such as turning on dynamic debug or some module parameters.
>
> "Practically" I expect software like systemd or powerd to be reading
> these sysfs files.
>
> >
> > Regards,
> >
> > Hans
> >
> >
> > *) Using an array, so up to MAX_HW_RESIDENCY_STATES
> >
> >
> > >
> > > A few nested replies below, but I'll clean it up for
> > > RFC v3 or submit as PATCH v1 if there is conceptual alignment before then.
> > >
> > > > On Thu, Nov 10 2022 at 00:47, Mario Limonciello wrote:
> > > >
> > > > 'Add a sysfs files'?
> > > >
> > > > Can you please decide whether that's 'a file' or 'multiple files'?
> > >
> > > Yup thanks; bad find and replace in the commit message when I added
> > > the second file.
> > >
> > > >
> > > > > Both AMD and Intel SoCs have a concept of reporting whether the hardware
> > > > > reached a hardware sleep state over s2idle as well as how much
> > > > > time was spent in such a state.
> > > >
> > > > Nice, but ...
> > > >
> > > > > This information is valuable to both chip designers and system designers
> > > > > as it helps to identify when there are problems with power consumption
> > > > > over an s2idle cycle.
> > > > >
> > > > > To make the information discoverable, create a new sysfs file and a symbol
> > > > > that drivers from supported manufacturers can use to advertise this
> > > > > information. This file will only be exported when the system supports low
> > > > > power idle in the ACPI table.
> > > > >
> > > > > In order to effectively use this information you will ideally want to
> > > > > compare against the total duration of sleep, so export a second sysfs file
> > > > > that will show total time. This file will be exported on all systems and
> > > > > used both for s2idle and s3.
> > > >
> > > > The above is incomprehensible word salad. Can you come up with some
> > > > coherent explanation of what you are trying to achieve please?
> > > >
> > > > > +void pm_set_hw_state_residency(u64 duration)
> > > > > +{
> > > > > +	suspend_stats.last_hw_state_residency = duration;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(pm_set_hw_state_residency);
> > > > > +
> > > > > +void pm_account_suspend_type(const struct timespec64 *t)
> > > > > +{
> > > > > +	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> > > > > +					    t->tv_nsec / NSEC_PER_USEC;
> > > >
> > > > Conversion functions for timespecs to scalar nanoseconds exist for a
> > > > reason. Why does this need special treatment and open code it?
> > >
> > > Will fixup to use conversion functions.
> > >
> > > >
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(pm_account_suspend_type);
> > > >
> > > > So none of these functions has any kind of documentation. kernel-doc
> > > > exists for a reason especially for exported functions.
> > > >
> > > > That said, what's the justification to export any of these functions at
> > > > all? AFAICT pm_account_suspend_type() is only used by builtin code...
> > >
> > > I think you're right; they shouldn't export; will fix.
> > >
> > > >
> > > > > +static umode_t suspend_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
> > > > > +{
> > > > > +	if (attr != &last_hw_state_residency.attr)
> > > > > +		return 0444;
> > > > > +#ifdef CONFIG_ACPI
> > > > > +	if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)
> > > > > +		return 0444;
> > > > > +#endif
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > >  static const struct attribute_group suspend_attr_group = {
> > > > >  	.name = "suspend_stats",
> > > > >  	.attrs = suspend_attrs,
> > > > > +	.is_visible = suspend_attr_is_visible,
> > > >
> > > > How is this change related to the changelog above? We are not hiding
> > > > subtle changes to the existing code in some conglomorate patch. See
> > > > Documentation/process/...
> > >
> > > It was from feedback from RFC v1 from David Box that this file should only
> > > be visible when s2idle is supported on the hardware. Will adjust commit
> > > message to make it clearer.
> > >
> > > >
> > > > > --- a/kernel/time/timekeeping.c
> > > > > +++ b/kernel/time/timekeeping.c
> > > > > @@ -24,6 +24,7 @@
> > > > >  #include <linux/compiler.h>
> > > > >  #include <linux/audit.h>
> > > > >  #include <linux/random.h>
> > > > > +#include <linux/suspend.h>
> > > > >
> > > > >  #include "tick-internal.h"
> > > > >  #include "ntp_internal.h"
> > > > > @@ -1698,6 +1699,7 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
> > > > >  	tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *delta));
> > > > >  	tk_update_sleep_time(tk, timespec64_to_ktime(*delta));
> > > > >  	tk_debug_account_sleep_time(delta);
> > > > > +	pm_account_suspend_type(delta);
> > > >
> > > > That function name is really self explaining - NOT !
> > > >
> > > > pm_account_suspend_type(delta);
> > > >
> > > > So this will account a suspend type depending on the time spent in
> > > > suspend, right?
> > > >
> > > > It's totally obvious that the suspend type (whatever it is) depends on
> > > > the time delta argument... especially when the function at hand has
> > > > absolutely nothing to do with a type:
> > > >
> > >
> > > I fat fingered this. In my mind I thought I wrote
> > > pm_account_suspend_time()
> > > Will fix.
> > >
> > > > > +void pm_account_suspend_type(const struct timespec64 *t)
> > > > > +{
> > > > > +	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
> > > > > +					    t->tv_nsec / NSEC_PER_USEC;
> > > > > +}
> > > >
> > > > Sigh....
> > > >
> > > > Thanks,
> > > >
> > > > tglx
> > >
> > >
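As a rough illustration of the review comment about conversion helpers, the sketch below models the idea in plain userspace C: convert the timespec to scalar nanoseconds in one place, then derive microseconds from that value instead of open-coding the arithmetic at the call site. The names ts_to_ns() and ts_to_us() are hypothetical stand-ins. In the kernel the natural starting point would be the existing timespec64_to_ns() helper, and, unlike that helper, this model does no overflow clamping.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_USEC 1000LL

/* Userspace stand-in for a timespec-to-nanoseconds conversion helper. */
int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/* Microseconds derived from the nanosecond value (truncating division). */
int64_t ts_to_us(const struct timespec *ts)
{
	return ts_to_ns(ts) / NSEC_PER_USEC;
}
```

For non-negative inputs this yields the same value as the open-coded expression in the patch, since a whole number of seconds contributes an exact multiple of 1000 ns per microsecond.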
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
index f99d433ff311..5b47cbb4dc9e 100644
--- a/Documentation/ABI/testing/sysfs-power
+++ b/Documentation/ABI/testing/sysfs-power
@@ -413,6 +413,23 @@ Description:
 		The /sys/power/suspend_stats/last_failed_step file contains
 		the last failed step in the suspend/resume path.
 
+What:		/sys/power/suspend_stats/last_hw_state_residency
+Date:		December 2022
+Contact:	Mario Limonciello <mario.limonciello@amd.com>
+Description:
+		The /sys/power/suspend_stats/last_hw_state_residency file contains
+		the amount of time spent in a hardware sleep state.
+		This attribute is only available if the system supports
+		low power idle. This is measured in microseconds.
+
+What:		/sys/power/suspend_stats/last_suspend_total
+Date:		December 2022
+Contact:	Mario Limonciello <mario.limonciello@amd.com>
+Description:
+		The /sys/power/suspend_stats/last_suspend_total file contains
+		the total duration of the sleep cycle.
+		This is measured in microseconds.
+
 What:		/sys/power/sync_on_suspend
 Date:		October 2019
 Contact:	Jonas Meurer <jonas@freesources.org>
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index cfe19a028918..af343c3f8198 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -68,6 +68,8 @@ struct suspend_stats {
 	int	last_failed_errno;
 	int	errno[REC_FAILED_NUM];
 	int	last_failed_step;
+	u64	last_hw_state_residency;
+	u64	last_suspend_total;
 	enum suspend_stat_step	failed_steps[REC_FAILED_NUM];
 };
 
@@ -489,6 +491,8 @@ void restore_processor_state(void);
 extern int register_pm_notifier(struct notifier_block *nb);
 extern int unregister_pm_notifier(struct notifier_block *nb);
 extern void ksys_sync_helper(void);
+extern void pm_set_hw_state_residency(u64 duration);
+extern void pm_account_suspend_type(const struct timespec64 *t);
 
 #define pm_notifier(fn, pri) {				\
 	static struct notifier_block fn##_nb =			\
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 31ec4a9b9d70..11bd658583b0 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -6,6 +6,7 @@
  * Copyright (c) 2003 Open Source Development Lab
  */
 
+#include <linux/acpi.h>
 #include <linux/export.h>
 #include <linux/kobject.h>
 #include <linux/string.h>
@@ -54,6 +55,19 @@ void unlock_system_sleep(unsigned int flags)
 }
 EXPORT_SYMBOL_GPL(unlock_system_sleep);
 
+void pm_set_hw_state_residency(u64 duration)
+{
+	suspend_stats.last_hw_state_residency = duration;
+}
+EXPORT_SYMBOL_GPL(pm_set_hw_state_residency);
+
+void pm_account_suspend_type(const struct timespec64 *t)
+{
+	suspend_stats.last_suspend_total += (s64)t->tv_sec * USEC_PER_SEC +
+					    t->tv_nsec / NSEC_PER_USEC;
+}
+EXPORT_SYMBOL_GPL(pm_account_suspend_type);
+
 void ksys_sync_helper(void)
 {
 	ktime_t start;
@@ -377,6 +391,20 @@ static ssize_t last_failed_step_show(struct kobject *kobj,
 }
 static struct kobj_attribute last_failed_step = __ATTR_RO(last_failed_step);
 
+static ssize_t last_hw_state_residency_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%llu\n", suspend_stats.last_hw_state_residency);
+}
+static struct kobj_attribute last_hw_state_residency = __ATTR_RO(last_hw_state_residency);
+
+static ssize_t last_suspend_total_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%llu\n", suspend_stats.last_suspend_total);
+}
+static struct kobj_attribute last_suspend_total = __ATTR_RO(last_suspend_total);
+
 static struct attribute *suspend_attrs[] = {
 	&success.attr,
 	&fail.attr,
@@ -391,12 +419,26 @@ static struct attribute *suspend_attrs[] = {
 	&last_failed_dev.attr,
 	&last_failed_errno.attr,
 	&last_failed_step.attr,
+	&last_hw_state_residency.attr,
+	&last_suspend_total.attr,
 	NULL,
 };
 
+static umode_t suspend_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
+{
+	if (attr != &last_hw_state_residency.attr)
+		return 0444;
+#ifdef CONFIG_ACPI
+	if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)
+		return 0444;
+#endif
+	return 0;
+}
+
 static const struct attribute_group suspend_attr_group = {
 	.name = "suspend_stats",
 	.attrs = suspend_attrs,
+	.is_visible = suspend_attr_is_visible,
 };
 
 #ifdef CONFIG_DEBUG_FS
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index fa3bf161d13f..b6c4a3733212 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -423,6 +423,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	if (suspend_test(TEST_PLATFORM))
 		goto Platform_wake;
 
+	suspend_stats.last_suspend_total = 0;
+
 	if (state == PM_SUSPEND_TO_IDLE) {
 		s2idle_loop();
 		goto Platform_wake;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index f72b9f1de178..e1b356787e53 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -24,6 +24,7 @@
 #include <linux/compiler.h>
 #include <linux/audit.h>
 #include <linux/random.h>
+#include <linux/suspend.h>
 
 #include "tick-internal.h"
 #include "ntp_internal.h"
@@ -1698,6 +1699,7 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
 	tk_set_wall_to_mono(tk, timespec64_sub(tk->wall_to_monotonic, *delta));
 	tk_update_sleep_time(tk, timespec64_to_ktime(*delta));
 	tk_debug_account_sleep_time(delta);
+	pm_account_suspend_type(delta);
 }
 
 #if defined(CONFIG_PM_SLEEP) && defined(CONFIG_RTC_HCTOSYS_DEVICE)
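The visibility rule in suspend_attr_is_visible() can be read as a small decision function: every attribute stays world-readable except last_hw_state_residency, which is hidden unless ACPI is built in and the FADT advertises low power S0 idle. A standalone model of that rule, for reasoning about it outside the kernel, might look like the sketch below. The enum, the have_acpi parameter (standing in for CONFIG_ACPI), and the function name are hypothetical. The flag value mirrors ACPI_FADT_LOW_POWER_S0, assumed here to be bit 21 of the FADT flags as in ACPICA.

```c
#include <stdint.h>

/* Assumed to mirror ACPI_FADT_LOW_POWER_S0 (bit 21 of the FADT flags). */
#define FADT_LOW_POWER_S0 (1u << 21)

/* Hypothetical identifiers standing in for the sysfs attribute pointers. */
enum suspend_attr {
	ATTR_LAST_HW_STATE_RESIDENCY,
	ATTR_LAST_SUSPEND_TOTAL,
	ATTR_OTHER,
};

/* Returns the file mode for an attribute, or 0 to hide it. */
unsigned int suspend_attr_mode(enum suspend_attr attr, int have_acpi,
			       uint32_t fadt_flags)
{
	/* Only the residency file is conditional. */
	if (attr != ATTR_LAST_HW_STATE_RESIDENCY)
		return 0444;
	/* Shown only when the platform advertises low power S0 idle. */
	if (have_acpi && (fadt_flags & FADT_LOW_POWER_S0))
		return 0444;
	return 0;
}
```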