From patchwork Tue Apr 18 17:13:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: srinivas pandruvada X-Patchwork-Id: 84985 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp3011537vqo; Tue, 18 Apr 2023 10:19:49 -0700 (PDT) X-Google-Smtp-Source: AKy350ZHbjH1m+EsZV2a06RpIwmphI9wQQmBrXYdcmWpjZM2eEUGdo7lHq51W2vvD/Qa1XHrEceZ X-Received: by 2002:a17:902:d052:b0:1a6:e564:6046 with SMTP id l18-20020a170902d05200b001a6e5646046mr2143838pll.46.1681838389251; Tue, 18 Apr 2023 10:19:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681838389; cv=none; d=google.com; s=arc-20160816; b=LjMGEJ2GzG0cuZHnNxX8C/jJVjVcxeDg66tWPWHSwr1PsWUDxPov6sAqO1rVuW1pIL Sisv6YOwV1LN0t3/S9bMFhY2UReGbT9cb6Ss6//URjSHovouCyKQOA4EiP2UN97iehVX CmpDDVfcvLhqhXAZprcl5IgpjoT1vQypjdiv2iQfc4Eh7JSbGE1DBMMQIHGXU0moAngr MalFaSppPS0elDtJMtxfmY26sms2khgxS/AcysQbbYmFzlc6ihjBgVlpTOJaecrvhEvu PqqycrQ+1RZ1Gp14k+BZ0ueJBORUtrVAjfRykldubHJo/Xq9/LK/TJetfELq416FPOdQ Gr8g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DWIOSbXWo3BwhJ/f20qB5w9BK9U9sfvKwYL3M62zjvo=; b=L35GfDUc+egX1VknYfuUbXsaTr/nlTNTMzs95b/N9RLqBhYwbnwRRDo4A33muExqBG a+FsFq3cQeTu/8jFPseHlWCz2RdxP4hYUznMdApyDcj2MhoPKlODvOT7zZ8SOvt8M0Vk uQJKxvZVUYofE/6N1tRPrsCgefhT4mbWECmo8Fwt2OKKf5xlN+w+EHkb4+LDsNFTNLvh dWRsq3uviGm5eMKNfbZ9QzUrlzREVgzJxK192+7UaYisWq4KPmNjh/n2/KlKWzApkG28 OLm1SVvvXNSIwyAgs5n7HO4IrNaQQ02UjktEgZmZ6Pl7FDTZtX+SbnNy/jYiO6J1p1Xw 8o5Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="h9/Ri5kE"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x20-20020a170902ea9400b001a645e63b73si14087454plb.566.2023.04.18.10.19.33; Tue, 18 Apr 2023 10:19:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="h9/Ri5kE"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231352AbjDRROU (ORCPT + 99 others); Tue, 18 Apr 2023 13:14:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231268AbjDRRNs (ORCPT ); Tue, 18 Apr 2023 13:13:48 -0400 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2D4AF55BE; Tue, 18 Apr 2023 10:13:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838026; x=1713374026; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9YFHjmYFzkdk9xAaWavc4jSA7IgsJAZ8KXMJSIrPnjo=; b=h9/Ri5kEyfCi9UKu3wrbrh5GOuEVE3Vru6H1chhfs+/tA3c9fpwQomBU DaLcrofLaZxEUwJ6GrOhTkx/Nl3h+yCFK7ngZZ8JSr1vmWBCMMBoQKFHY st/uLVHd4S0KAVSxCILym18a2lNH1BASzKIc6K/wRoKue3UOWWC1A8v8N inToA7FiDOIDzZc54QYmkcWIkLABHxgTe35YRX42xV8eHWlSOBP9jZDBU 4WfkONaLdD77/D/tW77IcymAbNMbzj38CN3aPET/h+GnfaaCnNqwXkZkH hi07pC5qFUhEJjb0EHysWuNBqcZWzfxN62HkUHokl4jrFtyMzcxh/x5HP Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="347084257" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="347084257" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:13:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="755762656" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="755762656" Received: from spandruv-desk.jf.intel.com ([10.54.75.8]) by fmsmga008.fm.intel.com with ESMTP; 18 Apr 2023 10:13:43 -0700 From: Srinivas Pandruvada To: hdegoede@redhat.com, markgross@kernel.org Cc: platform-driver-x86@vger.kernel.org, linux-kernel@vger.kernel.org, Srinivas Pandruvada , Zhang Rui , Wendy Wang Subject: [PATCH v2 1/3] platform/x86/intel-uncore-freq: Uncore frequency control via TPMI Date: Tue, 18 Apr 2023 10:13:38 -0700 Message-Id: <20230418171340.681662-2-srinivas.pandruvada@linux.intel.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230418171340.681662-1-srinivas.pandruvada@linux.intel.com> References: <20230418171340.681662-1-srinivas.pandruvada@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1763535370461008068?= X-GMAIL-MSGID: =?utf-8?q?1763535370461008068?= Implement support of uncore frequency control via TPMI (Topology Aware Register and PM Capsule Interface). This driver provides the similar functionality as the current uncore frequency driver using MSRs. The hardware interface to read/write is basically substitution of MSR 0x620 and 0x621. There are specific MMIO offset and bits to get/set minimum and maximum uncore ratio, similar to MSRs. The scope of the uncore MSRs is package/die. But new generation of CPUs have more granular control at a cluster level. Each package/die can have multiple power domains, which further can have multiple clusters. The TPMI interface allows control at cluster level. The primary use case for uncore sysfs is to set maximum and minimum uncore frequency to reduce power consumption or latency. The current uncore sysfs control is per package/die. This is enough for the majority of users as workload will move to different power domains as it moves between different CPUs. The current uncore sysfs provides controls at package/die level. When user sets maximum/minimum limits, the driver sets the same limits to each cluster. Here number of power domains = number of resources in this aux device. There are offsets and bits to discover number of clusters and offset for each cluster level controls. The TPMI documentation can be downloaded from: https://github.com/intel/tpmi_power_management Signed-off-by: Srinivas Pandruvada Reviewed-by: Zhang Rui Tested-by: Wendy Wang --- v2 - Changed mmio to u8* (Hans) - Not setting pd_info->uncore_base to NULL (Hans) - Handling failure of devm_kcalloc() (Hans) - Merged init/remove to probe/remove functions (Rui) - Log when platform is NULL (Rui) .../x86/intel/uncore-frequency/Kconfig | 4 + .../x86/intel/uncore-frequency/Makefile | 2 + .../uncore-frequency/uncore-frequency-tpmi.c | 338 ++++++++++++++++++ 3 files changed, 344 insertions(+) create mode 100644 drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c diff --git a/drivers/platform/x86/intel/uncore-frequency/Kconfig b/drivers/platform/x86/intel/uncore-frequency/Kconfig index 21b209124916..a56d55056927 100644 --- a/drivers/platform/x86/intel/uncore-frequency/Kconfig +++ b/drivers/platform/x86/intel/uncore-frequency/Kconfig @@ -6,9 +6,13 @@ menu "Intel Uncore Frequency Control" depends on X86_64 || COMPILE_TEST +config INTEL_UNCORE_FREQ_CONTROL_TPMI + tristate + config INTEL_UNCORE_FREQ_CONTROL tristate "Intel Uncore frequency control driver" depends on X86_64 + select INTEL_UNCORE_FREQ_CONTROL_TPMI if INTEL_TPMI help This driver allows control of Uncore frequency limits on supported server platforms. diff --git a/drivers/platform/x86/intel/uncore-frequency/Makefile b/drivers/platform/x86/intel/uncore-frequency/Makefile index e0f7968e8285..08ff57492b28 100644 --- a/drivers/platform/x86/intel/uncore-frequency/Makefile +++ b/drivers/platform/x86/intel/uncore-frequency/Makefile @@ -7,3 +7,5 @@ obj-$(CONFIG_INTEL_UNCORE_FREQ_CONTROL) += intel-uncore-frequency.o intel-uncore-frequency-y := uncore-frequency.o obj-$(CONFIG_INTEL_UNCORE_FREQ_CONTROL) += intel-uncore-frequency-common.o intel-uncore-frequency-common-y := uncore-frequency-common.o +obj-$(CONFIG_INTEL_UNCORE_FREQ_CONTROL_TPMI) += intel-uncore-frequency-tpmi.o +intel-uncore-frequency-tpmi-y := uncore-frequency-tpmi.o diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c new file mode 100644 index 000000000000..5e454e9dd4a7 --- /dev/null +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c @@ -0,0 +1,338 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * uncore-frquency-tpmi: Uncore frequency scaling using TPMI + * + * Copyright (c) 2023, Intel Corporation. + * All Rights Reserved. + * + * The hardware interface to read/write is basically substitution of + * MSR 0x620 and 0x621. + * There are specific MMIO offset and bits to get/set minimum and + * maximum uncore ratio, similar to MSRs. + * The scope of the uncore MSRs was package scope. But TPMI allows + * new gen CPUs to have multiple uncore controls at uncore-cluster + * level. Each package can have multiple power domains which further + * can have multiple clusters. + * Here number of power domains = number of resources in this aux + * device. There are offsets and bits to discover number of clusters + * and offset for each cluster level controls. + * + */ + +#include +#include +#include +#include +#include +#include + +#include "uncore-frequency-common.h" + +#define UNCORE_HEADER_VERSION 1 +#define UNCORE_HEADER_INDEX 0 +#define UNCORE_FABRIC_CLUSTER_OFFSET 8 + +/* status + control + adv_ctl1 + adv_ctl2 */ +#define UNCORE_FABRIC_CLUSTER_SIZE (4 * 8) + +#define UNCORE_STATUS_INDEX 0 +#define UNCORE_CONTROL_INDEX 8 + +#define UNCORE_FREQ_KHZ_MULTIPLIER 100000 + +struct tpmi_uncore_struct; + +/* Information for each cluster */ +struct tpmi_uncore_cluster_info { + u8 __iomem *cluster_base; + struct uncore_data uncore_data; + struct tpmi_uncore_struct *uncore_root; +}; + +/* Information for each power domain */ +struct tpmi_uncore_power_domain_info { + u8 __iomem *uncore_base; + int ufs_header_ver; + int cluster_count; + struct tpmi_uncore_cluster_info *cluster_infos; +}; + +/* Information for all power domains in a package */ +struct tpmi_uncore_struct { + int power_domain_count; + struct tpmi_uncore_power_domain_info *pd_info; + struct tpmi_uncore_cluster_info root_cluster; +}; + +#define UNCORE_GENMASK_MIN_RATIO GENMASK_ULL(21, 15) +#define UNCORE_GENMASK_MAX_RATIO GENMASK_ULL(14, 8) + +/* Helper function to read MMIO offset for max/min control frequency */ +static void read_control_freq(struct tpmi_uncore_cluster_info *cluster_info, + unsigned int *min, unsigned int *max) +{ + u64 control; + + control = readq(cluster_info->cluster_base + UNCORE_CONTROL_INDEX); + *max = FIELD_GET(UNCORE_GENMASK_MAX_RATIO, control) * UNCORE_FREQ_KHZ_MULTIPLIER; + *min = FIELD_GET(UNCORE_GENMASK_MIN_RATIO, control) * UNCORE_FREQ_KHZ_MULTIPLIER; +} + +#define UNCORE_MAX_RATIO 0x7F + +/* Callback for sysfs read for max/min frequencies. Called under mutex locks */ +static int uncore_read_control_freq(struct uncore_data *data, unsigned int *min, + unsigned int *max) +{ + struct tpmi_uncore_cluster_info *cluster_info; + struct tpmi_uncore_struct *uncore_root; + int i, _min = 0, _max = 0; + + cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data); + uncore_root = cluster_info->uncore_root; + + *min = UNCORE_MAX_RATIO * UNCORE_FREQ_KHZ_MULTIPLIER; + *max = 0; + + /* + * Get the max/min by looking at each cluster. Get the lowest + * min and highest max. + */ + for (i = 0; i < uncore_root->power_domain_count; ++i) { + int j; + + for (j = 0; j < uncore_root->pd_info[i].cluster_count; ++j) { + read_control_freq(&uncore_root->pd_info[i].cluster_infos[j], + &_min, &_max); + if (*min > _min) + *min = _min; + if (*max < _max) + *max = _max; + } + } + + return 0; +} + +/* Helper function to write MMIO offset for max/min control frequency */ +static void write_control_freq(struct tpmi_uncore_cluster_info *cluster_info, unsigned int input, + unsigned int min_max) +{ + u64 control; + + control = readq(cluster_info->cluster_base + UNCORE_CONTROL_INDEX); + + if (min_max) { + control &= ~UNCORE_GENMASK_MAX_RATIO; + control |= FIELD_PREP(UNCORE_GENMASK_MAX_RATIO, input); + } else { + control &= ~UNCORE_GENMASK_MIN_RATIO; + control |= FIELD_PREP(UNCORE_GENMASK_MIN_RATIO, input); + } + + writeq(control, (cluster_info->cluster_base + UNCORE_CONTROL_INDEX)); +} + +/* Callback for sysfs write for max/min frequencies. Called under mutex locks */ +static int uncore_write_control_freq(struct uncore_data *data, unsigned int input, + unsigned int min_max) +{ + struct tpmi_uncore_cluster_info *cluster_info; + struct tpmi_uncore_struct *uncore_root; + int i; + + input /= UNCORE_FREQ_KHZ_MULTIPLIER; + if (!input || input > UNCORE_MAX_RATIO) + return -EINVAL; + + cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data); + uncore_root = cluster_info->uncore_root; + + /* Update each cluster in a package */ + for (i = 0; i < uncore_root->power_domain_count; ++i) { + int j; + + for (j = 0; j < uncore_root->pd_info[i].cluster_count; ++j) + write_control_freq(&uncore_root->pd_info[i].cluster_infos[j], + input, min_max); + } + + return 0; +} + +/* Callback for sysfs read for the current uncore frequency. Called under mutex locks */ +static int uncore_read_freq(struct uncore_data *data, unsigned int *freq) +{ + return -ENODATA; +} + +#define UNCORE_GENMASK_VERSION GENMASK_ULL(7, 0) +#define UNCORE_LOCAL_FABRIC_CLUSTER_ID_MASK GENMASK_ULL(15, 8) +#define UNCORE_CLUSTER_OFF_MASK GENMASK_ULL(7, 0) +#define UNCORE_MAX_CLUSTER_PER_DOMAIN 8 + +static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_device_id *id) +{ + struct intel_tpmi_plat_info *plat_info; + struct tpmi_uncore_struct *tpmi_uncore; + int ret, i, pkg = 0; + int num_resources; + + /* Get number of power domains, which is equal to number of resources */ + num_resources = tpmi_get_resource_count(auxdev); + if (!num_resources) + return -EINVAL; + + /* Register callbacks to uncore core */ + ret = uncore_freq_common_init(uncore_read_control_freq, uncore_write_control_freq, + uncore_read_freq); + if (ret) + return ret; + + /* Allocate uncore instance per package */ + tpmi_uncore = devm_kzalloc(&auxdev->dev, sizeof(*tpmi_uncore), GFP_KERNEL); + if (!tpmi_uncore) { + ret = -ENOMEM; + goto err_rem_common; + } + + /* Allocate memory for all power domains in a package */ + tpmi_uncore->pd_info = devm_kcalloc(&auxdev->dev, num_resources, + sizeof(*tpmi_uncore->pd_info), + GFP_KERNEL); + if (!tpmi_uncore->pd_info) { + ret = -ENOMEM; + goto err_rem_common; + } + + tpmi_uncore->power_domain_count = num_resources; + + /* Get the package ID from the TPMI core */ + plat_info = tpmi_get_platform_data(auxdev); + if (plat_info) + pkg = plat_info->package_id; + else + dev_info(&auxdev->dev, "Platform information is NULL\n"); + + for (i = 0; i < num_resources; ++i) { + struct tpmi_uncore_power_domain_info *pd_info; + struct resource *res; + u64 cluster_offset; + u8 cluster_mask; + int mask, j; + u64 header; + + res = tpmi_get_resource_at_index(auxdev, i); + if (!res) + continue; + + pd_info = &tpmi_uncore->pd_info[i]; + + pd_info->uncore_base = devm_ioremap_resource(&auxdev->dev, res); + if (IS_ERR(pd_info->uncore_base)) { + ret = PTR_ERR(pd_info->uncore_base); + goto err_rem_common; + } + + /* Check for version and skip this resource if there is mismatch */ + header = readq(pd_info->uncore_base); + pd_info->ufs_header_ver = header & UNCORE_GENMASK_VERSION; + if (pd_info->ufs_header_ver != UNCORE_HEADER_VERSION) { + dev_info(&auxdev->dev, "Uncore: Unsupported version:%d\n", + pd_info->ufs_header_ver); + continue; + } + + /* Get Cluster ID Mask */ + cluster_mask = FIELD_GET(UNCORE_LOCAL_FABRIC_CLUSTER_ID_MASK, header); + if (!cluster_mask) { + dev_info(&auxdev->dev, "Uncore: Invalid cluster mask:%x\n", cluster_mask); + continue; + } + + /* Find out number of clusters in this resource */ + mask = 0x01; + for (j = 0; j < UNCORE_MAX_CLUSTER_PER_DOMAIN; ++j) { + if (cluster_mask & mask) + pd_info->cluster_count++; + mask <<= 1; + } + + pd_info->cluster_infos = devm_kcalloc(&auxdev->dev, pd_info->cluster_count, + sizeof(struct tpmi_uncore_cluster_info), + GFP_KERNEL); + if (!pd_info->cluster_infos) { + ret = -ENOMEM; + goto err_rem_common; + } + /* + * Each byte in the register point to status and control + * registers belonging to cluster id 0-8. + */ + cluster_offset = readq(pd_info->uncore_base + + UNCORE_FABRIC_CLUSTER_OFFSET); + + for (j = 0; j < pd_info->cluster_count; ++j) { + struct tpmi_uncore_cluster_info *cluster_info; + + /* Get the offset for this cluster */ + mask = (cluster_offset & UNCORE_CLUSTER_OFF_MASK); + /* Offset in QWORD, so change to bytes */ + mask <<= 3; + + cluster_info = &pd_info->cluster_infos[j]; + + cluster_info->cluster_base = pd_info->uncore_base + mask; + + cluster_info->uncore_data.package_id = pkg; + /* There are no dies like Cascade Lake */ + cluster_info->uncore_data.die_id = 0; + + /* Point to next cluster offset */ + cluster_offset >>= UNCORE_MAX_CLUSTER_PER_DOMAIN; + } + } + + auxiliary_set_drvdata(auxdev, tpmi_uncore); + + tpmi_uncore->root_cluster.uncore_root = tpmi_uncore; + tpmi_uncore->root_cluster.uncore_data.package_id = pkg; + ret = uncore_freq_add_entry(&tpmi_uncore->root_cluster.uncore_data, 0); + if (ret) + goto err_rem_common; + + return 0; + +err_rem_common: + uncore_freq_common_exit(); + + return ret; +} + +static void uncore_remove(struct auxiliary_device *auxdev) +{ + struct tpmi_uncore_struct *tpmi_uncore = auxiliary_get_drvdata(auxdev); + + uncore_freq_remove_die_entry(&tpmi_uncore->root_cluster.uncore_data); + + uncore_freq_common_exit(); +} + +static const struct auxiliary_device_id intel_uncore_id_table[] = { + { .name = "intel_vsec.tpmi-uncore" }, + {} +}; +MODULE_DEVICE_TABLE(auxiliary, intel_uncore_id_table); + +static struct auxiliary_driver intel_uncore_aux_driver = { + .id_table = intel_uncore_id_table, + .remove = uncore_remove, + .probe = uncore_probe, +}; + +module_auxiliary_driver(intel_uncore_aux_driver); + +MODULE_IMPORT_NS(INTEL_TPMI); +MODULE_IMPORT_NS(INTEL_UNCORE_FREQUENCY); +MODULE_DESCRIPTION("Intel TPMI UFS Driver"); +MODULE_LICENSE("GPL"); From patchwork Tue Apr 18 17:13:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: srinivas pandruvada X-Patchwork-Id: 84983 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp3010514vqo; Tue, 18 Apr 2023 10:18:01 -0700 (PDT) X-Google-Smtp-Source: AKy350Z74IS4sENOfUaQKxHufgIc7h35THB7vocOu6tYWf++vOgZbwBxoS8i/yIkIAXf0NfIF/ds X-Received: by 2002:a17:90b:3606:b0:247:a22d:2a44 with SMTP id ml6-20020a17090b360600b00247a22d2a44mr313375pjb.36.1681838280817; Tue, 18 Apr 2023 10:18:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681838280; cv=none; d=google.com; s=arc-20160816; b=fIAaX07cqf3Ulmw3w5yC1xbup6v1kM0Pbp9FRR38Ttv4ZbqlsAsWcGG+/nAQz9Jl4i H8mqcaSuqeR1CnEdC2q2dLpx6IUxtXjja0oIF0QWHTQYA6h6L5bS07rorQGJRiEkZIN6 N0nnvl0ylJ+cCR0/zXaC2SNuGXqWwNL5CAVOTMv7R5ybgTuCmFXkF06HCgYchpX0UVKa pWxPVC61DwhqN/cxyE6BV7UkD0IKvRbJOGAaF8Rj35RkCyK8PlCuSKnF1ahz7ycBKXJe nsYY6DTMHC8DpHvTtKj2OL5cW3FRrRkGu93XM7rsB8BwpwHg7twD1MJwndJ/Nt8X7PIt ka9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=pbIZ3fb66fh0jAI9C9jsNhAoc5l1Q5+pt2t38WbUsCA=; b=pf9zmtIpcSotzA5hqmrPdXPxnm3w1+YEINRCK9jpUU2wANmXwhv63fZg31rE/Z9dim nTUtBqAJzBiPBORbZbAkWz+AGQrnUJmUH8kC60jtnV/j5AOV227rkj8uxwmw1nBKqc0f lYoV0cltalIvf+h5QhVSAcD7pkK+dII9xTlKEjz6g4F2E+ODIwjN5KGM9QdqZel7kFTm KrpeGOQQSEGULwDIswGkdwivyc4/g4QWKxeZIEWR+rf0Rx8YLaAaLOMK3dvQ3ImombQ/ syjhuNqI/J7MLHwYCicz6oTE6m++lxRyrQCcSPrJTBgeXSAFlYPcJ3EOuGxY/iXyfJ44 Ci7g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="IwQ/hJMp"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id u19-20020a170902a61300b0019c3ce49a13si14652760plq.372.2023.04.18.10.17.46; Tue, 18 Apr 2023 10:18:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="IwQ/hJMp"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232226AbjDRROP (ORCPT + 99 others); Tue, 18 Apr 2023 13:14:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53960 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231472AbjDRRNu (ORCPT ); Tue, 18 Apr 2023 13:13:50 -0400 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 530D37EFA; Tue, 18 Apr 2023 10:13:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838028; x=1713374028; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pcz0xuzvKFQ972qMSTpd7/yWgsuX10hbC8u9kalTPD0=; b=IwQ/hJMppyMnHxeUvh7grFbRXFCS86jEOTQNiLED3JyU/qBM2y/xbz/P I7a+87akASOS8ZIUdC9GCm27cH1V0v9x7uIWSlon2rObpJtFScuOiyrv1 PBBckS2eLQ5nRWABd5Rc7j53+9k+0AoahWUobEbxmbTO9R639oT2iAO1o c+nTi6+cVC7hAjgPNFCbwHXiat4pLFjj0Ri7RkC4Kk2+X1XzmFdHXnhrG mlX1o+pzUILeU6x8Rb5olp1cy5r05aSyrF6igKQYUCFTLlguhoD395dk8 ibZU35N7xk4k8ypjc7nsPJGS0esPQFgr7gTO2dOTHft0aZuri4NXTnHSA w==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="347084264" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="347084264" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:13:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="755762661" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="755762661" Received: from spandruv-desk.jf.intel.com ([10.54.75.8]) by fmsmga008.fm.intel.com with ESMTP; 18 Apr 2023 10:13:44 -0700 From: Srinivas Pandruvada To: hdegoede@redhat.com, markgross@kernel.org Cc: platform-driver-x86@vger.kernel.org, linux-kernel@vger.kernel.org, Srinivas Pandruvada , Zhang Rui , Wendy Wang Subject: [PATCH v2 2/3] platform/x86/intel-uncore-freq: Support for cluster level controls Date: Tue, 18 Apr 2023 10:13:39 -0700 Message-Id: <20230418171340.681662-3-srinivas.pandruvada@linux.intel.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230418171340.681662-1-srinivas.pandruvada@linux.intel.com> References: <20230418171340.681662-1-srinivas.pandruvada@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1763535257551182031?= X-GMAIL-MSGID: =?utf-8?q?1763535257551182031?= An SoC can contain multiple power domains with individual or collection of mesh partitions. This partition is called fabric cluster. Certain type of meshes will need to run at the same frequency, they will be placed in the same fabric cluster. Benefit of fabric cluster is that it offers a scalable mechanism to deal with partitioned fabrics in a SoC. The current sysfs interface supports control at package and die level. This interface is not enough to support more granular control at fabric cluster level. SoCs with the support of TPMI (Topology Aware Register and PM Capsule Interface), can have multiple power domains. Each power domain can contain one or more fabric clusters. To support such granular controls, enhance uncore common to optionally create new directories to provide controls at fabric cluster level. It is also important to have flexibility to change granularity for future version of SoCs. If the directory name contains scope like: "package_*_die_*_power_domain_*_cluster_*", then this is not expandable. The cpufreq policies also have different scopes. There the scope of the policy (affected_cpus) specified by attributes inside each policy. So, follow the same model for uncore frequency scaling sysfs as: "sys/devices/system/cpu/cpufreq/policy*" Allow client drivers to optionally support granular control for each fabric cluster. Here, the directory name will be "uncore" suffixed with an unique instance number. For example: uncore00, uncore01 etc. Attributes in the directory identify package id, power domain and fabric cluster id. This interface is expandable even if some new level of granularity is introduced. A new sysfs attribute can identify new level. For compatibility with the existing sysfs and provide easy way to set limits for each fabric cluster in the package/die, the existing control at package/die levels are still provided. For majority of users, this is an easy approach. For example: On a single package/die system, with three power domains and one fabric cluster per power domain: $tree -L 2 /sys/devices/system/cpu/intel_uncore_frequency/ /sys/devices/system/cpu/intel_uncore_frequency/ ├── package_00_die_00 │   ├── current_freq_khz │   ├── initial_max_freq_khz │   ├── initial_min_freq_khz │   ├── max_freq_khz │   └── min_freq_khz ├── uncore00 │   ├── current_freq_khz │   ├── domain_id │   ├── fabric_cluster_id │   ├── initial_max_freq_khz │   ├── initial_min_freq_khz │   ├── max_freq_khz │   ├── min_freq_khz │   └── package_id ├── uncore01 │   ├── current_freq_khz │   ├── domain_id │   ├── fabric_cluster_id │   ├── initial_max_freq_khz │   ├── initial_min_freq_khz │   ├── max_freq_khz │   ├── min_freq_khz │   └── package_id └── uncore02 ├── current_freq_khz ├── domain_id ├── fabric_cluster_id ├── initial_max_freq_khz ├── initial_min_freq_khz ├── max_freq_khz ├── min_freq_khz └── package_id The attribute for cluster id is "fabric_cluster_id" instead of just "cluster_id" is to avoid confusion with usage of term clusters in other part of the Linux kernel. Signed-off-by: Srinivas Pandruvada Reviewed-by: Zhang Rui Tested-by: Wendy Wang --- New patch with this series. .../pm/intel_uncore_frequency_scaling.rst | 57 ++++++++++++++++++- .../uncore-frequency-common.c | 51 ++++++++++++++++- .../uncore-frequency-common.h | 16 +++++- .../intel/uncore-frequency/uncore-frequency.c | 1 + 4 files changed, 121 insertions(+), 4 deletions(-) diff --git a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst index 09169d935835..5ab3440e6cee 100644 --- a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst +++ b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst @@ -5,7 +5,7 @@ Intel Uncore Frequency Scaling ============================== -:Copyright: |copy| 2022 Intel Corporation +:Copyright: |copy| 2022-2023 Intel Corporation :Author: Srinivas Pandruvada @@ -58,3 +58,58 @@ Each package_*_die_* contains the following attributes: ``current_freq_khz`` This attribute is used to get the current uncore frequency. + +SoCs with TPMI (Topology Aware Register and PM Capsule Interface) +----------------------------------------------------------------- + +An SoC can contain multiple power domains with individual or collection +of mesh partitions. This partition is called fabric cluster. + +Certain type of meshes will need to run at the same frequency, they will +be placed in the same fabric cluster. Benefit of fabric cluster is that it +offers a scalable mechanism to deal with partitioned fabrics in a SoC. + +The current sysfs interface supports controls at package and die level. +This interface is not enough to support more granular control at +fabric cluster level. + +SoCs with the support of TPMI (Topology Aware Register and PM Capsule +Interface), can have multiple power domains. Each power domain can +contain one or more fabric clusters. + +To represent controls at fabric cluster level in addition to the +controls at package and die level (like systems without TPMI +support), sysfs is enhanced. This granular interface is presented in the +sysfs with directories names prefixed with "uncore". For example: +uncore00, uncore01 etc. + +The scope of control is specified by attributes "package_id", "domain_id" +and "fabric_cluster_id" in the directory. + +Attributes in each directory: + +``domain_id`` + This attribute is used to get the power domain id of this instance. + +``fabric_cluster_id`` + This attribute is used to get the fabric cluster id of this instance. + +``package_id`` + This attribute is used to get the package id of this instance. + +The other attributes are same as presented at package_*_die_* level. + +In most of current use cases, the "max_freq_khz" and "min_freq_khz" +is updated at "package_*_die_*" level. This model will be still supported +with the following approach: + +When user uses controls at "package_*_die_*" level, then every fabric +cluster is affected in that package and die. For example: user changes +"max_freq_khz" in the package_00_die_00, then "max_freq_khz" for uncore* +directory with the same package id will be updated. In this case user can +still update "max_freq_khz" at each uncore* level, which is more restrictive. +Similarly, user can update "min_freq_khz" at "package_*_die_*" level +to apply at each uncore* level. + +Support for "current_freq_khz" is available only at each fabric cluster +level (i.e., in uncore* directory). diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c index fa8f14c925ec..b86e65a8ffdc 100644 --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.c @@ -16,11 +16,34 @@ static struct kobject *uncore_root_kobj; /* uncore instance count */ static int uncore_instance_count; +static DEFINE_IDA(intel_uncore_ida); + /* callbacks for actual HW read/write */ static int (*uncore_read)(struct uncore_data *data, unsigned int *min, unsigned int *max); static int (*uncore_write)(struct uncore_data *data, unsigned int input, unsigned int min_max); static int (*uncore_read_freq)(struct uncore_data *data, unsigned int *freq); +static ssize_t show_domain_id(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct uncore_data *data = container_of(attr, struct uncore_data, domain_id_dev_attr); + + return sprintf(buf, "%u\n", data->domain_id); +} + +static ssize_t show_fabric_cluster_id(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct uncore_data *data = container_of(attr, struct uncore_data, fabric_cluster_id_dev_attr); + + return sprintf(buf, "%u\n", data->cluster_id); +} + +static ssize_t show_package_id(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct uncore_data *data = container_of(attr, struct uncore_data, package_id_dev_attr); + + return sprintf(buf, "%u\n", data->package_id); +} + static ssize_t show_min_max_freq_khz(struct uncore_data *data, char *buf, int min_max) { @@ -161,6 +184,15 @@ static int create_attr_group(struct uncore_data *data, char *name) init_attribute_ro(initial_max_freq_khz); init_attribute_root_ro(current_freq_khz); + if (data->domain_id != UNCORE_DOMAIN_ID_INVALID) { + init_attribute_root_ro(domain_id); + data->uncore_attrs[index++] = &data->domain_id_dev_attr.attr; + init_attribute_root_ro(fabric_cluster_id); + data->uncore_attrs[index++] = &data->fabric_cluster_id_dev_attr.attr; + init_attribute_root_ro(package_id); + data->uncore_attrs[index++] = &data->package_id_dev_attr.attr; + } + data->uncore_attrs[index++] = &data->max_freq_khz_dev_attr.attr; data->uncore_attrs[index++] = &data->min_freq_khz_dev_attr.attr; data->uncore_attrs[index++] = &data->initial_min_freq_khz_dev_attr.attr; @@ -191,12 +223,24 @@ int uncore_freq_add_entry(struct uncore_data *data, int cpu) goto uncore_unlock; } - sprintf(data->name, "package_%02d_die_%02d", data->package_id, data->die_id); + if (data->domain_id != UNCORE_DOMAIN_ID_INVALID) { + ret = ida_alloc(&intel_uncore_ida, GFP_KERNEL); + if (ret < 0) + goto uncore_unlock; + + data->instance_id = ret; + sprintf(data->name, "uncore%02d", ret); + } else { + sprintf(data->name, "package_%02d_die_%02d", data->package_id, data->die_id); + } uncore_read(data, &data->initial_min_freq_khz, &data->initial_max_freq_khz); ret = create_attr_group(data, data->name); - if (!ret) { + if (ret) { + if (data->domain_id != UNCORE_DOMAIN_ID_INVALID) + ida_free(&intel_uncore_ida, data->instance_id); + } else { data->control_cpu = cpu; data->valid = true; } @@ -214,6 +258,9 @@ void uncore_freq_remove_die_entry(struct uncore_data *data) delete_attr_group(data, data->name); data->control_cpu = -1; data->valid = false; + if (data->domain_id != UNCORE_DOMAIN_ID_INVALID) + ida_free(&intel_uncore_ida, data->instance_id); + mutex_unlock(&uncore_lock); } EXPORT_SYMBOL_NS_GPL(uncore_freq_remove_die_entry, INTEL_UNCORE_FREQUENCY); diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h index f5dcfa2fb285..7afb69977c7e 100644 --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-common.h @@ -21,6 +21,9 @@ * @valid: Mark the data valid/invalid * @package_id: Package id for this instance * @die_id: Die id for this instance + * @domain_id: Power domain id for this instance + * @cluster_id: cluster id in a domain + * @instance_id: Unique instance id to append to directory name * @name: Sysfs entry name for this instance * @uncore_attr_group: Attribute group storage * @max_freq_khz_dev_attr: Storage for device attribute max_freq_khz @@ -28,6 +31,9 @@ * @initial_max_freq_khz_dev_attr: Storage for device attribute initial_max_freq_khz * @initial_min_freq_khz_dev_attr: Storage for device attribute initial_min_freq_khz * @current_freq_khz_dev_attr: Storage for device attribute current_freq_khz + * @domain_id_dev_attr: Storage for device attribute domain_id + * @fabric_cluster_id_dev_attr: Storage for device attribute fabric_cluster_id + * @package_id_dev_attr: Storage for device attribute package_id * @uncore_attrs: Attribute storage for group creation * * This structure is used to encapsulate all data related to uncore sysfs @@ -41,6 +47,9 @@ struct uncore_data { bool valid; int package_id; int die_id; + int domain_id; + int cluster_id; + int instance_id; char name[32]; struct attribute_group uncore_attr_group; @@ -49,9 +58,14 @@ struct uncore_data { struct device_attribute initial_max_freq_khz_dev_attr; struct device_attribute initial_min_freq_khz_dev_attr; struct device_attribute current_freq_khz_dev_attr; - struct attribute *uncore_attrs[6]; + struct device_attribute domain_id_dev_attr; + struct device_attribute fabric_cluster_id_dev_attr; + struct device_attribute package_id_dev_attr; + struct attribute *uncore_attrs[9]; }; +#define UNCORE_DOMAIN_ID_INVALID -1 + int uncore_freq_common_init(int (*read_control_freq)(struct uncore_data *data, unsigned int *min, unsigned int *max), int (*write_control_freq)(struct uncore_data *data, unsigned int input, unsigned int min_max), int (*uncore_read_freq)(struct uncore_data *data, unsigned int *freq)); diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c index 00ac7e381441..0ea13c5fbba8 100644 --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c @@ -136,6 +136,7 @@ static int uncore_event_cpu_online(unsigned int cpu) data->package_id = topology_physical_package_id(cpu); data->die_id = topology_die_id(cpu); + data->domain_id = UNCORE_DOMAIN_ID_INVALID; return uncore_freq_add_entry(data, cpu); } From patchwork Tue Apr 18 17:13:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: srinivas pandruvada X-Patchwork-Id: 84984 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp3010595vqo; Tue, 18 Apr 2023 10:18:09 -0700 (PDT) X-Google-Smtp-Source: AKy350YNSB4HIcRX2Wi8Wp6ySbN07iHtEdmRJ76HbECH1Jf+SZfkM4yx+IOfLSljIxMM+2OtzdR1 X-Received: by 2002:a17:903:1c6:b0:1a0:f8ba:ae55 with SMTP id e6-20020a17090301c600b001a0f8baae55mr2941255plh.7.1681838289355; Tue, 18 Apr 2023 10:18:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681838289; cv=none; d=google.com; s=arc-20160816; b=kyOtNRhVfLewngXSUNbgvJlf5+Ixx9kkfmefOsFF4cIH7xnsH5GIFtdirqgaBl7RwX qd0acVe/5+NaMpbwrNZyfFyoiUTVZPd9qE34jfbhZjP/Bg1TX9LyMEBz1Xmb8tipQmgP r1e/gZJQm6d9e3RRMy1DQTGTZAb58qvU3Le3vH7wc/HxtQ6C1A2/dldQ7yQN7zq0Agvb z5j+iaVgnIqkFymI5pvWKJPIKm7R1s6DVC9Sn0RYC8BYggh1dYulEC6a05ojD5wdvCAJ 1MdYKZrfZuyWUG69yIWTMNl4XZFXQZUHd4BOWRLQT0h0Fwvuw0xzDAO54T0PvzOKpHXr 8Iww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=G69a5puLoKUa3/Cbn8/N/YX0pdEzc1Kd8Wh4FudLGAI=; b=EPSA+7+84jI0/WQhv1IOmmh9hWNVGHFBygQMYZ72zvQtOuPBhw+GaB63WoH+RWIxLk Nv5D2j1oWBIgRdSo4qYaZqW1O0DJUQ1LOkZjn021S3clNdkMBN8tBaW8h7azM3+PnOmf sPJBXDa4on7Z85bf5sPkUB+QpBf+tIJVwgpvI1BIeVDDDPfA0iZl8SHHzdus/jA+5Zh1 agtr1/yTERHBnD46sWvaY49dbA6kOzFXoOOKY3zk2NIfDcfh6+smCKlZc5ZuJsgSVHrr VRr7nqSBbB6CvAFbNYiDS87Zu8VpJWVAIJaqaxGf2jHVzaEKGLLxjvPp83+QlenhQrl4 7HDg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=WhsdgoRD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e11-20020a170902ed8b00b001a647aadbe0si14137018plj.568.2023.04.18.10.17.54; Tue, 18 Apr 2023 10:18:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=WhsdgoRD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232313AbjDRROL (ORCPT + 99 others); Tue, 18 Apr 2023 13:14:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231144AbjDRRNs (ORCPT ); Tue, 18 Apr 2023 13:13:48 -0400 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E08C5FC0; Tue, 18 Apr 2023 10:13:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838026; x=1713374026; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EW/NzUatcqccaAZf4t6HInPFSpn4oc5stb1VqBGNgJw=; b=WhsdgoRDJ0FASDUeShDCiufYE3XTiTuKgA8fdhX12/5soYcZqsXc9wwn I1Kf7boUb3weMdk4uramnxwv9QPGtunTFz7563O5TZQ7gvPoSNrZlZcG4 qgXg4dI+Q/lzCYj/CNxemNkQd9sCAc5j9sAbLvLHmNgoAcjIt2s5tIXF7 s8fqghj7IbRdaK4AsbAXC3OIyAEYHugiPp0Bux1nMgAqQ4tcA1Ip8qU8f JCn2IP0criUJrr3QsDaWAhwTmNDJ3+s1n73pIbh6HhKTUy0GuKsRpTqHA 0/I4ZVoZNFhPXo33tPTK7+m6N6H2VJnmBGzmbYITr9neL3bRRMTk/ABsm Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="347084269" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="347084269" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:13:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="755762667" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="755762667" Received: from spandruv-desk.jf.intel.com ([10.54.75.8]) by fmsmga008.fm.intel.com with ESMTP; 18 Apr 2023 10:13:44 -0700 From: Srinivas Pandruvada To: hdegoede@redhat.com, markgross@kernel.org Cc: platform-driver-x86@vger.kernel.org, linux-kernel@vger.kernel.org, Srinivas Pandruvada , Zhang Rui , Wendy Wang Subject: [PATCH v2 3/3] platform/x86/intel-uncore-freq: tpmi: Provide cluster level control Date: Tue, 18 Apr 2023 10:13:40 -0700 Message-Id: <20230418171340.681662-4-srinivas.pandruvada@linux.intel.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230418171340.681662-1-srinivas.pandruvada@linux.intel.com> References: <20230418171340.681662-1-srinivas.pandruvada@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1763535265641876215?= X-GMAIL-MSGID: =?utf-8?q?1763535265641876215?= The new generation of CPUs have granular control at a cluster level. Each package/die can have multiple power domains, which further can have multiple fabric clusters. The TPMI interface allows control at fabric cluster level. Use the updated uncore sysfs feature to expose controls at cluster level. At each cluster level there is a control for maximum and minimum uncore frequency. Also present current uncore frequency at a cluster level. Signed-off-by: Srinivas Pandruvada Reviewed-by: Zhang Rui Tested-by: Wendy Wang --- New patch with this series. .../uncore-frequency/uncore-frequency-tpmi.c | 136 ++++++++++++++---- 1 file changed, 108 insertions(+), 28 deletions(-) diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c index 5e454e9dd4a7..b7f7d2a7f42c 100644 --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c @@ -44,6 +44,7 @@ struct tpmi_uncore_struct; /* Information for each cluster */ struct tpmi_uncore_cluster_info { + bool root_domain; u8 __iomem *cluster_base; struct uncore_data uncore_data; struct tpmi_uncore_struct *uncore_root; @@ -60,12 +61,15 @@ struct tpmi_uncore_power_domain_info { /* Information for all power domains in a package */ struct tpmi_uncore_struct { int power_domain_count; + int max_ratio; + int min_ratio; struct tpmi_uncore_power_domain_info *pd_info; struct tpmi_uncore_cluster_info root_cluster; }; #define UNCORE_GENMASK_MIN_RATIO GENMASK_ULL(21, 15) #define UNCORE_GENMASK_MAX_RATIO GENMASK_ULL(14, 8) +#define UNCORE_GENMASK_CURRENT_RATIO GENMASK_ULL(6, 0) /* Helper function to read MMIO offset for max/min control frequency */ static void read_control_freq(struct tpmi_uncore_cluster_info *cluster_info, @@ -85,32 +89,37 @@ static int uncore_read_control_freq(struct uncore_data *data, unsigned int *min, unsigned int *max) { struct tpmi_uncore_cluster_info *cluster_info; - struct tpmi_uncore_struct *uncore_root; - int i, _min = 0, _max = 0; cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data); - uncore_root = cluster_info->uncore_root; - *min = UNCORE_MAX_RATIO * UNCORE_FREQ_KHZ_MULTIPLIER; - *max = 0; + if (cluster_info->root_domain) { + struct tpmi_uncore_struct *uncore_root = cluster_info->uncore_root; + int i, _min = 0, _max = 0; - /* - * Get the max/min by looking at each cluster. Get the lowest - * min and highest max. - */ - for (i = 0; i < uncore_root->power_domain_count; ++i) { - int j; + *min = UNCORE_MAX_RATIO * UNCORE_FREQ_KHZ_MULTIPLIER; + *max = 0; - for (j = 0; j < uncore_root->pd_info[i].cluster_count; ++j) { - read_control_freq(&uncore_root->pd_info[i].cluster_infos[j], - &_min, &_max); - if (*min > _min) - *min = _min; - if (*max < _max) - *max = _max; + /* + * Get the max/min by looking at each cluster. Get the lowest + * min and highest max. + */ + for (i = 0; i < uncore_root->power_domain_count; ++i) { + int j; + + for (j = 0; j < uncore_root->pd_info[i].cluster_count; ++j) { + read_control_freq(&uncore_root->pd_info[i].cluster_infos[j], + &_min, &_max); + if (*min > _min) + *min = _min; + if (*max < _max) + *max = _max; + } } + return 0; } + read_control_freq(cluster_info, min, max); + return 0; } @@ -139,7 +148,6 @@ static int uncore_write_control_freq(struct uncore_data *data, unsigned int inpu { struct tpmi_uncore_cluster_info *cluster_info; struct tpmi_uncore_struct *uncore_root; - int i; input /= UNCORE_FREQ_KHZ_MULTIPLIER; if (!input || input > UNCORE_MAX_RATIO) @@ -149,21 +157,72 @@ static int uncore_write_control_freq(struct uncore_data *data, unsigned int inpu uncore_root = cluster_info->uncore_root; /* Update each cluster in a package */ - for (i = 0; i < uncore_root->power_domain_count; ++i) { - int j; + if (cluster_info->root_domain) { + struct tpmi_uncore_struct *uncore_root = cluster_info->uncore_root; + int i; + + for (i = 0; i < uncore_root->power_domain_count; ++i) { + int j; + + for (j = 0; j < uncore_root->pd_info[i].cluster_count; ++j) + write_control_freq(&uncore_root->pd_info[i].cluster_infos[j], + input, min_max); + } - for (j = 0; j < uncore_root->pd_info[i].cluster_count; ++j) - write_control_freq(&uncore_root->pd_info[i].cluster_infos[j], - input, min_max); + if (min_max) + uncore_root->max_ratio = input; + else + uncore_root->min_ratio = input; + + return 0; } + if (min_max && uncore_root->max_ratio && uncore_root->max_ratio < input) + return -EINVAL; + + if (!min_max && uncore_root->min_ratio && uncore_root->min_ratio > input) + return -EINVAL; + + write_control_freq(cluster_info, input, min_max); + return 0; } /* Callback for sysfs read for the current uncore frequency. Called under mutex locks */ static int uncore_read_freq(struct uncore_data *data, unsigned int *freq) { - return -ENODATA; + struct tpmi_uncore_cluster_info *cluster_info; + u64 status; + + cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data); + if (cluster_info->root_domain) + return -ENODATA; + + status = readq((u8 __iomem *)cluster_info->cluster_base + UNCORE_STATUS_INDEX); + *freq = FIELD_GET(UNCORE_GENMASK_CURRENT_RATIO, status) * UNCORE_FREQ_KHZ_MULTIPLIER; + + return 0; +} + +static void remove_cluster_entries(struct tpmi_uncore_struct *tpmi_uncore) +{ + int i; + + for (i = 0; i < tpmi_uncore->power_domain_count; ++i) { + struct tpmi_uncore_power_domain_info *pd_info; + int j; + + pd_info = &tpmi_uncore->pd_info[i]; + if (!pd_info->uncore_base) + continue; + + for (j = 0; j < pd_info->cluster_count; ++j) { + struct tpmi_uncore_cluster_info *cluster_info; + + cluster_info = &pd_info->cluster_infos[j]; + uncore_freq_remove_die_entry(&cluster_info->uncore_data); + } + } } #define UNCORE_GENMASK_VERSION GENMASK_ULL(7, 0) @@ -231,7 +290,13 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_ pd_info->uncore_base = devm_ioremap_resource(&auxdev->dev, res); if (IS_ERR(pd_info->uncore_base)) { ret = PTR_ERR(pd_info->uncore_base); - goto err_rem_common; + /* + * Set to NULL so that clean up can still remove other + * entries already created if any by + * remove_cluster_entries() + */ + pd_info->uncore_base = NULL; + goto remove_clusters; } /* Check for version and skip this resource if there is mismatch */ @@ -263,7 +328,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_ GFP_KERNEL); if (!pd_info->cluster_infos) { ret = -ENOMEM; - goto err_rem_common; + goto remove_clusters; } /* * Each byte in the register point to status and control @@ -287,7 +352,16 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_ cluster_info->uncore_data.package_id = pkg; /* There are no dies like Cascade Lake */ cluster_info->uncore_data.die_id = 0; + cluster_info->uncore_data.domain_id = i; + cluster_info->uncore_data.cluster_id = j; + + cluster_info->uncore_root = tpmi_uncore; + ret = uncore_freq_add_entry(&cluster_info->uncore_data, 0); + if (ret) { + cluster_info->cluster_base = NULL; + goto remove_clusters; + } /* Point to next cluster offset */ cluster_offset >>= UNCORE_MAX_CLUSTER_PER_DOMAIN; } @@ -295,14 +369,19 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_ auxiliary_set_drvdata(auxdev, tpmi_uncore); + tpmi_uncore->root_cluster.root_domain = true; tpmi_uncore->root_cluster.uncore_root = tpmi_uncore; + tpmi_uncore->root_cluster.uncore_data.package_id = pkg; + tpmi_uncore->root_cluster.uncore_data.domain_id = UNCORE_DOMAIN_ID_INVALID; ret = uncore_freq_add_entry(&tpmi_uncore->root_cluster.uncore_data, 0); if (ret) - goto err_rem_common; + goto remove_clusters; return 0; +remove_clusters: + remove_cluster_entries(tpmi_uncore); err_rem_common: uncore_freq_common_exit(); @@ -314,6 +393,7 @@ static void uncore_remove(struct auxiliary_device *auxdev) struct tpmi_uncore_struct *tpmi_uncore = auxiliary_get_drvdata(auxdev); uncore_freq_remove_die_entry(&tpmi_uncore->root_cluster.uncore_data); + remove_cluster_entries(tpmi_uncore); uncore_freq_common_exit(); }