From patchwork Mon May 22 11:30:35 2023
X-Patchwork-Submitter: "Liang, Kan"
X-Patchwork-Id: 97335
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com, alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com, Kan Liang
Subject: [PATCH V2 1/6] perf/x86/intel: Add Grand Ridge and Sierra Forest
Date: Mon, 22 May 2023 04:30:35 -0700
Message-Id: <20230522113040.2329924-1-kan.liang@linux.intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

Grand Ridge and Sierra Forest are successors to Snow Ridge. They both
have Crestmont cores. From the core PMU's perspective, they are similar
to the e-cores of MTL. The only difference is the LBR event logging
feature, which will be implemented in the following patches.

Create a non-hybrid PMU setup for Grand Ridge and Sierra Forest.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---

No changes since V1.

 arch/x86/events/intel/core.c | 52 +++++++++++++++++++++++++++++++++++-
 arch/x86/events/intel/ds.c   |  9 +++++--
 arch/x86/events/perf_event.h |  2 ++
 3 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a3fb996a86a1..ba2a971e6b8a 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2119,6 +2119,17 @@ static struct extra_reg intel_grt_extra_regs[] __read_mostly = {
 	EVENT_EXTRA_END
 };
 
+EVENT_ATTR_STR(topdown-retiring, td_retiring_cmt, "event=0x72,umask=0x0");
+EVENT_ATTR_STR(topdown-bad-spec, td_bad_spec_cmt, "event=0x73,umask=0x0");
+
+static struct attribute *cmt_events_attrs[] = {
+	EVENT_PTR(td_fe_bound_tnt),
+	EVENT_PTR(td_retiring_cmt),
+	EVENT_PTR(td_bad_spec_cmt),
+	EVENT_PTR(td_be_bound_tnt),
+	NULL
+};
+
 static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
 	/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
 	INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x800ff3ffffffffffull, RSP_0),
@@ -4830,6 +4841,8 @@ PMU_FORMAT_ATTR(ldlat, "config1:0-15");
 
 PMU_FORMAT_ATTR(frontend, "config1:0-23");
 
+PMU_FORMAT_ATTR(snoop_rsp, "config1:0-63");
+
 static struct attribute *intel_arch3_formats_attr[] = {
 	&format_attr_event.attr,
 	&format_attr_umask.attr,
@@ -4860,6 +4873,13 @@ static struct attribute *slm_format_attr[] = {
 	NULL
 };
 
+static struct attribute *cmt_format_attr[] = {
+	&format_attr_offcore_rsp.attr,
+	&format_attr_ldlat.attr,
+	&format_attr_snoop_rsp.attr,
+	NULL
+};
+
 static struct attribute *skl_format_attr[] = {
 	&format_attr_frontend.attr,
 	NULL,
@@ -5630,7 +5650,6 @@ static struct attribute *adl_hybrid_extra_attr[] = {
 	NULL
 };
 
-PMU_FORMAT_ATTR_SHOW(snoop_rsp, "config1:0-63");
 FORMAT_ATTR_HYBRID(snoop_rsp, hybrid_small);
 
 static struct attribute *mtl_hybrid_extra_attr_rtm[] = {
@@ -6178,6 +6197,37 @@ __init int intel_pmu_init(void)
 		name = "gracemont";
 		break;
 
+	case INTEL_FAM6_GRANDRIDGE:
+	case INTEL_FAM6_SIERRAFOREST_X:
+		x86_pmu.mid_ack = true;
+		memcpy(hw_cache_event_ids, glp_hw_cache_event_ids,
+		       sizeof(hw_cache_event_ids));
+		memcpy(hw_cache_extra_regs, tnt_hw_cache_extra_regs,
+		       sizeof(hw_cache_extra_regs));
+		hw_cache_event_ids[C(ITLB)][C(OP_READ)][C(RESULT_ACCESS)] = -1;
+
+		x86_pmu.event_constraints = intel_slm_event_constraints;
+		x86_pmu.pebs_constraints = intel_grt_pebs_event_constraints;
+		x86_pmu.extra_regs = intel_cmt_extra_regs;
+
+		x86_pmu.pebs_aliases = NULL;
+		x86_pmu.pebs_prec_dist = true;
+		x86_pmu.lbr_pt_coexist = true;
+		x86_pmu.pebs_block = true;
+		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
+		x86_pmu.flags |= PMU_FL_INSTR_LATENCY;
+
+		intel_pmu_pebs_data_source_cmt();
+		x86_pmu.pebs_latency_data = mtl_latency_data_small;
+		x86_pmu.get_event_constraints = cmt_get_event_constraints;
+		x86_pmu.limit_period = spr_limit_period;
+		td_attr = cmt_events_attrs;
+		mem_attr = grt_mem_attrs;
+		extra_attr = cmt_format_attr;
+		pr_cont("Crestmont events, ");
+		name = "crestmont";
+		break;
+
 	case INTEL_FAM6_WESTMERE:
 	case INTEL_FAM6_WESTMERE_EP:
 	case INTEL_FAM6_WESTMERE_EX:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a2e566e53076..608e220e46aa 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -144,7 +144,7 @@ void __init intel_pmu_pebs_data_source_adl(void)
 	__intel_pmu_pebs_data_source_grt(data_source);
 }
 
-static void __init intel_pmu_pebs_data_source_cmt(u64 *data_source)
+static void __init __intel_pmu_pebs_data_source_cmt(u64 *data_source)
 {
 	data_source[0x07] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOPX, FWD);
 	data_source[0x08] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
@@ -164,7 +164,12 @@ void __init intel_pmu_pebs_data_source_mtl(void)
 	data_source = x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX].pebs_data_source;
 	memcpy(data_source, pebs_data_source, sizeof(pebs_data_source));
-	intel_pmu_pebs_data_source_cmt(data_source);
+	__intel_pmu_pebs_data_source_cmt(data_source);
+}
+
+void __init intel_pmu_pebs_data_source_cmt(void)
+{
+	__intel_pmu_pebs_data_source_cmt(pebs_data_source);
 }
 
 static u64 precise_store_data(u64 status)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index d6de4487348c..c8ba2be7585d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1606,6 +1606,8 @@ void intel_pmu_pebs_data_source_grt(void);
 
 void intel_pmu_pebs_data_source_mtl(void);
 
+void intel_pmu_pebs_data_source_cmt(void);
+
 int intel_pmu_setup_lbr_filter(struct perf_event *event);
 
 void intel_pt_interrupt(void);

From patchwork Mon May 22 11:30:36 2023
X-Patchwork-Submitter: "Liang, Kan"
X-Patchwork-Id: 97344
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com, alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com, Kan Liang, Sandipan Das, Ravi Bangoria, Athira Rajeev
Subject: [PATCH V2 2/6] perf: Add branch stack extension
Date: Mon, 22 May 2023 04:30:36 -0700
Message-Id: <20230522113040.2329924-2-kan.liang@linux.intel.com>
In-Reply-To: <20230522113040.2329924-1-kan.liang@linux.intel.com>
References: <20230522113040.2329924-1-kan.liang@linux.intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

Currently, the extra information of a branch entry is stored in a u64
space. With more and more information added, the space is running out.
For example, the occurrences of events will be added for each branch.

Add an extension space to record the new information for each branch
entry. The space is appended after struct perf_branch_stack. Add a bit
in struct perf_branch_entry to indicate whether the extra information
is included.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Sandipan Das
Cc: Ravi Bangoria
Cc: Athira Rajeev
Reviewed-by: Sandipan Das
---

New patch
- Introduce a generic extension space which can be used to store the
  LBR event information for Intel. It can also be used by other ARCHs
  for other purposes.
- Add a new bit in struct perf_branch_entry to indicate whether the
  extra information is included.

 arch/powerpc/perf/core-book3s.c |  2 +-
 arch/x86/events/amd/core.c      |  2 +-
 arch/x86/events/intel/core.c    |  2 +-
 arch/x86/events/intel/ds.c      |  4 ++--
 include/linux/perf_event.h      | 18 +++++++++++++++++-
 include/uapi/linux/perf_event.h |  4 +++-
 kernel/events/core.c            |  5 +++++
 7 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 8c1f7def596e..3c14596bbfaf 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2313,7 +2313,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 		struct cpu_hw_events *cpuhw;
 		cpuhw = this_cpu_ptr(&cpu_hw_events);
 		power_pmu_bhrb_read(event, cpuhw);
-		perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack);
+		perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack, NULL);
 	}
 
 	if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index bccea57dee81..facee84aeecb 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -930,7 +930,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 			continue;
 
 		if (has_branch_stack(event))
-			perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+			perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ba2a971e6b8a..21566f66bfd8 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3048,7 +3048,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		perf_sample_data_init(&data, 0, event->hw.last_period);
 
 		if (has_branch_stack(event))
-			perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+			perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 608e220e46aa..3f16e95e99dd 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1747,7 +1747,7 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 	setup_pebs_time(event, data, pebs->tsc);
 
 	if (has_branch_stack(event))
-		perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
+		perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
 }
 
 static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1904,7 +1904,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 
 	if (has_branch_stack(event)) {
 		intel_pmu_store_pebs_lbrs(lbr);
-		perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
+		perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
 	}
 }
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d5628a7b5eaa..e2e04dc39199 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -126,6 +126,16 @@ struct perf_branch_stack {
 	struct perf_branch_entry	entries[];
 };
 
+/*
+ * The extension space is appended after the struct perf_branch_stack.
+ * It is used to store the extra data of each branch, e.g.,
+ * the occurrences of events since the last branch entry for Intel LBR.
+ */
+struct perf_branch_stack_ext {
+	__u64				nr;
+	__u64				data[];
+};
+
 struct task_struct;
 
 /*
@@ -1161,6 +1171,7 @@ struct perf_sample_data {
 	struct perf_callchain_entry	*callchain;
 	struct perf_raw_record		*raw;
 	struct perf_branch_stack	*br_stack;
+	struct perf_branch_stack_ext	*br_stack_ext;
 	union perf_sample_weight	weight;
 	union perf_mem_data_src		data_src;
 	u64				txn;
@@ -1237,7 +1248,8 @@ static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
 
 static inline void perf_sample_save_brstack(struct perf_sample_data *data,
 					    struct perf_event *event,
-					    struct perf_branch_stack *brs)
+					    struct perf_branch_stack *brs,
+					    struct perf_branch_stack_ext *brs_ext)
 {
 	int size = sizeof(u64); /* nr */
 
@@ -1245,7 +1257,11 @@ static inline void perf_sample_save_brstack(struct perf_sample_data *data,
 		size += sizeof(u64);
 	size += brs->nr * sizeof(struct perf_branch_entry);
 
+	if (brs_ext)
+		size += (1 + brs_ext->nr) * sizeof(u64);
+
 	data->br_stack = brs;
+	data->br_stack_ext = brs_ext;
 	data->dyn_size += size;
 	data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
 }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 37675437b768..1b3b90965b6b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1410,6 +1410,7 @@ union perf_mem_data_src {
  *     cycles: cycles from last branch (or 0 if not supported)
  *       type: branch type
  *       spec: branch speculation info (or 0 if not supported)
+ *        ext: has extension space for extra info (or 0 if not supported)
  */
 struct perf_branch_entry {
 	__u64	from;
@@ -1423,7 +1424,8 @@ struct perf_branch_entry {
 		spec:2,     /* branch speculation info */
 		new_type:4, /* additional branch type */
 		priv:3,     /* privilege level */
-		reserved:31;
+		ext:1,      /* has extension */
+		reserved:30;
 };
 
 union perf_sample_weight {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 435815d3be3f..dfd6703139a1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7324,6 +7324,11 @@ void perf_output_sample(struct perf_output_handle *handle,
 		if (branch_sample_hw_index(event))
 			perf_output_put(handle, data->br_stack->hw_idx);
 		perf_output_copy(handle, data->br_stack->entries, size);
+		if (data->br_stack_ext) {
+			size = data->br_stack_ext->nr * sizeof(u64);
+			perf_output_put(handle, data->br_stack_ext->nr);
+			perf_output_copy(handle, data->br_stack_ext->data, size);
+		}
 	} else {
 		/*
 		 * we always store at least the value of nr

From patchwork Mon May 22 11:30:37 2023
X-Patchwork-Submitter: "Liang, Kan"
X-Patchwork-Id: 97338
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com, alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com, Kan Liang
Subject: [PATCH V2 3/6] perf: Support branch events
Date: Mon, 22 May 2023 04:30:37 -0700
Message-Id: <20230522113040.2329924-3-kan.liang@linux.intel.com>
In-Reply-To: <20230522113040.2329924-1-kan.liang@linux.intel.com>
References: <20230522113040.2329924-1-kan.liang@linux.intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

With the cycle time information between branches, stalls can easily be
observed. But it's difficult to explain what causes the long delay.

Add a new branch sample type to indicate whether to include occurrences
of events in the branch info. The information will be stored in the
branch stack extension space.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---

Changes since V1:
- Rename to PERF_SAMPLE_BRANCH_EVT_CNTRS
- Drop the event ID sample type

 include/linux/perf_event.h      | 4 ++++
 include/uapi/linux/perf_event.h | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e2e04dc39199..823c6779a96d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1137,6 +1137,10 @@ static inline bool branch_sample_priv(const struct perf_event *event)
 	return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
 }
 
+static inline bool branch_sample_evt_cntrs(const struct perf_event *event)
+{
+	return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_EVT_CNTRS;
+}
 
 struct perf_sample_data {
 	/*
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1b3b90965b6b..3911cf000e8a 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
 
 	PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT	= 18, /* save privilege mode */
 
+	PERF_SAMPLE_BRANCH_EVT_CNTRS_SHIFT	= 19, /* save occurrences of events on a branch */
+
 	PERF_SAMPLE_BRANCH_MAX_SHIFT		/* non-ABI */
 };
 
@@ -235,6 +237,8 @@ enum perf_branch_sample_type {
 
 	PERF_SAMPLE_BRANCH_PRIV_SAVE	= 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
 
+	PERF_SAMPLE_BRANCH_EVT_CNTRS	= 1U << PERF_SAMPLE_BRANCH_EVT_CNTRS_SHIFT,
+
 	PERF_SAMPLE_BRANCH_MAX		= 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
 };
From patchwork Mon May 22 11:30:38 2023
X-Patchwork-Submitter: "Liang, Kan"
X-Patchwork-Id: 97325
d="scan'208";a="416356764" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 May 2023 04:31:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10717"; a="703468264" X-IronPort-AV: E=Sophos;i="6.00,184,1681196400"; d="scan'208";a="703468264" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by orsmga002.jf.intel.com with ESMTP; 22 May 2023 04:31:00 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com, alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com, Kan Liang Subject: [PATCH V2 4/6] perf/x86/intel: Support LBR event logging Date: Mon, 22 May 2023 04:30:38 -0700 Message-Id: <20230522113040.2329924-4-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20230522113040.2329924-1-kan.liang@linux.intel.com> References: <20230522113040.2329924-1-kan.liang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766594859687110515?= X-GMAIL-MSGID: =?utf-8?q?1766594859687110515?= From: Kan Liang The LBR event logging introduces a per-counter indication of precise event occurrences in LBRs. It can provide a means to attribute exposed retirement latency to combinations of events across a block of instructions. 
It also provides a means of attributing Timed LBR latencies to events.

The feature is first introduced on SRF/GRR. It is an enhancement of the
ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences
of events on the GP counters. The information is laid out in counter
order.

The design proposed in this patch requires that the logged events be in
the same group as the event that samples LBRs. If there is more than one
LBR group, only the event logging information from the current
(overflowed) group is stored for the perf tool; otherwise the perf tool
cannot know which other groups are scheduled, and when, especially once
multiplexing is triggered. The user can make the group large enough to
use the maximum number of counters that support LBR info (currently 4).

The HW only logs events in counter order, which may differ from the
enabling order that the perf tool understands. When parsing the
information of each branch entry, convert the counter order to the
enabled order, and store the enabled order in the extension space.

Unconditionally reset LBRs for an LBR event group when it's deleted. The
logged events' occurrence information is only valid for the current LBR
group. If another LBR group is scheduled later, the information from the
stale LBRs would otherwise be wrongly interpreted.

Add a sanity check in intel_pmu_hw_config(). Disable the feature if
other counter filters (inv, cmask, edge, in_tx) are set or LBR call
stack mode is enabled. (For the LBR call stack mode, we cannot simply
flush the LBR, since that would break the call stack. Also, there is no
obvious usage with the call stack mode for now.)

Add a new event kernel flag, PERF_X86_EVENT_LBR_EVENT, to indicate an
event whose occurrence information is recorded in the branch information
in the kernel. The event is marked by the PERF_SAMPLE_BRANCH_EVT_CNTRS
bit in the attr struct. Any perf sample branch type will trigger a
branch stack setup.
But the event itself doesn't require a branch stack setup. When
initializing the event, clear the PERF_SAMPLE_BRANCH_EVT_CNTRS bit to
avoid a branch stack setup.

Add a SW bit, LBR_EVENT_LOG_BIT, to indicate an LBR event logging group.
Users may not want to record the event occurrences of the event which
collects the LBRs, e.g.,
-e "{cpu/E1,branch_type=any/,cpu/E2,branch_type=event/}".
PERF_X86_EVENT_LBR_EVENT may not be set for the LBR event. When saving
the LBRs, an LBR flag is required to tell whether to store the event
occurrence information into the extension space.

Reviewed-by: Andi Kleen
Signed-off-by: Kan Liang
---
Changes since V1:
- Use the enabling order. The kernel will convert the order of counters
  to the order of enabling.
- Remove PERF_MAX_BRANCH_EVENTS. The max value should be read from the
  CPUID enumeration.
- The enabled order may be undetermined (multiple LBR groups). Only keep
  the event logging information for the current group.
- Add a new event kernel flag, PERF_X86_EVENT_LBR_EVENT, to indicate an
  event whose occurrence information is recorded in the branch
  information in the kernel.
- Add a new LBR flag, LBR_EVENT_LOG_BIT, to indicate an LBR event
  logging group.
 arch/x86/events/core.c             |   2 +-
 arch/x86/events/intel/core.c       |  41 +++++++-
 arch/x86/events/intel/ds.c         |   2 +-
 arch/x86/events/intel/lbr.c        | 162 ++++++++++++++++++++++++++++-
 arch/x86/events/perf_event.h       |  17 +++
 arch/x86/events/perf_event_flags.h |   1 +
 arch/x86/include/asm/msr-index.h   |   2 +
 arch/x86/include/asm/perf_event.h  |   4 +
 include/linux/perf_event.h         |   5 +
 9 files changed, 228 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d096b04bf80e..2f1b161cb2bc 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -603,7 +603,7 @@ int x86_pmu_hw_config(struct perf_event *event)
 		}
 	}
 
-	if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK)
+	if (branch_sample_call_stack(event))
 		event->attach_state |= PERF_ATTACH_TASK_DATA;
 
 	/*
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 21566f66bfd8..ec3939fe9098 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2788,6 +2788,7 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 
 static void intel_pmu_enable_event(struct perf_event *event)
 {
+	u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -2796,8 +2797,10 @@ static void intel_pmu_enable_event(struct perf_event *event)
 
 	switch (idx) {
 	case 0 ... INTEL_PMC_IDX_FIXED - 1:
+		if (log_event_in_branch(event))
+			enable_mask |= ARCH_PERFMON_EVENTSEL_LBR_LOG;
 		intel_set_masks(event, idx);
-		__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+		__x86_pmu_enable_event(hwc, enable_mask);
 		break;
 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
 	case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
@@ -3048,7 +3051,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		perf_sample_data_init(&data, 0, event->hw.last_period);
 
 		if (has_branch_stack(event))
-			perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
+			intel_pmu_lbr_save_brstack(&data, cpuc, event);
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
@@ -3613,6 +3616,13 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 	if (cpuc->excl_cntrs)
 		return intel_get_excl_constraints(cpuc, event, idx, c2);
 
+	/* The LBR event logging is only available for some counters. */
+	if (log_event_in_branch(event)) {
+		c2 = dyn_constraint(cpuc, c2, idx);
+		c2->idxmsk64 &= x86_pmu.lbr_events;
+		c2->weight = hweight64(c2->idxmsk64);
+	}
+
 	return c2;
 }
 
@@ -3871,6 +3881,17 @@ static inline bool intel_pmu_has_cap(struct perf_event *event, int idx)
 	return test_bit(idx, (unsigned long *)&intel_cap->capabilities);
 }
 
+static bool intel_pmu_needs_branch_stack(struct perf_event *event)
+{
+	/* No LBR setup for a counting event */
+	if (!is_sampling_event(event)) {
+		event->attr.branch_sample_type = 0;
+		return false;
+	}
+
+	return needs_branch_stack(event);
+}
+
 static int intel_pmu_hw_config(struct perf_event *event)
 {
 	int ret = x86_pmu_hw_config(event);
@@ -3898,7 +3919,19 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		x86_pmu.pebs_aliases(event);
 	}
 
-	if (needs_branch_stack(event)) {
+	if (branch_sample_evt_cntrs(event)) {
+		if (!(x86_pmu.flags & PMU_FL_LBR_EVENT) ||
+		    (event->attr.config & ~INTEL_ARCH_EVENT_MASK))
+			return -EINVAL;
+
+		ret = intel_pmu_setup_lbr_event(event);
+		if (ret)
+			return ret;
+
+		event->hw.flags |= PERF_X86_EVENT_LBR_EVENT;
+	}
+
+	if (intel_pmu_needs_branch_stack(event)) {
 		ret = intel_pmu_setup_lbr_filter(event);
 		if (ret)
 			return ret;
@@ -4549,7 +4582,7 @@ int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu)
 			goto err;
 	}
 
-	if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA)) {
+	if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA | PMU_FL_LBR_EVENT)) {
 		size_t sz = X86_PMC_IDX_MAX * sizeof(struct event_constraint);
 
 		cpuc->constraint_list = kzalloc_node(sz, GFP_KERNEL, cpu_to_node(cpu));
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 3f16e95e99dd..47c0ecbc301d 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1904,7 +1904,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 
 	if (has_branch_stack(event)) {
 		intel_pmu_store_pebs_lbrs(lbr);
-		perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
+		intel_pmu_lbr_save_brstack(data, cpuc, event);
 	}
 }
 
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index c3b0d15a9841..6ee9d9e88586 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -29,7 +29,13 @@
  * the actual MSR. But it helps the constraint perf code to understand
  * that this is a separate configuration.
  */
-#define LBR_NO_INFO_BIT		63 /* don't read LBR_INFO. */
+#define LBR_NO_INFO_BIT		63 /* don't read LBR_INFO. */
+/*
+ * Indicate an LBR event logging group.
+ * The event logging feature is only available for the ARCH LBR,
+ * while NO INFO is only applied to the legacy LBR. Reuse the bit.
+ */
+#define LBR_EVENT_LOG_BIT	63
 
 #define LBR_KERNEL	(1 << LBR_KERNEL_BIT)
 #define LBR_USER	(1 << LBR_USER_BIT)
@@ -42,6 +48,7 @@
 #define LBR_FAR		(1 << LBR_FAR_BIT)
 #define LBR_CALL_STACK	(1 << LBR_CALL_STACK_BIT)
 #define LBR_NO_INFO	(1ULL << LBR_NO_INFO_BIT)
+#define LBR_EVENT_LOG	(1ULL << LBR_EVENT_LOG_BIT)
 
 #define LBR_PLM (LBR_KERNEL | LBR_USER)
 
@@ -676,6 +683,21 @@ void intel_pmu_lbr_del(struct perf_event *event)
 	WARN_ON_ONCE(cpuc->lbr_users < 0);
 	WARN_ON_ONCE(cpuc->lbr_pebs_users < 0);
 	perf_sched_cb_dec(event->pmu);
+
+	/*
+	 * The logged occurrence information is only valid for the
+	 * current LBR group. If another LBR group is scheduled in
+	 * later, the information from the stale LBRs would be wrongly
+	 * interpreted. Reset the LBRs here.
+	 * On a context switch, the LBRs are unconditionally flushed
+	 * when a new task is scheduled in. If both the new task and
+	 * the old task are monitored by an LBR event group, the reset
+	 * here is redundant, but the extra reset doesn't impact the
+	 * functionality. It's hard to distinguish the above case.
+	 * Keep the unconditional reset for an LBR event group for now.
+	 */
+	if (intel_pmu_lbr_has_event_log(cpuc))
+		intel_pmu_lbr_reset();
 }
 
 static inline bool vlbr_exclude_host(void)
@@ -866,6 +888,18 @@ static __always_inline u16 get_lbr_cycles(u64 info)
 	return cycles;
 }
 
+static __always_inline void get_lbr_events(struct cpu_hw_events *cpuc,
+					   int i, u64 info)
+{
+	/*
+	 * The later code decides what content can be disclosed to the
+	 * perf tool. It's not harmful to unconditionally update
+	 * cpuc->lbr_events.
+	 * Please see intel_pmu_lbr_event_reorder().
+	 */
+	cpuc->lbr_events[i] = info & LBR_INFO_EVENTS;
+}
+
 static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
 				struct lbr_entry *entries)
 {
@@ -898,11 +932,73 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
 		e->abort	= !!(info & LBR_INFO_ABORT);
 		e->cycles	= get_lbr_cycles(info);
 		e->type		= get_lbr_br_type(info);
+
+		get_lbr_events(cpuc, i, info);
 	}
 
 	cpuc->lbr_stack.nr = i;
 }
 
+#define ARCH_LBR_EVENT_LOG_WIDTH	2
+#define ARCH_LBR_EVENT_LOG_MASK		0x3
+
+static __always_inline void intel_pmu_update_lbr_event(u64 *lbr_events, int idx, int pos)
+{
+	u64 logs = *lbr_events >> (LBR_INFO_EVENTS_OFFSET +
+				   idx * ARCH_LBR_EVENT_LOG_WIDTH);
+
+	logs &= ARCH_LBR_EVENT_LOG_MASK;
+	*lbr_events |= logs << (pos * ARCH_LBR_EVENT_LOG_WIDTH);
+}
+
+/*
+ * The enabled order may be different from the counter order.
+ * Update the lbr_events with the enabled order.
+ */
+static void intel_pmu_lbr_event_reorder(struct cpu_hw_events *cpuc,
+					struct perf_event *event)
+{
+	int i, j, pos = 0, enabled[X86_PMC_IDX_MAX];
+	struct perf_event *leader, *sibling;
+
+	leader = event->group_leader;
+	if (log_event_in_branch(leader))
+		enabled[pos++] = leader->hw.idx;
+
+	for_each_sibling_event(sibling, leader) {
+		if (!log_event_in_branch(sibling))
+			continue;
+		enabled[pos++] = sibling->hw.idx;
+	}
+
+	if (!pos) {
+		cpuc->lbr_stack_ext.nr = 0;
+		return;
+	}
+
+	cpuc->lbr_stack_ext.nr = cpuc->lbr_stack.nr;
+	for (i = 0; i < cpuc->lbr_stack_ext.nr; i++) {
+		cpuc->lbr_entries[i].ext = true;
+
+		for (j = 0; j < pos; j++)
+			intel_pmu_update_lbr_event(&cpuc->lbr_events[i], enabled[j], j);
+
+		/* Clear the original counter order */
+		cpuc->lbr_events[i] &= ~LBR_INFO_EVENTS;
+	}
+}
+
+void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
+				struct cpu_hw_events *cpuc,
+				struct perf_event *event)
+{
+	if (!intel_pmu_lbr_has_event_log(cpuc)) {
+		perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
+		return;
+	}
+
+	intel_pmu_lbr_event_reorder(cpuc, event);
+
+	perf_sample_save_brstack(data, event, &cpuc->lbr_stack, &cpuc->lbr_stack_ext);
+}
+
 static void intel_pmu_arch_lbr_read(struct cpu_hw_events *cpuc)
 {
 	intel_pmu_store_lbr(cpuc, NULL);
@@ -1045,6 +1141,10 @@ static int intel_pmu_setup_hw_lbr_filter(struct perf_event *event)
 	 * Enable the branch type by default for the Arch LBR.
 	 */
 	reg->reg |= X86_BR_TYPE_SAVE;
+
+	if (log_event_in_branch(event))
+		reg->config |= LBR_EVENT_LOG;
+
 	return 0;
 }
 
@@ -1091,6 +1191,54 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
 	return ret;
 }
 
+bool intel_pmu_lbr_has_event_log(struct cpu_hw_events *cpuc)
+{
+	return cpuc->lbr_sel && (cpuc->lbr_sel->config & LBR_EVENT_LOG);
+}
+
+int intel_pmu_setup_lbr_event(struct perf_event *event)
+{
+	struct perf_event *leader, *sibling;
+
+	/*
+	 * The event logging is not supported in the call stack mode
+	 * yet, since we cannot simply flush the LBR during e.g.,
+	 * multiplexing. Also, there is no obvious usage with the call
+	 * stack mode. Simply forbid it for now.
+	 *
+	 * If any event in the group enables the LBR event logging
+	 * feature, mark it as an LBR event logging group.
+	 */
+	leader = event->group_leader;
+	if (branch_sample_call_stack(leader))
+		return -EINVAL;
+	if (leader->hw.branch_reg.idx == EXTRA_REG_LBR)
+		leader->hw.branch_reg.config |= LBR_EVENT_LOG;
+
+	for_each_sibling_event(sibling, leader) {
+		if (branch_sample_call_stack(sibling))
+			return -EINVAL;
+		if (sibling->hw.branch_reg.idx == EXTRA_REG_LBR)
+			sibling->hw.branch_reg.config |= LBR_EVENT_LOG;
+	}
+
+	/*
+	 * The PERF_SAMPLE_BRANCH_EVT_CNTRS bit is used to mark an
+	 * event whose occurrence information should be recorded in
+	 * the branch information.
+	 * Applying only PERF_SAMPLE_BRANCH_EVT_CNTRS doesn't require
+	 * any branch stack setup. Clear the bit to avoid a branch
+	 * stack setup.
+	 */
+	if (event->attr.branch_sample_type &
+	    ~(PERF_SAMPLE_BRANCH_EVT_CNTRS | PERF_SAMPLE_BRANCH_PLM_ALL))
+		event->attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_EVT_CNTRS;
+	else
+		event->attr.branch_sample_type = 0;
+
+	return 0;
+}
+
 enum {
 	ARCH_LBR_BR_TYPE_JCC			= 0,
 	ARCH_LBR_BR_TYPE_NEAR_IND_JMP		= 1,
@@ -1173,14 +1321,20 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
 	for (i = 0; i < cpuc->lbr_stack.nr; ) {
 		if (!cpuc->lbr_entries[i].from) {
 			j = i;
-			while (++j < cpuc->lbr_stack.nr)
+			while (++j < cpuc->lbr_stack.nr) {
 				cpuc->lbr_entries[j-1] = cpuc->lbr_entries[j];
+				if (cpuc->lbr_stack_ext.nr)
+					cpuc->lbr_events[j-1] = cpuc->lbr_events[j];
+			}
 			cpuc->lbr_stack.nr--;
 			if (!cpuc->lbr_entries[i].from)
 				continue;
 		}
 		i++;
 	}
+
+	if (cpuc->lbr_stack_ext.nr)
+		cpuc->lbr_stack_ext.nr = cpuc->lbr_stack.nr;
 }
 
 void intel_pmu_store_pebs_lbrs(struct lbr_entry *lbr)
@@ -1525,8 +1679,12 @@ void __init intel_pmu_arch_lbr_init(void)
 	x86_pmu.lbr_mispred = ecx.split.lbr_mispred;
 	x86_pmu.lbr_timed_lbr = ecx.split.lbr_timed_lbr;
 	x86_pmu.lbr_br_type = ecx.split.lbr_br_type;
+	x86_pmu.lbr_events = ecx.split.lbr_events;
 	x86_pmu.lbr_nr = lbr_nr;
 
+	if (!!x86_pmu.lbr_events)
+		x86_pmu.flags |= PMU_FL_LBR_EVENT;
+
 	if (x86_pmu.lbr_mispred)
 		static_branch_enable(&x86_lbr_mispred);
 	if (x86_pmu.lbr_timed_lbr)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index c8ba2be7585d..70207ac8e193 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -283,6 +283,8 @@ struct cpu_hw_events {
 	int				lbr_pebs_users;
 	struct perf_branch_stack	lbr_stack;
 	struct perf_branch_entry	lbr_entries[MAX_LBR_ENTRIES];
+	struct perf_branch_stack_ext	lbr_stack_ext;
+	u64				lbr_events[MAX_LBR_ENTRIES];
 	union {
 		struct er_account		*lbr_sel;
 		struct er_account		*lbr_ctl;
@@ -881,6 +883,7 @@ struct x86_pmu {
 	unsigned int	lbr_mispred:1;
 	unsigned int	lbr_timed_lbr:1;
 	unsigned int	lbr_br_type:1;
+	unsigned int	lbr_events:4;
 
 	void		(*lbr_reset)(void);
 	void		(*lbr_read)(struct cpu_hw_events *cpuc);
@@ -1005,6 +1008,7 @@ do {								\
 #define PMU_FL_INSTR_LATENCY	0x80 /* Support Instruction Latency in PEBS Memory Info Record */
 #define PMU_FL_MEM_LOADS_AUX	0x100 /* Require an auxiliary event for the complete memory info */
 #define PMU_FL_RETIRE_LATENCY	0x200 /* Support Retire Latency in PEBS */
+#define PMU_FL_LBR_EVENT	0x400 /* Support LBR event logging */
 
 #define EVENT_VAR(_id)  event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr
@@ -1457,6 +1461,11 @@ static __always_inline void __intel_pmu_lbr_disable(void)
 	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
 }
 
+static __always_inline bool log_event_in_branch(struct perf_event *event)
+{
+	return event->hw.flags & PERF_X86_EVENT_LBR_EVENT;
+}
+
 int intel_pmu_save_and_restart(struct perf_event *event);
 
 struct event_constraint *
@@ -1545,6 +1554,14 @@ void intel_pmu_store_pebs_lbrs(struct lbr_entry *lbr);
 
 void intel_ds_init(void);
 
+bool intel_pmu_lbr_has_event_log(struct cpu_hw_events *cpuc);
+
+void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
+				struct cpu_hw_events *cpuc,
+				struct perf_event *event);
+
+int intel_pmu_setup_lbr_event(struct perf_event *event);
+
 void intel_pmu_lbr_swap_task_ctx(struct perf_event_pmu_context *prev_epc,
 				 struct perf_event_pmu_context *next_epc);
 
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index 1dc19b9b4426..03e27af123af 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -20,3 +20,4 @@ PERF_ARCH(TOPDOWN, 0x04000) /* Count Topdown slots/metrics events */
 PERF_ARCH(PEBS_STLAT,		0x08000) /* st+stlat data address sampling */
 PERF_ARCH(AMD_BRS,		0x10000) /* AMD Branch Sampling */
 PERF_ARCH(PEBS_LAT_HYBRID,	0x20000) /* ld and st lat for hybrid */
+PERF_ARCH(LBR_EVENT,		0x40000) /* log the event in LBR */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ad35355ee43e..b845eeb527ef 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -222,6 +222,8 @@
 #define LBR_INFO_CYCLES			0xffff
 #define LBR_INFO_BR_TYPE_OFFSET		56
 #define LBR_INFO_BR_TYPE		(0xfull << LBR_INFO_BR_TYPE_OFFSET)
+#define LBR_INFO_EVENTS_OFFSET		32
+#define LBR_INFO_EVENTS			(0xffull << LBR_INFO_EVENTS_OFFSET)
 
 #define MSR_ARCH_LBR_CTL		0x000014ce
 #define ARCH_LBR_CTL_LBREN		BIT(0)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 8fc15ed5e60b..2ae60c378e3a 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -31,6 +31,7 @@
 #define ARCH_PERFMON_EVENTSEL_ENABLE	(1ULL << 22)
 #define ARCH_PERFMON_EVENTSEL_INV	(1ULL << 23)
 #define ARCH_PERFMON_EVENTSEL_CMASK	0xFF000000ULL
+#define ARCH_PERFMON_EVENTSEL_LBR_LOG	(1ULL << 35)
 
 #define HSW_IN_TX			(1ULL << 32)
 #define HSW_IN_TX_CHECKPOINTED		(1ULL << 33)
@@ -203,6 +204,9 @@ union cpuid28_ecx {
 		unsigned int	lbr_timed_lbr:1;
 		/* Branch Type Field Supported */
 		unsigned int	lbr_br_type:1;
+		unsigned int	reserved:13;
+		/* Event Logging Supported */
+		unsigned int	lbr_events:4;
 	} split;
 	unsigned int		full;
 };
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 823c6779a96d..618c0d8ce88f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1142,6 +1142,11 @@ static inline bool branch_sample_evt_cntrs(const struct perf_event *event)
 	return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_EVT_CNTRS;
 }
 
+static inline bool branch_sample_call_stack(const struct perf_event *event)
+{
+	return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
+}
+
 struct perf_sample_data {
 	/*
 	 * Fields set by perf_sample_data_init() unconditionally,

From patchwork Mon May 22 11:30:39 2023
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
    jolsa@kernel.org, namhyung@kernel.org, irogers@google.com,
    adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com,
    alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com,
    Kan Liang
Subject: [PATCH V2 5/6] tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
Date: Mon, 22 May 2023 04:30:39 -0700
Message-Id: <20230522113040.2329924-5-kan.liang@linux.intel.com>
In-Reply-To: <20230522113040.2329924-1-kan.liang@linux.intel.com>
References: <20230522113040.2329924-1-kan.liang@linux.intel.com>

From: Kan Liang

Sync the new sample type and extension bit for the branch event feature.
Reviewed-by: Andi Kleen
Signed-off-by: Kan Liang
---
Changes since V1:
- Rename to PERF_SAMPLE_BRANCH_EVT_CNTRS
- Drop the event ID sample type
- Add extension bit

 tools/include/uapi/linux/perf_event.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 37675437b768..3911cf000e8a 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
 
 	PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT	= 18, /* save privilege mode */
 
+	PERF_SAMPLE_BRANCH_EVT_CNTRS_SHIFT	= 19, /* save occurrences of events on a branch */
+
 	PERF_SAMPLE_BRANCH_MAX_SHIFT		/* non-ABI */
 };
 
@@ -235,6 +237,8 @@ enum perf_branch_sample_type {
 
 	PERF_SAMPLE_BRANCH_PRIV_SAVE	= 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
 
+	PERF_SAMPLE_BRANCH_EVT_CNTRS	= 1U << PERF_SAMPLE_BRANCH_EVT_CNTRS_SHIFT,
+
 	PERF_SAMPLE_BRANCH_MAX	= 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
 };
 
@@ -1410,6 +1414,7 @@ union perf_mem_data_src {
  *     cycles: cycles from last branch (or 0 if not supported)
  *       type: branch type
  *       spec: branch speculation info (or 0 if not supported)
+ *        ext: has extension space for extra info (or 0 if not supported)
  */
 struct perf_branch_entry {
 	__u64	from;
@@ -1423,7 +1428,8 @@ struct perf_branch_entry {
 		spec:2,		/* branch speculation info */
 		new_type:4,	/* additional branch type */
 		priv:3,		/* privilege level */
-		reserved:31;
+		ext:1,		/* has extension */
+		reserved:30;
 	};
 
 	union perf_sample_weight {

From patchwork Mon May 22 11:30:40 2023
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
    jolsa@kernel.org, namhyung@kernel.org, irogers@google.com,
    adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com,
    alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com,
    Kan Liang
Subject: [PATCH V2 6/6] perf tools: Add branch event knob
Date: Mon, 22 May 2023 04:30:40 -0700
Message-Id: <20230522113040.2329924-6-kan.liang@linux.intel.com>
In-Reply-To: <20230522113040.2329924-1-kan.liang@linux.intel.com>
References: <20230522113040.2329924-1-kan.liang@linux.intel.com>

From: Kan Liang

Add a new branch filter, "event", for the branch event option. It is
used to mark the events which should be logged in the branch. If it is
applied with the -j option, all the events are logged in the branch. If
a legacy kernel doesn't support the new branch sample type, the branch
event filter is switched off.
The new extension space of each branch is dumped right after the regular
branch stack information via perf report -D.

Usage examples:

 perf record -e "{branch-instructions,branch-misses}:S" -j any,event

Only the first event, branch-instructions, collects the LBR. Both
branch-instructions and branch-misses are marked as logged events. Their
occurrence counts can be found in the branch stack extension space of
each branch.

 perf record -e "{cpu/branch-instructions,branch_type=any/,
 cpu/branch-misses,branch_type=event/}"

Only the first event, branch-instructions, collects the LBR. Only the
branch-misses event is marked as a logged event.

Reviewed-by: Andi Kleen
Signed-off-by: Kan Liang
---
Notes:
    Since the new interfaces are still under review and may change later,
    the perf tool patch only provides minimum support for the current
    version. Once the interfaces are finalized, a more complete perf tool
    patch can be expected.

Changes since V1:
- Drop the support of the event ID sample type
- Support the new branch stack extension

 tools/perf/Documentation/perf-record.txt  |  4 +++
 tools/perf/util/branch.h                  |  8 ++++-
 tools/perf/util/evsel.c                   | 39 ++++++++++++++++++++---
 tools/perf/util/evsel.h                   |  6 ++++
 tools/perf/util/parse-branch-options.c    |  1 +
 tools/perf/util/perf_event_attr_fprintf.c |  1 +
 tools/perf/util/sample.h                  |  1 +
 tools/perf/util/session.c                 |  8 +++++
 8 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index ff815c2f67e8..9183d9c414de 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -402,6 +402,10 @@ following filters are defined:
 	  4th-Gen Xeon+ server), the save branch type is unconditionally enabled
 	  when the taken branch stack sampling is enabled.
 	- priv: save privilege state during sampling in case binary is not available later
+	- event: save occurrences of the event since the last branch entry. Currently, the
+	  feature is only supported by a newer CPU, e.g., Intel Sierra Forest and
+	  later platforms. An error is expected if it's used on an unsupported
+	  kernel or CPU.
+
 The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.

diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
index e41bfffe2217..f765b05bbe5f 100644
--- a/tools/perf/util/branch.h
+++ b/tools/perf/util/branch.h
@@ -25,7 +25,8 @@ struct branch_flags {
 			u64 spec:2;
 			u64 new_type:4;
 			u64 priv:3;
-			u64 reserved:31;
+			u64 ext:1;
+			u64 reserved:30;
 		};
 	};
 };
@@ -50,6 +51,11 @@ struct branch_stack {
 	struct branch_entry	entries[];
 };
 
+struct branch_stack_ext {
+	u64	nr;
+	u64	data[];
+};
+
 /*
  * The hw_idx is only available when PERF_SAMPLE_BRANCH_HW_INDEX is applied.
  * Otherwise, the output format of a sample with branch stack is

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 51e8ce6edddc..19cc9272b669 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1850,6 +1850,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 static void evsel__disable_missing_features(struct evsel *evsel)
 {
+	if (perf_missing_features.branch_event)
+		evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_EVT_CNTRS;
 	if (perf_missing_features.read_lost)
 		evsel->core.attr.read_format &= ~PERF_FORMAT_LOST;
 	if (perf_missing_features.weight_struct) {
@@ -1903,7 +1905,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
 	 * Must probe features in the order they were added to the
 	 * perf_event_attr interface.
 	 */
-	if (!perf_missing_features.read_lost &&
+	if (!perf_missing_features.branch_event &&
+	    (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_EVT_CNTRS)) {
+		perf_missing_features.branch_event = true;
+		pr_debug2("switching off branch event support\n");
+		return true;
+	} else if (!perf_missing_features.read_lost &&
 	    (evsel->core.attr.read_format & PERF_FORMAT_LOST)) {
 		perf_missing_features.read_lost = true;
 		pr_debug2("switching off PERF_FORMAT_LOST support\n");
@@ -2339,7 +2346,8 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
 		new_val |= bitfield_swap(value, 24, 2);
 		new_val |= bitfield_swap(value, 26, 4);
 		new_val |= bitfield_swap(value, 30, 3);
-		new_val |= bitfield_swap(value, 33, 31);
+		new_val |= bitfield_swap(value, 33, 1);
+		new_val |= bitfield_swap(value, 34, 30);
 	} else {
 		new_val = bitfield_swap(value, 63, 1);
 		new_val |= bitfield_swap(value, 62, 1);
@@ -2350,7 +2358,8 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
 		new_val |= bitfield_swap(value, 38, 2);
 		new_val |= bitfield_swap(value, 34, 4);
 		new_val |= bitfield_swap(value, 31, 3);
-		new_val |= bitfield_swap(value, 0, 31);
+		new_val |= bitfield_swap(value, 30, 1);
+		new_val |= bitfield_swap(value, 0, 30);
 	}
 
 	return new_val;
@@ -2550,7 +2559,8 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 	if (type & PERF_SAMPLE_BRANCH_STACK) {
 		const u64 max_branch_nr = UINT64_MAX /
 					  sizeof(struct branch_entry);
-		struct branch_entry *e;
+		struct branch_entry *e, *e0;
+		bool has_ext = false;
 		unsigned int i;
 
 		OVERFLOW_CHECK_u64(array);
@@ -2571,7 +2581,7 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 			 */
 			e = (struct branch_entry *)&data->branch_stack->hw_idx;
 		}
-
+		e0 = e;
 		if (swapped) {
 			/*
 			 * struct branch_flag does not have endian
@@ -2589,6 +2599,25 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 		OVERFLOW_CHECK(array, sz, max_size);
 		array = (void *)array + sz;
+
+		for (i = 0, e = e0; i < data->branch_stack->nr; i++, e++) {
+			if (e->flags.ext) {
+				has_ext = true;
+				break;
+			}
+		}
+
+		if (has_ext) {
+			OVERFLOW_CHECK_u64(array);
+
+			data->branch_stack_ext = (struct branch_stack_ext *)array++;
+			if (data->branch_stack_ext->nr > max_branch_nr)
+				return -EFAULT;
+			sz = data->branch_stack_ext->nr * sizeof(u64);
+
+			OVERFLOW_CHECK(array, sz, max_size);
+			array = (void *)array + sz;
+		}
 	}
 
 	if (type & PERF_SAMPLE_REGS_USER) {

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 24cb807ef6ce..aa666e24f8e6 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -187,6 +187,7 @@ struct perf_missing_features {
 	bool code_page_size;
 	bool weight_struct;
 	bool read_lost;
+	bool branch_event;
 };
 
 extern struct perf_missing_features perf_missing_features;
@@ -473,6 +474,11 @@ static inline bool evsel__has_branch_hw_idx(const struct evsel *evsel)
 	return evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
 }
 
+static inline bool evsel__has_branch_evt_cntrs(const struct evsel *evsel)
+{
+	return evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_EVT_CNTRS;
+}
+
 static inline bool evsel__has_callchain(const struct evsel *evsel)
 {
 	/*

diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
index fd67d204d720..ab5d6dabe659 100644
--- a/tools/perf/util/parse-branch-options.c
+++ b/tools/perf/util/parse-branch-options.c
@@ -36,6 +36,7 @@ static const struct branch_mode branch_modes[] = {
 	BRANCH_OPT("stack", PERF_SAMPLE_BRANCH_CALL_STACK),
 	BRANCH_OPT("hw_index", PERF_SAMPLE_BRANCH_HW_INDEX),
 	BRANCH_OPT("priv", PERF_SAMPLE_BRANCH_PRIV_SAVE),
+	BRANCH_OPT("event", PERF_SAMPLE_BRANCH_EVT_CNTRS),
 	BRANCH_END
 };
 
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 7e5e7b30510d..3133a4f003eb 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -53,6 +53,7 @@ static void __p_branch_sample_type(char *buf, size_t size, u64 value)
 		bit_name(COND), bit_name(CALL_STACK), bit_name(IND_JUMP),
 		bit_name(CALL), bit_name(NO_FLAGS), bit_name(NO_CYCLES),
 		bit_name(TYPE_SAVE), bit_name(HW_INDEX), bit_name(PRIV_SAVE),
+		bit_name(EVT_CNTRS),
 		{ .name = NULL, }
 	};
 #undef bit_name

diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
index 33b08e0ac746..62abae1c9cd3 100644
--- a/tools/perf/util/sample.h
+++ b/tools/perf/util/sample.h
@@ -101,6 +101,7 @@ struct perf_sample {
 	void *raw_data;
 	struct ip_callchain *callchain;
 	struct branch_stack *branch_stack;
+	struct branch_stack_ext *branch_stack_ext;
 	struct regs_dump user_regs;
 	struct regs_dump intr_regs;
 	struct stack_dump user_stack;

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 749d5b5c135b..a1e303c2eaa8 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1159,6 +1159,7 @@ static void callchain__printf(struct evsel *evsel,
 static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 {
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
+	struct branch_stack_ext *branch_stack_ext = sample->branch_stack_ext;
 	uint64_t i;
 
 	if (!callstack) {
@@ -1200,6 +1201,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 			}
 		}
 	}
+
+	if (branch_stack_ext) {
+		printf("... branch stack ext: nr:%" PRIu64 "\n", sample->branch_stack_ext->nr);
+		for (i = 0; i < branch_stack_ext->nr; i++) {
+			printf("..... %2"PRIu64": %016" PRIx64 "\n", i, branch_stack_ext->data[i]);
+		}
+	}
 }
 
 static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)