Message ID | 20230623182204.25199-2-tanmay@marvell.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp5972992vqr; Fri, 23 Jun 2023 11:46:50 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5mZ2uxuRy0XABT3gKAaIRpgf0RxJ7vWY5XdgVfemWnguFSU73dGfmes7BrNVldDnpr5KYS X-Received: by 2002:a17:902:dace:b0:1b6:783d:9ba7 with SMTP id q14-20020a170902dace00b001b6783d9ba7mr19364380plx.27.1687546010560; Fri, 23 Jun 2023 11:46:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687546010; cv=none; d=google.com; s=arc-20160816; b=x+AtK6AwZTqrEDFYHFVTBRPh0hu56RDa4KuE8sRA5bXsXanh3S9ZRCwe2snXiy00B2 DGHYooKX/9rLMkgwCqSQnMajxIv8rBgT4qZUkKLZIvysJoc3/3Ylj7GLd+/Svupl3hrj b0sQ41fGoU35/Ib4SJHRttQ+Y0rHWG8g/K/vSqjLpkp8psUR4TK5QcbEFL4YJ0+Z/OkX NL1kcgowNSmXE4zy2/rRZWyGAviawSdX+INFbKFczQO5bQ9DLVWpCOLpy1z5LBtz2vKx Lrvbfa2FqZ3L9TKeRzaajq2iXnGp849Mla+GpdTA6EnFJFAROb6muFetOmwBYqrmS/R3 tFLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=C+8iNnRd9R1P9pZPr3gdmOM33uf0tje+koUUR4wUYsM=; b=vnN4n8Bsfvxr8LAZqxye5YON4WDl4Cdx17oT8KbFDBEmWZq8eEm2OtQSZS9S43lUWq W/+LTIS1mXZehr9IR7nw5+tcX2jLuyw86fyioyURBk8wm2yk5MxiztC3vor9AjvJ6m9u +MVlDtAi9VnsUleNZ+dYuzj1Z878eRlX13FO8qJzWZE0Omo5Ltxkwn5NOzLDu5IqFaMv JzvnUHIiltqHG/20UZqT5iKDQS3DB7M/+ye9/lQZPJR73dwG6jDIXJgnuj3UBCn+VV0q H6tjbVUXypPwVEg4rDsDl10Zv/E/9KWDl400Z8aWajsYvbBu8o5YgJTViWkFaTyP2RsS TJ8g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@marvell.com header.s=pfpt0220 header.b=fHFEkx0K; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=REJECT dis=NONE) header.from=marvell.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z3-20020a170903018300b001b64bc274b5si4804335plg.137.2023.06.23.11.46.33; Fri, 23 Jun 2023 11:46:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@marvell.com header.s=pfpt0220 header.b=fHFEkx0K; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=REJECT dis=NONE) header.from=marvell.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229446AbjFWSYc (ORCPT <rfc822;duw91626@gmail.com> + 99 others); Fri, 23 Jun 2023 14:24:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231873AbjFWSYb (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 23 Jun 2023 14:24:31 -0400 Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com [67.231.156.173]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 50CD02724; Fri, 23 Jun 2023 11:24:26 -0700 (PDT) Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 35NHWJIV005753; Fri, 23 Jun 2023 11:24:02 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=pfpt0220; bh=C+8iNnRd9R1P9pZPr3gdmOM33uf0tje+koUUR4wUYsM=; b=fHFEkx0KHGIMErnzR58luqEnRFq2+djJRkXcAqjB6hMK/tF2mEHxe+Lr7ytLzDvm6+/r lT4h4wUZBF8Gt62nq77gZ2g56RKt/UuRPGy4aFlBIJGZcWClaFci5qYRTohqnPB8KBLp MrDI952hBzS8Th23xnWqTyWF2aG7e2iNkeZ7QEZZogNVnW9GcDmu06tj2SAijXOTu9Xj CzPWt6XMUJvM7m0wdpKJQnfgM35JFsoHA6qbf+j+Ui7O0HRe6J97m46w76PyZ/WYYtoB jR44ZrRQNTv271w4CN0hJCVkaJiLEVD44rfYZ/ckszGIdwykcIJVjOIEavOG1kLIJA/m 7A== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3rcuqvbc0h-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 23 Jun 2023 11:24:02 -0700 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.48; Fri, 23 Jun 2023 11:24:00 -0700 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.48 via Frontend Transport; Fri, 23 Jun 2023 11:24:00 -0700 Received: from TJ-POWEREDGE.marvell.com (TJ-POWEREDGE.marvell.com [10.28.8.3]) by maili.marvell.com (Postfix) with ESMTP id 1F39E3F7045; Fri, 23 Jun 2023 11:23:54 -0700 (PDT) From: Tanmay Jagdale <tanmay@marvell.com> To: <john.g.garry@oracle.com>, <will@kernel.org>, <mike.leach@linaro.org>, <suzuki.poulose@arm.com>, <peterz@infradead.org>, <mingo@redhat.com>, <acme@kernel.org>, <mark.rutland@arm.com>, <irogers@google.com>, <adrian.hunter@intel.com> CC: <linux-arm-kernel@lists.infradead.org>, <linux-perf-users@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <sgoutham@marvell.com>, <gcherian@marvell.com>, <lcherian@marvell.com>, <areddy3@marvell.com>, Tanmay Jagdale <tanmay@marvell.com> Subject: [PATCH 1/1] perf: cs-etm: Fixes in instruction sample synthesis Date: Fri, 23 Jun 2023 23:52:04 +0530 Message-ID: <20230623182204.25199-2-tanmay@marvell.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230623182204.25199-1-tanmay@marvell.com> References: <20230623182204.25199-1-tanmay@marvell.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-Proofpoint-ORIG-GUID: lRf39lGJPR6WXVfwyGbxk36MSepJ_T2X X-Proofpoint-GUID: lRf39lGJPR6WXVfwyGbxk36MSepJ_T2X X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-06-23_10,2023-06-22_02,2023-05-22_02 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769520245363546692?= X-GMAIL-MSGID: =?utf-8?q?1769520245363546692?= |
Series |
Fix Coresight instruction synthesis logic
|
|
Commit Message
Tanmay Jagdale
June 23, 2023, 6:22 p.m. UTC
The existing method of synthesizing instruction samples has the
following issues:
1. Non-branch instructions have mnemonics of branch instructions.
2. Branch target address is missing.
Set the sample flags only when we reach the last instruction in
the tidq (which would be a branch instruction) to solve issue 1).
To fix issue 2), start synthesizing the instructions from the
previous packet (tidq->prev_packet) instead of current packet
(tidq->packet). This way, it is easy to figure out the target
address of the branch instruction in tidq->prev_packet which
is the current packet's (tidq->packet) first executed instruction.
After the switch to processing the previous packet first, we no
longer need to swap the packets during cs_etm__flush().
Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
---
tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
Comments
On 23/06/2023 19:22, Tanmay Jagdale wrote: > The existing method of synthesizing instruction samples has the > following issues: > 1. Non-branch instructions have mnemonics of branch instructions. > 2. Branch target address is missing. > > Set the sample flags only when we reach the last instruction in > the tidq (which would be a branch instruction) to solve issue 1). > > To fix issue 2), start synthesizing the instructions from the > previous packet (tidq->prev_packet) instead of current packet > (tidq->packet). This way, it is easy to figure out the target > address of the branch instruction in tidq->prev_packet which > is the current packet's (tidq->packet) first executed instruction. > > After the switch to processing the previous packet first, we no > longer need to swap the packets during cs_etm__flush(). Hi Tanmay, I think the fix for setting the right flags and instruction type makes sense, but is it possible to do it without the change to swapping in cs_etm__flush() or some of the other changes to cs_etm__synth_instruction_sample()? I'm seeing some differences in the output related to the PID that's assigned to a sample and some of the addresses that aren't explained by the commit message. Also there is no corresponding change to cs_etm__synth_branch_sample(), which is also using prev_packet etc so I'm wondering if that's correct now without the swap? That function gets used with the default itrace options or itrace=b For example if I run 'perf script --itrace=i100ns' and diff the output before and after your change I see a difference even though branch and instruction info isn't printed, so I wouldn't expect to see any changes. This is on a systemwide recording of a system under load. Thanks James > > Signed-off-by: Tanmay Jagdale <tanmay@marvell.com> > --- > tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++------- > 1 file changed, 25 insertions(+), 7 deletions(-) > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c > index 91299cc56bf7..446e00d98fd5 100644 > --- a/tools/perf/util/cs-etm.c > +++ b/tools/perf/util/cs-etm.c > @@ -1418,10 +1418,26 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, > sample.stream_id = etmq->etm->instructions_id; > sample.period = period; > sample.cpu = tidq->packet->cpu; > - sample.flags = tidq->prev_packet->flags; > sample.cpumode = event->sample.header.misc; > > - cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample); > + cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->prev_packet, &sample); > + > + /* Populate branch target information only when we encounter > + * branch instruction, which is at the end of tidq->prev_packet. > + */ > + if (addr == (tidq->prev_packet->end_addr - 4)) { > + /* Update the perf_sample flags using the prev_packet > + * since that is the queue we are synthesizing. > + */ > + sample.flags = tidq->prev_packet->flags; > + > + /* The last instruction of the previous queue would be a > + * branch operation. Get the target of that branch by looking > + * into the first executed instruction of the current packet > + * queue. > + */ > + sample.addr = cs_etm__first_executed_instr(tidq->packet); > + } > > if (etm->synth_opts.last_branch) > sample.branch_stack = tidq->last_branch; > @@ -1641,7 +1657,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > /* Get instructions remainder from previous packet */ > instrs_prev = tidq->period_instructions; > > - tidq->period_instructions += tidq->packet->instr_count; > + tidq->period_instructions += tidq->prev_packet->instr_count; > > /* > * Record a branch when the last instruction in > @@ -1721,8 +1737,11 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > * been executed, but PC has not advanced to next > * instruction) > */ > + /* Get address from prev_packet since we are synthesizing > + * that in cs_etm__synth_instruction_sample() > + */ > addr = cs_etm__instr_addr(etmq, trace_chan_id, > - tidq->packet, offset - 1); > + tidq->prev_packet, offset - 1); > ret = cs_etm__synth_instruction_sample( > etmq, tidq, addr, > etm->instructions_sample_period); > @@ -1786,7 +1805,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, > > /* Handle start tracing packet */ > if (tidq->prev_packet->sample_type == CS_ETM_EMPTY) > - goto swap_packet; > + goto reset_last_br; > > if (etmq->etm->synth_opts.last_branch && > etmq->etm->synth_opts.instructions && > @@ -1822,8 +1841,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, > return err; > } > > -swap_packet: > - cs_etm__packet_swap(etm, tidq); > +reset_last_br: > > /* Reset last branches after flush the trace */ > if (etm->synth_opts.last_branch)
Hi James, > On 23/06/2023 19:22, Tanmay Jagdale wrote: > > The existing method of synthesizing instruction samples has the > > following issues: > > 1. Non-branch instructions have mnemonics of branch instructions. > > 2. Branch target address is missing. > > > > Set the sample flags only when we reach the last instruction in > > the tidq (which would be a branch instruction) to solve issue 1). > > > > To fix issue 2), start synthesizing the instructions from the > > previous packet (tidq->prev_packet) instead of current packet > > (tidq->packet). This way, it is easy to figure out the target > > address of the branch instruction in tidq->prev_packet which > > is the current packet's (tidq->packet) first executed instruction. > > > > After the switch to processing the previous packet first, we no > > longer need to swap the packets during cs_etm__flush(). > > Hi Tanmay, > > I think the fix for setting the right flags and instruction type makes > sense, but is it possible to do it without the change to swapping in > cs_etm__flush() or some of the other changes to > cs_etm__synth_instruction_sample()? Thanks for the review. I took this approach of swapping because it would be easy to figure out the target address of the branch. If we don't swap the packet synthesis, then we must decode the actual instruction to get the branch target address. IMHO, this would be a complex change. Since the swapping approach is meant to solve issue 2) I will split the patch while posting the next version. > > I'm seeing some differences in the output related to the PID that's > assigned to a sample and some of the addresses that aren't explained by > the commit message. Also there is no corresponding change to > cs_etm__synth_branch_sample(), which is also using prev_packet etc so > I'm wondering if that's correct now without the swap? That function gets > used with the default itrace options or itrace=b IMHO the existing way to handle itrace=b or itrace is correct and does not need any change since we are interested in only the last and first instruction from tidq->prev_packet and tidq->packet respectively, to generate the branching information. > > For example if I run 'perf script --itrace=i100ns' and diff the output > before and after your change I see a difference even though branch and > instruction info isn't printed, so I wouldn't expect to see any changes. > This is on a systemwide recording of a system under load. Yes, the "tr start" mnemonic isn't printed in the flags column, but rest of columns are the same. Will fix this in the next version. If you have observed any other differences, can you please share them ? Also, can you please share the test case commands so that I can run them before submitting the next version ? Thanks and regards, Tanmay > > Thanks > James > > > > > Signed-off-by: Tanmay Jagdale <tanmay@marvell.com> > > --- > > tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++------- > > 1 file changed, 25 insertions(+), 7 deletions(-) > > > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c > > index 91299cc56bf7..446e00d98fd5 100644 > > --- a/tools/perf/util/cs-etm.c > > +++ b/tools/perf/util/cs-etm.c > > @@ -1418,10 +1418,26 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, > > sample.stream_id = etmq->etm->instructions_id; > > sample.period = period; > > sample.cpu = tidq->packet->cpu; > > - sample.flags = tidq->prev_packet->flags; > > sample.cpumode = event->sample.header.misc; > > > > - cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample); > > + cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->prev_packet, &sample); > > + > > + /* Populate branch target information only when we encounter > > + * branch instruction, which is at the end of tidq->prev_packet. > > + */ > > + if (addr == (tidq->prev_packet->end_addr - 4)) { > > + /* Update the perf_sample flags using the prev_packet > > + * since that is the queue we are synthesizing. > > + */ > > + sample.flags = tidq->prev_packet->flags; > > + > > + /* The last instruction of the previous queue would be a > > + * branch operation. Get the target of that branch by looking > > + * into the first executed instruction of the current packet > > + * queue. > > + */ > > + sample.addr = cs_etm__first_executed_instr(tidq->packet); > > + } > > > > if (etm->synth_opts.last_branch) > > sample.branch_stack = tidq->last_branch; > > @@ -1641,7 +1657,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > > /* Get instructions remainder from previous packet */ > > instrs_prev = tidq->period_instructions; > > > > - tidq->period_instructions += tidq->packet->instr_count; > > + tidq->period_instructions += tidq->prev_packet->instr_count; > > > > /* > > * Record a branch when the last instruction in > > @@ -1721,8 +1737,11 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > > * been executed, but PC has not advanced to next > > * instruction) > > */ > > + /* Get address from prev_packet since we are synthesizing > > + * that in cs_etm__synth_instruction_sample() > > + */ > > addr = cs_etm__instr_addr(etmq, trace_chan_id, > > - tidq->packet, offset - 1); > > + tidq->prev_packet, offset - 1); > > ret = cs_etm__synth_instruction_sample( > > etmq, tidq, addr, > > etm->instructions_sample_period); > > @@ -1786,7 +1805,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, > > > > /* Handle start tracing packet */ > > if (tidq->prev_packet->sample_type == CS_ETM_EMPTY) > > - goto swap_packet; > > + goto reset_last_br; > > > > if (etmq->etm->synth_opts.last_branch && > > etmq->etm->synth_opts.instructions && > > @@ -1822,8 +1841,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, > > return err; > > } > > > > -swap_packet: > > - cs_etm__packet_swap(etm, tidq); > > +reset_last_br: > > > > /* Reset last branches after flush the trace */ > > if (etm->synth_opts.last_branch)
On 28/06/2023 09:01, Tanmay Jagdale wrote: > Hi James, > >> On 23/06/2023 19:22, Tanmay Jagdale wrote: >>> The existing method of synthesizing instruction samples has the >>> following issues: >>> 1. Non-branch instructions have mnemonics of branch instructions. >>> 2. Branch target address is missing. >>> >>> Set the sample flags only when we reach the last instruction in >>> the tidq (which would be a branch instruction) to solve issue 1). >>> >>> To fix issue 2), start synthesizing the instructions from the >>> previous packet (tidq->prev_packet) instead of current packet >>> (tidq->packet). This way, it is easy to figure out the target >>> address of the branch instruction in tidq->prev_packet which >>> is the current packet's (tidq->packet) first executed instruction. >>> >>> After the switch to processing the previous packet first, we no >>> longer need to swap the packets during cs_etm__flush(). >> >> Hi Tanmay, >> >> I think the fix for setting the right flags and instruction type makes >> sense, but is it possible to do it without the change to swapping in >> cs_etm__flush() or some of the other changes to >> cs_etm__synth_instruction_sample()? > Thanks for the review. I took this approach of swapping because > it would be easy to figure out the target address of the branch. > If we don't swap the packet synthesis, then we must decode the > actual instruction to get the branch target address. > IMHO, this would be a complex change. > > Since the swapping approach is meant to solve issue 2) I will > split the patch while posting the next version. > >> >> I'm seeing some differences in the output related to the PID that's >> assigned to a sample and some of the addresses that aren't explained by >> the commit message. Also there is no corresponding change to >> cs_etm__synth_branch_sample(), which is also using prev_packet etc so >> I'm wondering if that's correct now without the swap? That function gets >> used with the default itrace options or itrace=b > IMHO the existing way to handle itrace=b or itrace is correct and > does not need any change since we are interested in only the last and > first instruction from tidq->prev_packet and tidq->packet respectively, > to generate the branching information. > >> >> For example if I run 'perf script --itrace=i100ns' and diff the output >> before and after your change I see a difference even though branch and >> instruction info isn't printed, so I wouldn't expect to see any changes. >> This is on a systemwide recording of a system under load. > Yes, the "tr start" mnemonic isn't printed in the flags column, but > rest of columns are the same. Will fix this in the next version. > > If you have observed any other differences, can you please share them ? > > Also, can you please share the test case commands so that I can run them > before submitting the next version ? It's just if you diff any existing recording it seems that there are differences. In this case below I see a difference in the ordering of the samples generated. In a more complicated case with a VM running I also see a difference in which PID is assigned to some samples. It's probably not related to the VM but just that there was more going on on the machine. It's not necessarily wrong, but we don't currently have any tests that verify the complete correctness of the decoding, so unless the commit message explains why there should be a difference we shouldn't make any changes. stress -c 4 & sudo perf record -e cs_etm// -a -m,16M -- sleep 3 sudo perf-before-change script --itrace=i1000i > before sudo perf-after-change script --itrace=i1000i > after diff before after 29d28 < stress 8198 [000] 1096.381602: 1000 instructions: aaaaceb70f2c rand@plt+0xc (/usr/bin/stress) 30a30 > stress 8198 [000] 1096.381602: 1000 instructions: aaaaceb70f2c rand@plt+0xc (/usr/bin/stress) 193d192 < stress 8200 [001] 1096.381602: 1000 instructions: aaaaceb70f28 rand@plt+0x8 (/usr/bin/stress) 194a194 > stress 8200 [001] 1096.381602: 1000 instructions: aaaaceb70f28 rand@plt+0x8 (/usr/bin/stress) > > Thanks and regards, > Tanmay >> >> Thanks >> James >> >>> >>> Signed-off-by: Tanmay Jagdale <tanmay@marvell.com> >>> --- >>> tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++------- >>> 1 file changed, 25 insertions(+), 7 deletions(-) >>> >>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c >>> index 91299cc56bf7..446e00d98fd5 100644 >>> --- a/tools/perf/util/cs-etm.c >>> +++ b/tools/perf/util/cs-etm.c >>> @@ -1418,10 +1418,26 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, >>> sample.stream_id = etmq->etm->instructions_id; >>> sample.period = period; >>> sample.cpu = tidq->packet->cpu; >>> - sample.flags = tidq->prev_packet->flags; >>> sample.cpumode = event->sample.header.misc; >>> >>> - cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample); >>> + cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->prev_packet, &sample); >>> + >>> + /* Populate branch target information only when we encounter >>> + * branch instruction, which is at the end of tidq->prev_packet. >>> + */ >>> + if (addr == (tidq->prev_packet->end_addr - 4)) { >>> + /* Update the perf_sample flags using the prev_packet >>> + * since that is the queue we are synthesizing. >>> + */ >>> + sample.flags = tidq->prev_packet->flags; >>> + >>> + /* The last instruction of the previous queue would be a >>> + * branch operation. Get the target of that branch by looking >>> + * into the first executed instruction of the current packet >>> + * queue. >>> + */ >>> + sample.addr = cs_etm__first_executed_instr(tidq->packet); >>> + } >>> >>> if (etm->synth_opts.last_branch) >>> sample.branch_stack = tidq->last_branch; >>> @@ -1641,7 +1657,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, >>> /* Get instructions remainder from previous packet */ >>> instrs_prev = tidq->period_instructions; >>> >>> - tidq->period_instructions += tidq->packet->instr_count; >>> + tidq->period_instructions += tidq->prev_packet->instr_count; >>> >>> /* >>> * Record a branch when the last instruction in >>> @@ -1721,8 +1737,11 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, >>> * been executed, but PC has not advanced to next >>> * instruction) >>> */ >>> + /* Get address from prev_packet since we are synthesizing >>> + * that in cs_etm__synth_instruction_sample() >>> + */ >>> addr = cs_etm__instr_addr(etmq, trace_chan_id, >>> - tidq->packet, offset - 1); >>> + tidq->prev_packet, offset - 1); >>> ret = cs_etm__synth_instruction_sample( >>> etmq, tidq, addr, >>> etm->instructions_sample_period); >>> @@ -1786,7 +1805,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, >>> >>> /* Handle start tracing packet */ >>> if (tidq->prev_packet->sample_type == CS_ETM_EMPTY) >>> - goto swap_packet; >>> + goto reset_last_br; >>> >>> if (etmq->etm->synth_opts.last_branch && >>> etmq->etm->synth_opts.instructions && >>> @@ -1822,8 +1841,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, >>> return err; >>> } >>> >>> -swap_packet: >>> - cs_etm__packet_swap(etm, tidq); >>> +reset_last_br: >>> >>> /* Reset last branches after flush the trace */ >>> if (etm->synth_opts.last_branch)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 91299cc56bf7..446e00d98fd5 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1418,10 +1418,26 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, sample.stream_id = etmq->etm->instructions_id; sample.period = period; sample.cpu = tidq->packet->cpu; - sample.flags = tidq->prev_packet->flags; sample.cpumode = event->sample.header.misc; - cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample); + cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->prev_packet, &sample); + + /* Populate branch target information only when we encounter + * branch instruction, which is at the end of tidq->prev_packet. + */ + if (addr == (tidq->prev_packet->end_addr - 4)) { + /* Update the perf_sample flags using the prev_packet + * since that is the queue we are synthesizing. + */ + sample.flags = tidq->prev_packet->flags; + + /* The last instruction of the previous queue would be a + * branch operation. Get the target of that branch by looking + * into the first executed instruction of the current packet + * queue. + */ + sample.addr = cs_etm__first_executed_instr(tidq->packet); + } if (etm->synth_opts.last_branch) sample.branch_stack = tidq->last_branch; @@ -1641,7 +1657,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, /* Get instructions remainder from previous packet */ instrs_prev = tidq->period_instructions; - tidq->period_instructions += tidq->packet->instr_count; + tidq->period_instructions += tidq->prev_packet->instr_count; /* * Record a branch when the last instruction in @@ -1721,8 +1737,11 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, * been executed, but PC has not advanced to next * instruction) */ + /* Get address from prev_packet since we are synthesizing + * that in cs_etm__synth_instruction_sample() + */ addr = cs_etm__instr_addr(etmq, trace_chan_id, - tidq->packet, offset - 1); + tidq->prev_packet, offset - 1); ret = cs_etm__synth_instruction_sample( etmq, tidq, addr, etm->instructions_sample_period); @@ -1786,7 +1805,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, /* Handle start tracing packet */ if (tidq->prev_packet->sample_type == CS_ETM_EMPTY) - goto swap_packet; + goto reset_last_br; if (etmq->etm->synth_opts.last_branch && etmq->etm->synth_opts.instructions && @@ -1822,8 +1841,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq, return err; } -swap_packet: - cs_etm__packet_swap(etm, tidq); +reset_last_br: /* Reset last branches after flush the trace */ if (etm->synth_opts.last_branch)