From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: namhyung@kernel.org, eranian@google.com, ak@linux.intel.com, Kan Liang
Subject: [PATCH V2] perf/x86/intel/ds: Fix the conversion from TSC to perf time
Date: Wed, 25 Jan 2023 12:49:25 -0800
Message-Id: <20230125204925.924442-1-kan.liang@linux.intel.com>

From: Kan Liang

The time order is incorrect when the TSC in a PEBS record is used.

 $ perf record -e cycles:upp dd if=/dev/zero of=/dev/null count=10000
 $ perf script --show-task-events
       perf-exec     0     0.000000: PERF_RECORD_COMM: perf-exec:915/915
              dd   915   106.479872: PERF_RECORD_COMM exec: dd:915/915
              dd   915   106.483270: PERF_RECORD_EXIT(915:915):(914:914)
              dd   915   106.512429:          1 cycles:upp:  ffffffff96c011b7 [unknown] ([unknown])
 ... ...

The sample is stamped later than the PERF_RECORD_EXIT of the task that
generated it.

The perf time is from sched_clock_cpu(). The current PEBS code
unconditionally converts the TSC to native_sched_clock(). There is a
shift between the two clocks.
If the TSC is stable, the shift is a constant, __sched_clock_offset.
If the TSC is unstable, the shift has to be calculated at runtime.
This patch doesn't support the conversion when the TSC is unstable. The
unstable TSC case is a corner case and very unlikely to happen. If it
happens, the TSC in the PEBS record is dropped and the sample time
falls back to perf_event_clock().

Fixes: 47a3aeb39e8d ("perf/x86/intel/pebs: Fix PEBS timestamps overwritten")
Reported-by: Namhyung Kim
Link: https://lore.kernel.org/all/CAM9d7cgWDVAq8-11RbJ2uGfwkKD6fA-OMwOKDrNUrU_=8MgEjg@mail.gmail.com/
Signed-off-by: Kan Liang
---

Changes since V1:
- Update the comments and description to avoid confusion.

 arch/x86/events/intel/ds.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 183efa914b99..b0354dc869d2 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2,12 +2,14 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
 #include
 #include
 #include
+#include
 
 #include "../perf_event.h"
 
@@ -1568,6 +1570,27 @@ static u64 get_data_src(struct perf_event *event, u64 aux)
 	return val;
 }
 
+static void setup_pebs_time(struct perf_event *event,
+			    struct perf_sample_data *data,
+			    u64 tsc)
+{
+	/* Converting to a user-defined clock is not supported yet. */
+	if (event->attr.use_clockid != 0)
+		return;
+
+	/*
+	 * Doesn't support the conversion when the TSC is unstable.
+	 * The TSC unstable case is a corner case and very unlikely to
+	 * happen. If it happens, the TSC in a PEBS record will be
+	 * dropped and fall back to perf_event_clock().
+	 */
+	if (!using_native_sched_clock() || !sched_clock_stable())
+		return;
+
+	data->time = native_sched_clock_from_tsc(tsc) + __sched_clock_offset;
+	data->sample_flags |= PERF_SAMPLE_TIME;
+}
+
 #define PERF_SAMPLE_ADDR_TYPE	(PERF_SAMPLE_ADDR |		\
 				 PERF_SAMPLE_PHYS_ADDR |	\
 				 PERF_SAMPLE_DATA_PAGE_SIZE)
@@ -1715,11 +1738,8 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 	 *
 	 * We can only do this for the default trace clock.
 	 */
-	if (x86_pmu.intel_cap.pebs_format >= 3 &&
-	    event->attr.use_clockid == 0) {
-		data->time = native_sched_clock_from_tsc(pebs->tsc);
-		data->sample_flags |= PERF_SAMPLE_TIME;
-	}
+	if (x86_pmu.intel_cap.pebs_format >= 3)
+		setup_pebs_time(event, data, pebs->tsc);
 
 	if (has_branch_stack(event))
 		perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
@@ -1781,10 +1801,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	perf_sample_data_init(data, 0, event->hw.last_period);
 	data->period = event->hw.last_period;
 
-	if (event->attr.use_clockid == 0) {
-		data->time = native_sched_clock_from_tsc(basic->tsc);
-		data->sample_flags |= PERF_SAMPLE_TIME;
-	}
+	setup_pebs_time(event, data, basic->tsc);
 
 	/*
 	 * We must however always use iregs for the unwinder to stay sane; the