Message ID | 20230123182728.825519-1-kan.liang@linux.intel.com |
---|---|
Headers |
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, tglx@linutronix.de, jstultz@google.com, sboyd@kernel.org, linux-kernel@vger.kernel.org
Cc: eranian@google.com, namhyung@kernel.org, ak@linux.intel.com, Kan Liang <kan.liang@linux.intel.com>
Subject: [PATCH 0/3] Convert TSC to monotonic clock for PEBS
Date: Mon, 23 Jan 2023 10:27:25 -0800
Message-Id: <20230123182728.825519-1-kan.liang@linux.intel.com>
X-Mailer: git-send-email 2.35.1
List-ID: <linux-kernel.vger.kernel.org> |
Series |
Convert TSC to monotonic clock for PEBS
|
|
Message
Liang, Kan
Jan. 23, 2023, 6:27 p.m. UTC
From: Kan Liang <kan.liang@linux.intel.com>
A Processor Event Based Sampling (PEBS) record includes a field that
provides the time stamp counter (TSC) value at the moment the counter
overflowed and the PEBS record was generated. The accurate time stamp
can be used to reconcile user samples. However, the current PEBS code
can only convert the time stamp to sched_clock, which is not available
from user space. A way to convert a given TSC value to the user-visible
monotonic clock is required.

The perf_event subsystem only converts the TSC in an NMI handler, so the
converter function must be fast and NMI safe.

The two existing functions below were considered, but neither of them
fulfills the above requirements.
- ktime_get_mono_fast_ns() is NMI safe, but it can only return the
current clock monotonic rather than the monotonic time for a given
timestamp.
- get_device_system_crosststamp() can calculate the system time from
a given device time, but it is neither fast nor NMI safe.

Introduce a new generic interface, get_mono_fast_from_given_time(), to
convert a given timestamp to clock monotonic.

Kan Liang (3):
  timekeeping: NMI safe converter from a given time to monotonic
  x86/tsc: Add set_tsc_system_counterval
  perf/x86/intel/ds: Support monotonic clock for PEBS

 arch/x86/events/intel/core.c |  2 +-
 arch/x86/events/intel/ds.c   | 30 +++++++++++++---
 arch/x86/include/asm/tsc.h   |  1 +
 arch/x86/kernel/tsc.c        |  6 ++++
 include/linux/timekeeping.h  |  9 +++++
 kernel/time/timekeeping.c    | 68 ++++++++++++++++++++++++++++++++++--
 6 files changed, 108 insertions(+), 8 deletions(-)
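
For readers unfamiliar with how a raw counter value maps to clock monotonic, the following is a minimal userspace sketch of the usual mult/shift arithmetic, assuming a snapshot of timekeeper state is available. The struct, its field names, and cycles_to_mono_ns() are hypothetical and are not taken from the patches; in particular, the handling of a counter value older than the snapshot is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of timekeeper state; names are illustrative only. */
struct mono_snapshot {
	uint64_t cycle_last;	/* counter value when the snapshot was taken */
	uint64_t base_ns;	/* clock monotonic (ns) at cycle_last */
	uint32_t mult;		/* cycles-to-ns multiplier */
	uint32_t shift;		/* cycles-to-ns shift, assumed < 64 */
};

/*
 * Convert a given counter value to monotonic nanoseconds:
 *   ns = base_ns + ((cycles - cycle_last) * mult) >> shift
 * Returns false when the value predates the snapshot or when the
 * multiplication would overflow, so the caller can pick a fallback.
 */
static bool cycles_to_mono_ns(const struct mono_snapshot *s,
			      uint64_t cycles, uint64_t *ns)
{
	uint64_t delta;

	assert(s && ns);

	if (cycles < s->cycle_last)
		return false;

	delta = cycles - s->cycle_last;
	if (s->mult != 0 && delta > UINT64_MAX / s->mult)
		return false;

	*ns = s->base_ns + ((delta * s->mult) >> s->shift);
	return true;
}
```

With mult = 512 and shift = 10 (0.5 ns per cycle), a counter value 2000 cycles past cycle_last maps to base_ns + 1000. The sketch takes no locks and never loops, which is the property an NMI-context caller would need; how the snapshot itself is published safely to an NMI reader is outside its scope.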
Comments
On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> A Processor Event Based Sampling (PEBS) record includes a field that
> provide the time stamp counter value when the counter was overflowed
> and the PEBS record was generated. The accurate time stamp can be used
> to reconcile user samples. However, the current PEBS codes only can
> convert the time stamp to sched_clock, which is not available from user
> space. A solution to convert a given TSC to user visible monotonic
> clock is required.
>
> The perf_event subsystem only converts the TSC in a NMI handler. The
> converter function must be fast and NMI safe.
>
> Considered the below two existing functions, but none of them fulfill
> the above requirements.
> - The ktime_get_mono_fast_ns() is NMI safe, but it can only return the
> current clock monotonic rather than a given time's monotonic.
> - The get_device_system_crosststamp() can calculate the system time from
> a given device time. But it's not fast and NMI safe.

So, apologies if this is a silly question (my brain quickly evicts the
details on get_device_system_crosststamp every time I look at it), but
rather than introducing a new interface, what would it take to rework
the existing get_device_system_crosststamp() logic to be usable for
both use cases?

thanks
-john

On 2023-01-24 1:13 a.m., John Stultz wrote:
> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> A Processor Event Based Sampling (PEBS) record includes a field that
>> provide the time stamp counter value when the counter was overflowed
>> and the PEBS record was generated. The accurate time stamp can be used
>> to reconcile user samples. However, the current PEBS codes only can
>> convert the time stamp to sched_clock, which is not available from user
>> space. A solution to convert a given TSC to user visible monotonic
>> clock is required.
>>
>> The perf_event subsystem only converts the TSC in a NMI handler. The
>> converter function must be fast and NMI safe.
>>
>> Considered the below two existing functions, but none of them fulfill
>> the above requirements.
>> - The ktime_get_mono_fast_ns() is NMI safe, but it can only return the
>> current clock monotonic rather than a given time's monotonic.
>> - The get_device_system_crosststamp() can calculate the system time from
>> a given device time. But it's not fast and NMI safe.
>
> So, apologies if this is a silly question (my brain quickly evicts the
> details on get_device_system_crosststamp every time I look at it), but
> rather then introducing a new interface, what would it take to rework
> the existing get_device_system_crosststamp() logic to be usable for
> both use cases?
>

I once tried to rework the existing get_device_system_crosststamp(),
but I eventually gave up, because:
- The existing function is already very complex. Adding a new case
  would make it more complex still and harder to maintain.
- Perf doesn't need all the logic of the existing function. For
  example, the history is not required. (I think there is no problem
  for perf if we cannot get values for some corner cases. The worst
  case for perf is to fall back to the time captured in the NMI
  handler. It's not very accurate, but it should be acceptable.)
  Performance is the top priority. We want a function with much
  simpler logic.
- If I understand correctly, we have already introduced several
  dedicated functions for fast NMI access, e.g.,
  ktime_get_mono_fast_ns(). I think we can follow the same idea.

Thanks,
Kan
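
The fallback described in the reply above (use the converted PEBS time stamp when the conversion succeeds, otherwise use the time captured in the NMI handler) could be sketched roughly as follows. This is an illustrative userspace sketch only: the tsc_to_mono_fn type, sample_time_ns(), and the two demo converters are hypothetical names that do not appear in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical converter type: returns true and fills *ns on success. */
typedef bool (*tsc_to_mono_fn)(uint64_t tsc, uint64_t *ns);

/*
 * Pick the time stamp for a sample. Prefer the precise value derived
 * from the TSC stored in the PEBS record; if the converter is missing
 * or reports failure (a corner case it does not handle), fall back to
 * the less accurate time captured in the NMI handler.
 */
static uint64_t sample_time_ns(tsc_to_mono_fn convert, uint64_t pebs_tsc,
			       uint64_t handler_ns)
{
	uint64_t ns;

	if (convert && convert(pebs_tsc, &ns))
		return ns;

	return handler_ns;
}

/* Stand-in converter for demonstration: pretends 1 cycle is 0.5 ns. */
static bool demo_convert_ok(uint64_t tsc, uint64_t *ns)
{
	assert(ns);
	*ns = tsc / 2;
	return true;
}

/* Stand-in converter that always reports an unsupported corner case. */
static bool demo_convert_fail(uint64_t tsc, uint64_t *ns)
{
	(void)tsc;
	(void)ns;
	return false;
}
```

Keeping the failure path this cheap is what allows the converter itself to skip history handling and other corner cases: any input it cannot handle quickly just results in the handler time being used.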