[0/3] Convert TSC to monotonic clock for PEBS

Message ID 20230123182728.825519-1-kan.liang@linux.intel.com

Message

Liang, Kan Jan. 23, 2023, 6:27 p.m. UTC
  From: Kan Liang <kan.liang@linux.intel.com>

A Processor Event Based Sampling (PEBS) record includes a field that
provides the time stamp counter (TSC) value at the moment the counter
overflowed and the PEBS record was generated. This accurate time stamp
can be used to reconcile user samples. However, the current PEBS code
can only convert the time stamp to sched_clock, which is not available
from user space. A way to convert a given TSC value to the user-visible
monotonic clock is required.

The perf_event subsystem converts the TSC only in an NMI handler, so
the converter function must be fast and NMI safe.

Two existing functions were considered, but neither fulfills the above
requirements.
- ktime_get_mono_fast_ns() is NMI safe, but it can only return the
  current monotonic clock, not the monotonic time for a given timestamp.
- get_device_system_crosststamp() can calculate the system time from
  a given device time, but it is neither fast nor NMI safe.

Introduce a new generic interface, get_mono_fast_from_given_time, to
convert a given timestamp to clock monotonic.
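
As a rough illustration of what converting a given counter value to
monotonic time involves, here is a minimal user-space sketch. It is not
the code from this series: struct mono_snapshot, its fields, and
cycles_to_mono_ns() are hypothetical names, loosely modelled on the
mult/shift scaling that clocksources use. It also leaves out the part
that makes a real implementation NMI safe, namely reading the snapshot
without taking locks.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical snapshot of timekeeping state (illustrative names only).
 * A real NMI-safe reader would also need a lockless way to obtain a
 * consistent copy of these fields; that part is omitted here.
 */
struct mono_snapshot {
	uint64_t cycle_last;	/* counter value when the snapshot was taken */
	uint64_t base_ns;	/* monotonic time in ns at cycle_last */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns right shift */
};

/*
 * Convert a previously captured counter value to monotonic nanoseconds.
 * Returns false if the value predates the snapshot, so the caller can
 * fall back to another time source. Overflow of delta * mult for very
 * large deltas is not handled in this sketch.
 */
static bool cycles_to_mono_ns(const struct mono_snapshot *s,
			      uint64_t cycles, uint64_t *ns)
{
	uint64_t delta;

	if (cycles < s->cycle_last)
		return false;

	delta = cycles - s->cycle_last;
	*ns = s->base_ns + ((delta * s->mult) >> s->shift);
	return true;
}
```

The point of the sketch is that the input is a counter value captured
earlier (such as the TSC stored in a PEBS record) rather than the
counter read at call time, which is the difference from
ktime_get_mono_fast_ns() noted above.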

Kan Liang (3):
  timekeeping: NMI safe converter from a given time to monotonic
  x86/tsc: Add set_tsc_system_counterval
  perf/x86/intel/ds: Support monotonic clock for PEBS

 arch/x86/events/intel/core.c |  2 +-
 arch/x86/events/intel/ds.c   | 30 +++++++++++++---
 arch/x86/include/asm/tsc.h   |  1 +
 arch/x86/kernel/tsc.c        |  6 ++++
 include/linux/timekeeping.h  |  9 +++++
 kernel/time/timekeeping.c    | 68 ++++++++++++++++++++++++++++++++++--
 6 files changed, 108 insertions(+), 8 deletions(-)
  

Comments

John Stultz Jan. 24, 2023, 6:13 a.m. UTC | #1
On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> A Processor Event Based Sampling (PEBS) record includes a field that
> provides the time stamp counter (TSC) value at the moment the counter
> overflowed and the PEBS record was generated. This accurate time stamp
> can be used to reconcile user samples. However, the current PEBS code
> can only convert the time stamp to sched_clock, which is not available
> from user space. A way to convert a given TSC value to the user-visible
> monotonic clock is required.
>
> The perf_event subsystem converts the TSC only in an NMI handler, so
> the converter function must be fast and NMI safe.
>
> Two existing functions were considered, but neither fulfills the above
> requirements.
> - ktime_get_mono_fast_ns() is NMI safe, but it can only return the
>   current monotonic clock, not the monotonic time for a given timestamp.
> - get_device_system_crosststamp() can calculate the system time from
>   a given device time, but it is neither fast nor NMI safe.

So, apologies if this is a silly question (my brain quickly evicts the
details on get_device_system_crosststamp every time I look at it), but
rather than introducing a new interface, what would it take to rework
the existing get_device_system_crosststamp() logic to be usable for
both use cases?

thanks
-john
  
Liang, Kan Jan. 24, 2023, 3:04 p.m. UTC | #2
On 2023-01-24 1:13 a.m., John Stultz wrote:
> On Mon, Jan 23, 2023 at 10:27 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> A Processor Event Based Sampling (PEBS) record includes a field that
>> provides the time stamp counter (TSC) value at the moment the counter
>> overflowed and the PEBS record was generated. This accurate time stamp
>> can be used to reconcile user samples. However, the current PEBS code
>> can only convert the time stamp to sched_clock, which is not available
>> from user space. A way to convert a given TSC value to the user-visible
>> monotonic clock is required.
>>
>> The perf_event subsystem converts the TSC only in an NMI handler, so
>> the converter function must be fast and NMI safe.
>>
>> Two existing functions were considered, but neither fulfills the above
>> requirements.
>> - ktime_get_mono_fast_ns() is NMI safe, but it can only return the
>>   current monotonic clock, not the monotonic time for a given timestamp.
>> - get_device_system_crosststamp() can calculate the system time from
>>   a given device time, but it is neither fast nor NMI safe.
> 
> So, apologies if this is a silly question (my brain quickly evicts the
> details on get_device_system_crosststamp every time I look at it), but
> rather than introducing a new interface, what would it take to rework
> the existing get_device_system_crosststamp() logic to be usable for
> both use cases?
> 

I once tried to rework the existing get_device_system_crosststamp(),
but I eventually gave up, because:
- The existing function is already very complex. Adding a new case
  would make it more complex and harder to maintain.
- Perf doesn't need all the logic of the existing function. For example,
  the history is not required. (I think there is no problem for perf if
  we cannot get values for some corner cases. The worst case for perf is
  to fall back to the time captured in the NMI handler. It's not very
  accurate, but it should be acceptable.) Performance is the first
  priority. We want a function with much simpler logic.
- If I understand correctly, we have already introduced several
  dedicated functions for fast NMI access, e.g.,
  ktime_get_mono_fast_ns(). I think we can follow the same idea.
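
The fallback mentioned in the second point could look roughly like the
sketch below. It is only an illustration in plain user-space C: the
tsc_to_mono_fn type, pebs_sample_time(), and the two stand-in converters
are hypothetical names, not functions from this series or from the
kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical converter signature: returns false when no monotonic
 * value can be produced for the given TSC (one of the corner cases).
 */
typedef bool (*tsc_to_mono_fn)(uint64_t tsc, uint64_t *mono_ns);

/*
 * Choose the timestamp for a sample: prefer the time converted from the
 * TSC in the PEBS record, otherwise use the time captured in the NMI
 * handler.
 */
static uint64_t pebs_sample_time(uint64_t pebs_tsc, uint64_t nmi_time_ns,
				 tsc_to_mono_fn convert)
{
	uint64_t mono_ns;

	if (convert && convert(pebs_tsc, &mono_ns))
		return mono_ns;

	return nmi_time_ns;	/* less accurate, but always available */
}

/* Stand-in converters, present only to exercise both paths. */
static bool convert_ok(uint64_t tsc, uint64_t *mono_ns)
{
	*mono_ns = tsc / 2;	/* pretend: 2 cycles per nanosecond */
	return true;
}

static bool convert_fail(uint64_t tsc, uint64_t *mono_ns)
{
	(void)tsc;
	(void)mono_ns;
	return false;
}
```

With this shape the converter can stay simple and reject inputs it
cannot handle, because the caller always has a usable, if less precise,
timestamp.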


Thanks,
Kan