x86/tdx: Mark TSC reliable

Message ID 20230808162320.27297-1-kirill.shutemov@linux.intel.com
State New
Series x86/tdx: Mark TSC reliable

Commit Message

Kirill A. Shutemov Aug. 8, 2023, 4:23 p.m. UTC
In x86 virtualization environments, including TDX, the RDTSC instruction
is handled without causing a VM exit, resulting in minimal overhead and
jitter. On the other hand, other clock sources (such as HPET, ACPI
timer, APIC, etc.) necessitate VM exits to implement, resulting in more
fluctuating measurements compared to TSC. Thus, those clock sources are
not effective for calibrating TSC.

In TD guests, TSC is virtualized by the TDX module, which ensures:

  - Virtual TSC values are consistent among all the TD’s VCPUs;
  - Monotonously incrementing for any single VCPU;
  - The frequency is determined by TD configuration. The host TSC is
    invariant on platforms where TDX is available.

Use TSC as the only reliable clock source in TD guests, bypassing
unstable calibration.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/coco/tdx/tdx.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Kirill A. Shutemov Aug. 8, 2023, 8:01 p.m. UTC | #1
On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> On 8/8/23 09:23, Kirill A. Shutemov wrote:
> ...
> > On the other hand, other clock sources (such as HPET, ACPI timer,
> > APIC, etc.) necessitate VM exits to implement, resulting in more 
> > fluctuating measurements compared to TSC. Thus, those clock sources
> > are not effective for calibrating TSC.
> 
> Do we need to do anything to _those_ to mark them as slightly stinky?

I don't know what the rules are here. As far as I can see, all other clock
sources relevant for a TDX guest have a lower rating. I guess we are fine?

A notable exception to the rating order is kvmclock, which is rated higher
than tsc. It has to be disabled, but it is not clear to me how. This topic
is related to how we are going to filter allowed devices/drivers, so I
would postpone the decision until we settle on a wider filtering scheme.

> > In TD guests, TSC is virtualized by the TDX module, which ensures:
> > 
> >   - Virtual TSC values are consistent among all the TD’s VCPUs;
> >   - Monotonously incrementing for any single VCPU;
> >   - The frequency is determined by TD configuration. The host TSC is
> >     invariant on platforms where TDX is available.
> 
> I take it this is carved in stone in the TDX specs somewhere.  A
> reference would be nice.

TDX Module 1.0 spec:

	5.3.5. Time Stamp Counter (TSC)

	TDX provides a trusted virtual TSC to the guest TDs. TSC value is
	monotonously incrementing, starting from 0 on TD initialization by the
	host VMM. The deviation between virtual TSC values read by each VCPU is
	small.

	A guest TD should disable mechanisms that are used in non-trusted
	environment, which attempt to synchronize TSC between VCPUs, and should
	not revert to using untrusted time mechanisms.

...

	13.13.1. TSC Virtualization

	For virtual time stamp counter (TSC) values read by guest TDs, the Intel
	TDX module is designed to achieve the following:

	• Virtual TSC values are consistent among all the TD’s VCPUs at
	  the level supported by the CPU, see below.
	• The virtual TSC value for any single VCPU is monotonously
	  incrementing (except roll over from 2^64-1 to 0).
	• The virtual TSC frequency is determined by TD configuration.
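
As a side note on the roll-over case: a guest that measures an interval
from two virtual TSC reads can rely on unsigned 64-bit subtraction, which
wraps modulo 2^64. A minimal sketch, where tsc_delta() is a hypothetical
helper for illustration and not a kernel function:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: cycles elapsed between two TSC
 * reads. Unsigned subtraction is modulo 2^64, so the result is still
 * correct if the counter rolled over once between the two reads.
 */
static inline uint64_t tsc_delta(uint64_t earlier, uint64_t later)
{
	return later - earlier;
}
```

This only holds for at most one roll-over between the reads, which is the
only case the monotonicity guarantee quoted above leaves open.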

...

> We've got VMWare and Hyper-V code basically doing the same thing today.
> So TDX is in kinda good company.  But this still makes me rather
> nervous.  Do you have any encouraging words about how unlikely future
> hardware is to screw this up, especially as TDX-supporting hardware gets
> more diverse?

The wording in the spec looks okay to me. We can only hope that the
implementation is going to be sane.
  
Reshetova, Elena Aug. 9, 2023, 5:44 a.m. UTC | #2
> On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> > On 8/8/23 09:23, Kirill A. Shutemov wrote:
> > ...
> > > On the other hand, other clock sources (such as HPET, ACPI timer,
> > > APIC, etc.) necessitate VM exits to implement, resulting in more
> > > fluctuating measurements compared to TSC. Thus, those clock sources
> > > are not effective for calibrating TSC.
> >
> > Do we need to do anything to _those_ to mark them as slightly stinky?

IMO, from a pure security point of view, yes. It would be a good secure
default for TDX guests (and other CoCo guests) to use only trusted time
sources. There are issues with this, though, and we would need to
understand where to draw the line. Things like hpet we hoped to disable
via device filtering. For some other time sources we have used the patches
below. Then there are things like the RTC, which would be great to disable
as well, but without a proper remote time server that breaks date and time
keeping for the guest. So we have not done it, and probably should not by
default, but we recommend against using it in the docs we have:
https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html#tsc-and-other-timers

> 
> I don't know what the rules are here. As far as I can see, all other clock
> sources relevant for a TDX guest have a lower rating. I guess we are fine?

What about acpi_pm? 
See this:
https://github.com/intel/tdx/commit/045692772ab4ef75062a83cc6e4ffa22cab40226

> 
> A notable exception to the rating order is kvmclock, which is rated higher
> than tsc. It has to be disabled, but it is not clear to me how. This topic
> is related to how we are going to filter allowed devices/drivers, so I
> would postpone the decision until we settle on a wider filtering scheme.

One option is to include "no-kvmclock" in the kernel command line, which
is attested. Another option is to try to disable it explicitly, like we did
in the past:
https://github.com/intel/tdx/commit/6b0357f2115c1bdd158c0c8836f4f541517bf375

The obvious issues with the command line are that 1) it is going to grow
considerably if we try to disable everything we can via the command line,
and 2) there is a high chance that in practice people will not use the
secure default and/or will forget to verify the state of the command line.
But I guess this is unfortunately to be expected for any security method
that involves attestation.

Best Regards,
Elena.
  
Kirill A. Shutemov Aug. 9, 2023, 6:13 a.m. UTC | #3
On Wed, Aug 09, 2023 at 05:44:37AM +0000, Reshetova, Elena wrote:
> > 
> > I don't know what the rules are here. As far as I can see, all other clock
> > sources relevant for a TDX guest have a lower rating. I guess we are fine?
> 
> What about acpi_pm? 
> See this:
> https://github.com/intel/tdx/commit/045692772ab4ef75062a83cc6e4ffa22cab40226

clocksource_acpi_pm.rating is 200 while TSC is 300.
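
To illustrate why the ratings matter: the clocksource core prefers the
registered source with the highest rating. The sketch below is a
simplified userspace model of that rule, not the kernel's selection code;
struct clock_candidate and pick_best() are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct clocksource. */
struct clock_candidate {
	const char *name;
	int rating;	/* higher is preferred */
};

/* Return the highest-rated candidate, or NULL if the list is empty. */
static inline const struct clock_candidate *
pick_best(const struct clock_candidate *c, size_t n)
{
	const struct clock_candidate *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!best || c[i].rating > best->rating)
			best = &c[i];
	}
	return best;
}
```

With acpi_pm at 200 and tsc at 300 this picks tsc. Any entry rated above
300, as kvmclock is, would win instead, which is why kvmclock has to be
handled separately.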

> > A notable exception to the rating order is kvmclock, which is rated higher
> > than tsc. It has to be disabled, but it is not clear to me how. This topic
> > is related to how we are going to filter allowed devices/drivers, so I
> > would postpone the decision until we settle on a wider filtering scheme.
> 
> One option is to include "no-kvmclock" in the kernel command line, which
> is attested. Another option is to try to disable it explicitly, like we did
> in the past:
> https://github.com/intel/tdx/commit/6b0357f2115c1bdd158c0c8836f4f541517bf375
> 
> The obvious issues with the command line are that 1) it is going to grow
> considerably if we try to disable everything we can via the command line,
> and 2) there is a high chance that in practice people will not use the
> secure default and/or will forget to verify the state of the command line.
> But I guess this is unfortunately to be expected for any security method
> that involves attestation.

I guess the command line is fine until we have a coherent solution for
filtering.
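
For anyone who wants a quick local sanity check in the meantime, something
like the sketch below could be run against the contents of /proc/cmdline
inside the guest. cmdline_has_param() is a hypothetical helper, and this is
no substitute for verifying the attested command line on the relying side.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `param` appears as a whole, space-separated word in
 * `cmdline`. Quoted arguments that contain spaces are not handled.
 */
static inline bool cmdline_has_param(const char *cmdline, const char *param)
{
	size_t len = strlen(param);
	const char *p = cmdline;

	if (len == 0)
		return false;

	while ((p = strstr(p, param)) != NULL) {
		bool starts = (p == cmdline) || p[-1] == ' ';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p++;	/* keep searching after a partial-word match */
	}
	return false;
}
```

Matching whole words matters here: a plain substring search would also
accept a longer parameter that merely begins with "no-kvmclock".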
  

Patch

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..1583ec64d92e 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -769,6 +769,9 @@  void __init tdx_early_init(void)
 
 	setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
 
+	/* TSC is the only reliable clock in TDX guest */
+	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
+
 	cc_vendor = CC_VENDOR_INTEL;
 	tdx_parse_tdinfo(&cc_mask);
 	cc_set_mask(cc_mask);