[v12,05/22] x86/virt/tdx: Add SEAMCALL infrastructure

Message ID b2a875fd855145728744617ac4425a06d8b46c90.1687784645.git.kai.huang@intel.com
State New
Headers
Series TDX host kernel support |

Commit Message

Kai Huang June 26, 2023, 2:12 p.m. UTC
  TDX introduces a new CPU mode: Secure Arbitration Mode (SEAM).  This
mode runs only the TDX module itself or other code to load the TDX
module.

The host kernel communicates with SEAM software via a new SEAMCALL
instruction.  This is conceptually similar to a guest->host hypercall,
except it is made from the host to SEAM software instead.  The TDX
module establishes a new SEAMCALL ABI which allows the host to
initialize the module and to manage VMs.

Add infrastructure to make SEAMCALLs.  The SEAMCALL ABI is very similar
to the TDCALL ABI and leverages much TDCALL infrastructure.

Also add a wrapper function of SEAMCALL to convert SEAMCALL error code
to the kernel error code, and print out SEAMCALL error code to help the
user to understand what went wrong.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
---

v11 -> v12:
 - Moved _ASM_EXT_TABLE() for #UD/#GP to a later patch for better patch
   review, and removed related part from changelog.
 - Minor code changes in seamcall() (David)
 - Added Isaku's tag

v10 -> v11:
 - No update

v9 -> v10:
 - Make the TDX_SEAMCALL_{GP|UD} error codes unconditional but doesn't
   define them when INTEL_TDX_HOST is enabled. (Dave)
 - Slightly improved changelog to explain why add assembly code to handle
   #UD and #GP.

v8 -> v9:
 - Changed patch title (Dave).
 - Enhanced seamcall() to include the cpu id to the error message when
   SEAMCALL fails.

v7 -> v8:
 - Improved changelog (Dave):
   - Trim down some sentences (Dave).
   - Removed __seamcall() and seamcall() function name and changed
     accordingly (Dave).
   - Improved the sentence explaining why to handle #GP (Dave).
 - Added code to print out error message in seamcall(), following
   the idea that tdx_enable() to return universal error and print out
   error message to make clear what's going wrong (Dave).  Also mention
   this in changelog.

v6 -> v7:
 - No change.

v5 -> v6:
 - Added code to handle #UD and #GP (Dave).
 - Moved the seamcall() wrapper function to this patch, and used a
   temporary __always_unused to avoid compile warning (Dave).

- v3 -> v5 (no feedback on v4):
 - Explicitly tell TDX_SEAMCALL_VMFAILINVALID is returned if the
   SEAMCALL itself fails.
 - Improve the changelog.


---
 arch/x86/virt/vmx/tdx/Makefile   |  2 +-
 arch/x86/virt/vmx/tdx/seamcall.S | 52 ++++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.c      | 42 ++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h      | 10 ++++++
 4 files changed, 105 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/virt/vmx/tdx/seamcall.S
 create mode 100644 arch/x86/virt/vmx/tdx/tdx.h
  

Comments

Kirill A. Shutemov June 27, 2023, 9:48 a.m. UTC | #1
On Tue, Jun 27, 2023 at 02:12:35AM +1200, Kai Huang wrote:
> +/*
> + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> + * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
> + * leaf function return code and the additional output respectively if
> + * not NULL.
> + */
> +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> +				    u64 *seamcall_ret,
> +				    struct tdx_module_output *out)
> +{
> +	u64 sret;
> +	int cpu;
> +
> +	/* Need a stable CPU id for printing error message */
> +	cpu = get_cpu();
> +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> +	put_cpu();
> +
> +	/* Save SEAMCALL return code if the caller wants it */
> +	if (seamcall_ret)
> +		*seamcall_ret = sret;
> +
> +	switch (sret) {
> +	case 0:
> +		/* SEAMCALL was successful */
> +		return 0;
> +	case TDX_SEAMCALL_VMFAILINVALID:
> +		pr_err_once("module is not loaded.\n");
> +		return -ENODEV;
> +	default:
> +		pr_err_once("SEAMCALL failed: CPU %d: leaf %llu, error 0x%llx.\n",
> +				cpu, fn, sret);
> +		if (out)
> +			pr_err_once("additional output: rcx 0x%llx, rdx 0x%llx, r8 0x%llx, r9 0x%llx, r10 0x%llx, r11 0x%llx.\n",
> +					out->rcx, out->rdx, out->r8,
> +					out->r9, out->r10, out->r11);

This look excessively noisy.

Don't we have SEAMCALL leafs that can fail in normal situation? Like
TDX_OPERAND_BUSY error code that indicate that operation likely will
succeed on retry.

Or is that wrapper only used for never-fail SEAMCALLs? If so, please
document it.
  
Kai Huang June 27, 2023, 10:28 a.m. UTC | #2
On Tue, 2023-06-27 at 12:48 +0300, kirill.shutemov@linux.intel.com wrote:
> On Tue, Jun 27, 2023 at 02:12:35AM +1200, Kai Huang wrote:
> > +/*
> > + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> > + * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
> > + * leaf function return code and the additional output respectively if
> > + * not NULL.
> > + */
> > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> > +				    u64 *seamcall_ret,
> > +				    struct tdx_module_output *out)
> > +{
> > +	u64 sret;
> > +	int cpu;
> > +
> > +	/* Need a stable CPU id for printing error message */
> > +	cpu = get_cpu();
> > +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > +	put_cpu();
> > +
> > +	/* Save SEAMCALL return code if the caller wants it */
> > +	if (seamcall_ret)
> > +		*seamcall_ret = sret;
> > +
> > +	switch (sret) {
> > +	case 0:
> > +		/* SEAMCALL was successful */
> > +		return 0;
> > +	case TDX_SEAMCALL_VMFAILINVALID:
> > +		pr_err_once("module is not loaded.\n");
> > +		return -ENODEV;
> > +	default:
> > +		pr_err_once("SEAMCALL failed: CPU %d: leaf %llu, error 0x%llx.\n",
> > +				cpu, fn, sret);
> > +		if (out)
> > +			pr_err_once("additional output: rcx 0x%llx, rdx 0x%llx, r8 0x%llx, r9 0x%llx, r10 0x%llx, r11 0x%llx.\n",
> > +					out->rcx, out->rdx, out->r8,
> > +					out->r9, out->r10, out->r11);
> 
> This look excessively noisy.
> 
> Don't we have SEAMCALL leafs that can fail in normal situation? Like
> TDX_OPERAND_BUSY error code that indicate that operation likely will
> succeed on retry.

For TDX module initialization TDX_OPERAND_BUSY cannot happen.  KVM may have
legal cases that BUSY can happen, e.g., KVM's TDP MMU supports handling faults
concurrently on different cpus, but that is still under discussion.  Also KVM
tends to use __seamcall() directly:

https://lore.kernel.org/lkml/3c2c142e14a04a833b47f77faecaa91899b472cd.1678643052.git.isaku.yamahata@intel.com/

I guess KVM doesn't want to print message in all cases as you said, but for
module initialization is fine.  Those error messages are useful in case
something goes wrong, and printing them in seamcall() helps to reduce the code
to print in all callers.

> 
> Or is that wrapper only used for never-fail SEAMCALLs? If so, please
> document it.
> 

How about adding below?

	Use __seamcall() directly in cases that printing error message isn't
	desired, e.g., when SEAMCALL can legally fail with BUSY and the caller
	wants to retry.
  
Kirill A. Shutemov June 27, 2023, 11:36 a.m. UTC | #3
On Tue, Jun 27, 2023 at 10:28:20AM +0000, Huang, Kai wrote:
> > Or is that wrapper only used for never-fail SEAMCALLs? If so, please
> > document it.
> > 
> 
> How about adding below?
> 
> 	Use __seamcall() directly in cases that printing error message isn't
> 	desired, e.g., when SEAMCALL can legally fail with BUSY and the caller
> 	wants to retry.
> 

Looks good to me.
  
Isaku Yamahata June 28, 2023, 12:19 a.m. UTC | #4
On Tue, Jun 27, 2023 at 10:28:20AM +0000,
"Huang, Kai" <kai.huang@intel.com> wrote:

> On Tue, 2023-06-27 at 12:48 +0300, kirill.shutemov@linux.intel.com wrote:
> > On Tue, Jun 27, 2023 at 02:12:35AM +1200, Kai Huang wrote:
> > > +/*
> > > + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> > > + * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
> > > + * leaf function return code and the additional output respectively if
> > > + * not NULL.
> > > + */
> > > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> > > +				    u64 *seamcall_ret,
> > > +				    struct tdx_module_output *out)
> > > +{
> > > +	u64 sret;
> > > +	int cpu;
> > > +
> > > +	/* Need a stable CPU id for printing error message */
> > > +	cpu = get_cpu();
> > > +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > > +	put_cpu();
> > > +
> > > +	/* Save SEAMCALL return code if the caller wants it */
> > > +	if (seamcall_ret)
> > > +		*seamcall_ret = sret;
> > > +
> > > +	switch (sret) {
> > > +	case 0:
> > > +		/* SEAMCALL was successful */
> > > +		return 0;
> > > +	case TDX_SEAMCALL_VMFAILINVALID:
> > > +		pr_err_once("module is not loaded.\n");
> > > +		return -ENODEV;
> > > +	default:
> > > +		pr_err_once("SEAMCALL failed: CPU %d: leaf %llu, error 0x%llx.\n",
> > > +				cpu, fn, sret);
> > > +		if (out)
> > > +			pr_err_once("additional output: rcx 0x%llx, rdx 0x%llx, r8 0x%llx, r9 0x%llx, r10 0x%llx, r11 0x%llx.\n",
> > > +					out->rcx, out->rdx, out->r8,
> > > +					out->r9, out->r10, out->r11);
> > 
> > This look excessively noisy.
> > 
> > Don't we have SEAMCALL leafs that can fail in normal situation? Like
> > TDX_OPERAND_BUSY error code that indicate that operation likely will
> > succeed on retry.
> 
> For TDX module initialization TDX_OPERAND_BUSY cannot happen.  KVM may have
> legal cases that BUSY can happen, e.g., KVM's TDP MMU supports handling faults
> concurrently on different cpus, but that is still under discussion.  Also KVM
> tends to use __seamcall() directly:
> 
> https://lore.kernel.org/lkml/3c2c142e14a04a833b47f77faecaa91899b472cd.1678643052.git.isaku.yamahata@intel.com/
> 
> I guess KVM doesn't want to print message in all cases as you said, but for
> module initialization is fine.  Those error messages are useful in case
> something goes wrong, and printing them in seamcall() helps to reduce the code
> to print in all callers.

That's right.  KVM wants to do its own error handling and error messaging.  Its
requirement is different from TDX module initialization. I didn't see much
benefit to unify the function.
  
Chao Gao June 28, 2023, 3:09 a.m. UTC | #5
>+/*
>+ * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
>+ * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
>+ * leaf function return code and the additional output respectively if
>+ * not NULL.
>+ */
>+static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
>+				    u64 *seamcall_ret,
>+				    struct tdx_module_output *out)
>+{
>+	u64 sret;
>+	int cpu;
>+
>+	/* Need a stable CPU id for printing error message */
>+	cpu = get_cpu();
>+	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
>+	put_cpu();
>+
>+	/* Save SEAMCALL return code if the caller wants it */
>+	if (seamcall_ret)
>+		*seamcall_ret = sret;

Hi Kai,

All callers in this series pass NULL for seamcall_ret. I am no sure if
you keep it intentionally.

>+
>+	switch (sret) {
>+	case 0:
>+		/* SEAMCALL was successful */

Nit: if you add

#define TDX_SUCCESS	0

and do

	case TDX_SUCCESS:
		return 0;

then the code becomes self-explanatory. i.e., you can drop the comment.

>+		return 0;
  
Kai Huang June 28, 2023, 3:34 a.m. UTC | #6
On Wed, 2023-06-28 at 11:09 +0800, Chao Gao wrote:
> > +/*
> > + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> > + * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
> > + * leaf function return code and the additional output respectively if
> > + * not NULL.
> > + */
> > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64
> > r9,
> > +				    u64 *seamcall_ret,
> > +				    struct tdx_module_output *out)
> > +{
> > +	u64 sret;
> > +	int cpu;
> > +
> > +	/* Need a stable CPU id for printing error message */
> > +	cpu = get_cpu();
> > +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > +	put_cpu();
> > +
> > +	/* Save SEAMCALL return code if the caller wants it */
> > +	if (seamcall_ret)
> > +		*seamcall_ret = sret;
> 
> Hi Kai,
> 
> All callers in this series pass NULL for seamcall_ret. I am no sure if
> you keep it intentionally.

In this series all the callers doesn't need seamcall_ret.

> 
> > +
> > +	switch (sret) {
> > +	case 0:
> > +		/* SEAMCALL was successful */
> 
> Nit: if you add
> 
> #define TDX_SUCCESS	0
> 
> and do
> 
> 	case TDX_SUCCESS:
> 		return 0;
> 
> then the code becomes self-explanatory. i.e., you can drop the comment.

If using this, I ended up with below:

--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -23,6 +23,8 @@
 #define TDX_SEAMCALL_GP                        (TDX_SW_ERROR | X86_TRAP_GP)
 #define TDX_SEAMCALL_UD                        (TDX_SW_ERROR | X86_TRAP_UD)
 
+#define TDX_SUCCESS           0
+

Hi Kirill/Dave/David,

Are you happy with this?
  
Kirill A. Shutemov June 28, 2023, 11:50 a.m. UTC | #7
On Wed, Jun 28, 2023 at 03:34:05AM +0000, Huang, Kai wrote:
> On Wed, 2023-06-28 at 11:09 +0800, Chao Gao wrote:
> > > +/*
> > > + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> > > + * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
> > > + * leaf function return code and the additional output respectively if
> > > + * not NULL.
> > > + */
> > > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64
> > > r9,
> > > +				    u64 *seamcall_ret,
> > > +				    struct tdx_module_output *out)
> > > +{
> > > +	u64 sret;
> > > +	int cpu;
> > > +
> > > +	/* Need a stable CPU id for printing error message */
> > > +	cpu = get_cpu();
> > > +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > > +	put_cpu();
> > > +
> > > +	/* Save SEAMCALL return code if the caller wants it */
> > > +	if (seamcall_ret)
> > > +		*seamcall_ret = sret;
> > 
> > Hi Kai,
> > 
> > All callers in this series pass NULL for seamcall_ret. I am no sure if
> > you keep it intentionally.
> 
> In this series all the callers doesn't need seamcall_ret.

I'm fine keeping it if it is needed by KVM TDX enabling. Otherwise, just
drop it.

> > > +
> > > +	switch (sret) {
> > > +	case 0:
> > > +		/* SEAMCALL was successful */
> > 
> > Nit: if you add
> > 
> > #define TDX_SUCCESS	0
> > 
> > and do
> > 
> > 	case TDX_SUCCESS:
> > 		return 0;
> > 
> > then the code becomes self-explanatory. i.e., you can drop the comment.
> 
> If using this, I ended up with below:
> 
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -23,6 +23,8 @@
>  #define TDX_SEAMCALL_GP                        (TDX_SW_ERROR | X86_TRAP_GP)
>  #define TDX_SEAMCALL_UD                        (TDX_SW_ERROR | X86_TRAP_UD)
>  
> +#define TDX_SUCCESS           0
> +
> 
> Hi Kirill/Dave/David,
> 
> Are you happy with this?

Sure, looks good.
  
Peter Zijlstra June 28, 2023, 12:58 p.m. UTC | #8
On Tue, Jun 27, 2023 at 02:12:35AM +1200, Kai Huang wrote:

> +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,

__always_inline perhaps? __always_unused seems wrong, worse it's still
there at the end of the series:

$ quilt diff --combine - | grep seamcall
...
+static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
...
+       ret = seamcall(TDH_SYS_INIT, 0, 0, 0, 0, NULL, NULL);
+       ret = seamcall(TDH_SYS_LP_INIT, 0, 0, 0, 0, NULL, NULL);
+       ret = seamcall(TDH_SYS_INFO, sysinfo_pa, TDSYSINFO_STRUCT_SIZE,
+       ret = seamcall(TDH_SYS_CONFIG, __pa(tdmr_pa_array),
+       return seamcall(TDH_SYS_KEY_CONFIG, 0, 0, 0, 0, NULL, NULL);
+               ret = seamcall(TDH_SYS_TDMR_INIT, tdmr->base, 0, 0, 0, NULL,
...

Definitely not unused.

> +				    u64 *seamcall_ret,
> +				    struct tdx_module_output *out)

This interface is atrocious :/ Why have these two ret values? Why can't
that live in a single space -- /me looks throught the callers, and finds
seamcall_ret is unused :-(

Worse, the input (c,d,8,9) is a strict subset of the output
(c,d,8,9,10,11) so why isn't that a single thing used for both input and
output.

struct tdx_call {
	u64 rcx, rdx, r8, r9, r10, r11;
};

static int __always_inline seamcall(u64 fn, struct tdx_call *regs)
{
}


	struct tdx_regs regs = { };
	ret = seamcall(THD_SYS_INIT, &regs);



	struct tdx_regs regs = {
		.rcx = sysinfo_pa,	.rdx = TDXSYSINFO_STRUCT_SIZE,
		.r8  = cmr_array_pa,	.r9  = MAX_CMRS,
	};
	ret = seamcall(THD_SYS_INFO, &regs);
	if (ret)
		return ret;

	print_cmrs(cmr_array, regs.r9);


/me looks more at this stuff and ... WTF!?!?

Can someone explain to me why __tdx_hypercall() is sane (per the above)
but then we grew __tdx_module_call() as an absolute abomination and are
apparently using that for seam too?




> +{
> +	u64 sret;
> +	int cpu;
> +
> +	/* Need a stable CPU id for printing error message */
> +	cpu = get_cpu();

And that's important because? Does having preemption off across the
seamcall make sense? Does it still make sense when you add a loop later?

> +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> +	put_cpu();
> +
> +	/* Save SEAMCALL return code if the caller wants it */
> +	if (seamcall_ret)
> +		*seamcall_ret = sret;
> +
> +	switch (sret) {
> +	case 0:
> +		/* SEAMCALL was successful */
> +		return 0;
> +	case TDX_SEAMCALL_VMFAILINVALID:
> +		pr_err_once("module is not loaded.\n");
> +		return -ENODEV;
> +	default:
> +		pr_err_once("SEAMCALL failed: CPU %d: leaf %llu, error 0x%llx.\n",
> +				cpu, fn, sret);
> +		if (out)
> +			pr_err_once("additional output: rcx 0x%llx, rdx 0x%llx, r8 0x%llx, r9 0x%llx, r10 0x%llx, r11 0x%llx.\n",
> +					out->rcx, out->rdx, out->r8,
> +					out->r9, out->r10, out->r11);

At the very least this lacks { }, but it is quite horrendous coding
style.

Why switch() at all, would not:

	if (!rset)
		return 0;

	if (sret == TDX_SEAMCALL_VMFAILINVALID) {
		pr_nonsense();
		return -ENODEV;
	}

	if (sret == TDX_SEAMCALL_GP) {
		pr_nonsense();
		return -ENODEV;
	}

	if (sret == TDX_SEAMCALL_UD) {
		pr_nonsense();
		return -EINVAL;
	}

	pr_nonsense();
	return -EIO;

be much clearer and have less horrific indenting issues?

> +		return -EIO;
> +	}
> +}
  
Peter Zijlstra June 28, 2023, 1:54 p.m. UTC | #9
On Wed, Jun 28, 2023 at 02:58:13PM +0200, Peter Zijlstra wrote:

> Can someone explain to me why __tdx_hypercall() is sane (per the above)
> but then we grew __tdx_module_call() as an absolute abomination and are
> apparently using that for seam too?

That is, why do we have two different TDCALL wrappers? Makes no sense.
  
Kai Huang June 28, 2023, 11:21 p.m. UTC | #10
On Wed, 2023-06-28 at 14:58 +0200, Peter Zijlstra wrote:
> On Tue, Jun 27, 2023 at 02:12:35AM +1200, Kai Huang wrote:
> 
> > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> 
> __always_inline perhaps? __always_unused seems wrong, worse it's still
> there at the end of the series:
> 
> $ quilt diff --combine - | grep seamcall
> ...
> +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> ...
> +       ret = seamcall(TDH_SYS_INIT, 0, 0, 0, 0, NULL, NULL);
> +       ret = seamcall(TDH_SYS_LP_INIT, 0, 0, 0, 0, NULL, NULL);
> +       ret = seamcall(TDH_SYS_INFO, sysinfo_pa, TDSYSINFO_STRUCT_SIZE,
> +       ret = seamcall(TDH_SYS_CONFIG, __pa(tdmr_pa_array),
> +       return seamcall(TDH_SYS_KEY_CONFIG, 0, 0, 0, 0, NULL, NULL);
> +               ret = seamcall(TDH_SYS_TDMR_INIT, tdmr->base, 0, 0, 0, NULL,
> ...
> 
> Definitely not unused.

Thanks for reviewing!

Sorry obviously I forgot to remove __always_unused in the patch that firstly
used seamcall().  Should be more careful. :(

> 
> > +				    u64 *seamcall_ret,
> > +				    struct tdx_module_output *out)
> 
> This interface is atrocious :/ Why have these two ret values? Why can't
> that live in a single space -- /me looks throught the callers, and finds
> seamcall_ret is unused :-(

I'll @seamcall_ret as also suggested by Kirill.

> 
> Worse, the input (c,d,8,9) is a strict subset of the output
> (c,d,8,9,10,11) so why isn't that a single thing used for both input and
> output.
> 
> struct tdx_call {
> 	u64 rcx, rdx, r8, r9, r10, r11;
> };
> 
> static int __always_inline seamcall(u64 fn, struct tdx_call *regs)
> {
> }
> 
> 
> 	struct tdx_regs regs = { };
> 	ret = seamcall(THD_SYS_INIT, &regs);
> 
> 
> 
> 	struct tdx_regs regs = {
> 		.rcx = sysinfo_pa,	.rdx = TDXSYSINFO_STRUCT_SIZE,
> 		.r8  = cmr_array_pa,	.r9  = MAX_CMRS,
> 	};
> 	ret = seamcall(THD_SYS_INFO, &regs);
> 	if (ret)
> 		return ret;
> 
> 	print_cmrs(cmr_array, regs.r9);
> 
> 
> /me looks more at this stuff and ... WTF!?!?
> 
> Can someone explain to me why __tdx_hypercall() is sane (per the above)
> but then we grew __tdx_module_call() as an absolute abomination and are
> apparently using that for seam too?
> 
> 

Sorry I don't know the story behind __tdx_hypercall().

For TDCALL and SEAMCALL, I believe one reason is they can be used in performance
critical path.  The @out is not always used, so putting all outputs to a
structure can reduce the number of function parameters. I once had separate
struct tdx_seamcall_input {} and struct tdx_seamcall_out {} but wasn't
preferred.

Kirill, could you help to explain?

> 
> 
> > +{
> > +	u64 sret;
> > +	int cpu;
> > +
> > +	/* Need a stable CPU id for printing error message */
> > +	cpu = get_cpu();
> 
> And that's important because? 
> 

I want to have a stable cpu for error message printing.

> Does having preemption off across the
> seamcall make sense? Does it still make sense when you add a loop later?

SEAMCALL itself isn't interruptible, so I think having preemption off around
SEAMCALL is fine.  But I agree disabling preemption around multiple SEAMCALL
isn't ideal.  I'll change that to only disable preemption around one SEAMCALL to
get a correct CPU id for error printing.

> 
> > +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > +	put_cpu();
> > +
> > +	/* Save SEAMCALL return code if the caller wants it */
> > +	if (seamcall_ret)
> > +		*seamcall_ret = sret;
> > +
> > +	switch (sret) {
> > +	case 0:
> > +		/* SEAMCALL was successful */
> > +		return 0;
> > +	case TDX_SEAMCALL_VMFAILINVALID:
> > +		pr_err_once("module is not loaded.\n");
> > +		return -ENODEV;
> > +	default:
> > +		pr_err_once("SEAMCALL failed: CPU %d: leaf %llu, error 0x%llx.\n",
> > +				cpu, fn, sret);
> > +		if (out)
> > +			pr_err_once("additional output: rcx 0x%llx, rdx 0x%llx, r8 0x%llx, r9 0x%llx, r10 0x%llx, r11 0x%llx.\n",
> > +					out->rcx, out->rdx, out->r8,
> > +					out->r9, out->r10, out->r11);
> 
> At the very least this lacks { }, but it is quite horrendous coding
> style.
> 
> Why switch() at all, would not:
> 
> 	if (!rset)
> 		return 0;
> 
> 	if (sret == TDX_SEAMCALL_VMFAILINVALID) {
> 		pr_nonsense();
> 		return -ENODEV;
> 	}
> 
> 	if (sret == TDX_SEAMCALL_GP) {
> 		pr_nonsense();
> 		return -ENODEV;
> 	}
> 
> 	if (sret == TDX_SEAMCALL_UD) {
> 		pr_nonsense();
> 		return -EINVAL;
> 	}
> 
> 	pr_nonsense();
> 	return -EIO;
> 
> be much clearer and have less horrific indenting issues?

I can certainly change to this style.  Thanks.
  
Kai Huang June 28, 2023, 11:25 p.m. UTC | #11
On Wed, 2023-06-28 at 15:54 +0200, Peter Zijlstra wrote:
> On Wed, Jun 28, 2023 at 02:58:13PM +0200, Peter Zijlstra wrote:
> 
> > Can someone explain to me why __tdx_hypercall() is sane (per the above)
> > but then we grew __tdx_module_call() as an absolute abomination and are
> > apparently using that for seam too?
> 
> That is, why do we have two different TDCALL wrappers? Makes no sense.
> 
I think the reason should be TDCALL/SEAMCALL can be used in performance critical
path, but TDVMCALL isn't.

For example, SEAMCALLs are used in KVM's MMU code to handle page fault for TDX
private pages.

Kirill, could you help to clarify?  Thanks.
  
Kai Huang June 28, 2023, 11:31 p.m. UTC | #12
On Wed, 2023-06-28 at 14:50 +0300, kirill.shutemov@linux.intel.com wrote:
> On Wed, Jun 28, 2023 at 03:34:05AM +0000, Huang, Kai wrote:
> > On Wed, 2023-06-28 at 11:09 +0800, Chao Gao wrote:
> > > > +/*
> > > > + * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
> > > > + * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
> > > > + * leaf function return code and the additional output respectively if
> > > > + * not NULL.
> > > > + */
> > > > +static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64
> > > > r9,
> > > > +				    u64 *seamcall_ret,
> > > > +				    struct tdx_module_output *out)
> > > > +{
> > > > +	u64 sret;
> > > > +	int cpu;
> > > > +
> > > > +	/* Need a stable CPU id for printing error message */
> > > > +	cpu = get_cpu();
> > > > +	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
> > > > +	put_cpu();
> > > > +
> > > > +	/* Save SEAMCALL return code if the caller wants it */
> > > > +	if (seamcall_ret)
> > > > +		*seamcall_ret = sret;
> > > 
> > > Hi Kai,
> > > 
> > > All callers in this series pass NULL for seamcall_ret. I am no sure if
> > > you keep it intentionally.
> > 
> > In this series all the callers doesn't need seamcall_ret.
> 
> I'm fine keeping it if it is needed by KVM TDX enabling. Otherwise, just
> drop it.

No problem I'll drop it.  KVM is using __seamcall() anyway.
  
Kai Huang June 29, 2023, 3:40 a.m. UTC | #13
On Wed, 2023-06-28 at 23:21 +0000, Huang, Kai wrote:
> > > +	/* Need a stable CPU id for printing error message */
> > > +	cpu = get_cpu();
> > 
> > And that's important because? 
> > 
> 
> I want to have a stable cpu for error message printing.

Sorry misunderstood your question.

I think having the CPU id on which the SEAMCALL failed in the dmesg would be
better?  But it's not absolutely needed.  I can remove it (thus remove
{get|put}_cpu()) if you prefer not to print?
  
Kirill A. Shutemov June 29, 2023, 10:15 a.m. UTC | #14
On Wed, Jun 28, 2023 at 03:54:36PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 28, 2023 at 02:58:13PM +0200, Peter Zijlstra wrote:
> 
> > Can someone explain to me why __tdx_hypercall() is sane (per the above)
> > but then we grew __tdx_module_call() as an absolute abomination and are
> > apparently using that for seam too?
> 
> That is, why do we have two different TDCALL wrappers? Makes no sense.

__tdx_module_call() is the wrapper for TDCALL.

__tdx_hypercall() is the wrapper for TDG.VP.VMCALL leaf function of
TDCALL. The function is used often and it uses wider range or registers
comparing to the rest of the TDCALL functions.
  
David Hildenbrand June 29, 2023, 11:25 a.m. UTC | #15
>> then the code becomes self-explanatory. i.e., you can drop the comment.
> 
> If using this, I ended up with below:
> 
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -23,6 +23,8 @@
>   #define TDX_SEAMCALL_GP                        (TDX_SW_ERROR | X86_TRAP_GP)
>   #define TDX_SEAMCALL_UD                        (TDX_SW_ERROR | X86_TRAP_UD)
>   
> +#define TDX_SUCCESS           0
> +
> 
> Hi Kirill/Dave/David,
> 
> Are you happy with this?

Yes, all sounds good to me!
  

Patch

diff --git a/arch/x86/virt/vmx/tdx/Makefile b/arch/x86/virt/vmx/tdx/Makefile
index 93ca8b73e1f1..38d534f2c113 100644
--- a/arch/x86/virt/vmx/tdx/Makefile
+++ b/arch/x86/virt/vmx/tdx/Makefile
@@ -1,2 +1,2 @@ 
 # SPDX-License-Identifier: GPL-2.0-only
-obj-y += tdx.o
+obj-y += tdx.o seamcall.o
diff --git a/arch/x86/virt/vmx/tdx/seamcall.S b/arch/x86/virt/vmx/tdx/seamcall.S
new file mode 100644
index 000000000000..f81be6b9c133
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/seamcall.S
@@ -0,0 +1,52 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/linkage.h>
+#include <asm/frame.h>
+
+#include "tdxcall.S"
+
+/*
+ * __seamcall() - Host-side interface functions to SEAM software module
+ *		  (the P-SEAMLDR or the TDX module).
+ *
+ * Transform function call register arguments into the SEAMCALL register
+ * ABI.  Return TDX_SEAMCALL_VMFAILINVALID if the SEAMCALL itself fails,
+ * or the completion status of the SEAMCALL leaf function.  Additional
+ * output operands are saved in @out (if it is provided by the caller).
+ *
+ *-------------------------------------------------------------------------
+ * SEAMCALL ABI:
+ *-------------------------------------------------------------------------
+ * Input Registers:
+ *
+ * RAX                 - SEAMCALL Leaf number.
+ * RCX,RDX,R8-R9       - SEAMCALL Leaf specific input registers.
+ *
+ * Output Registers:
+ *
+ * RAX                 - SEAMCALL completion status code.
+ * RCX,RDX,R8-R11      - SEAMCALL Leaf specific output registers.
+ *
+ *-------------------------------------------------------------------------
+ *
+ * __seamcall() function ABI:
+ *
+ * @fn  (RDI)          - SEAMCALL Leaf number, moved to RAX
+ * @rcx (RSI)          - Input parameter 1, moved to RCX
+ * @rdx (RDX)          - Input parameter 2, moved to RDX
+ * @r8  (RCX)          - Input parameter 3, moved to R8
+ * @r9  (R8)           - Input parameter 4, moved to R9
+ *
+ * @out (R9)           - struct tdx_module_output pointer
+ *			 stored temporarily in R12 (not
+ *			 used by the P-SEAMLDR or the TDX
+ *			 module). It can be NULL.
+ *
+ * Return (via RAX) the completion status of the SEAMCALL, or
+ * TDX_SEAMCALL_VMFAILINVALID.
+ */
+SYM_FUNC_START(__seamcall)
+	FRAME_BEGIN
+	TDX_MODULE_CALL host=1
+	FRAME_END
+	RET
+SYM_FUNC_END(__seamcall)
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 908590e85749..f8233cba5931 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -12,14 +12,56 @@ 
 #include <linux/init.h>
 #include <linux/errno.h>
 #include <linux/printk.h>
+#include <linux/smp.h>
 #include <asm/msr-index.h>
 #include <asm/msr.h>
 #include <asm/tdx.h>
+#include "tdx.h"
 
 static u32 tdx_global_keyid __ro_after_init;
 static u32 tdx_guest_keyid_start __ro_after_init;
 static u32 tdx_nr_guest_keyids __ro_after_init;
 
+/*
+ * Wrapper of __seamcall() to convert SEAMCALL leaf function error code
+ * to kernel error code.  @seamcall_ret and @out contain the SEAMCALL
+ * leaf function return code and the additional output respectively if
+ * not NULL.
+ */
+static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+				    u64 *seamcall_ret,
+				    struct tdx_module_output *out)
+{
+	u64 sret;
+	int cpu;
+
+	/* Need a stable CPU id for printing error message */
+	cpu = get_cpu();
+	sret = __seamcall(fn, rcx, rdx, r8, r9, out);
+	put_cpu();
+
+	/* Save SEAMCALL return code if the caller wants it */
+	if (seamcall_ret)
+		*seamcall_ret = sret;
+
+	switch (sret) {
+	case 0:
+		/* SEAMCALL was successful */
+		return 0;
+	case TDX_SEAMCALL_VMFAILINVALID:
+		pr_err_once("module is not loaded.\n");
+		return -ENODEV;
+	default:
+		pr_err_once("SEAMCALL failed: CPU %d: leaf %llu, error 0x%llx.\n",
+				cpu, fn, sret);
+		if (out)
+			pr_err_once("additional output: rcx 0x%llx, rdx 0x%llx, r8 0x%llx, r9 0x%llx, r10 0x%llx, r11 0x%llx.\n",
+					out->rcx, out->rdx, out->r8,
+					out->r9, out->r10, out->r11);
+		return -EIO;
+	}
+}
+
 static int __init record_keyid_partitioning(u32 *tdx_keyid_start,
 					    u32 *nr_tdx_keyids)
 {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
new file mode 100644
index 000000000000..48ad1a1ba737
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -0,0 +1,10 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _X86_VIRT_TDX_H
+#define _X86_VIRT_TDX_H
+
+#include <linux/types.h>
+
+struct tdx_module_output;
+u64 __seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+	       struct tdx_module_output *out);
+#endif