Message ID | d08c5e27dd7377564d69648f3eb7b56d3c95b84b.1689151537.git.kai.huang@intel.com |
---|---|
State | New |
Headers |
From: Kai Huang <kai.huang@intel.com>
To: peterz@infradead.org, kirill.shutemov@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dave.hansen@intel.com, tglx@linutronix.de, bp@alien8.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org, seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, isaku.yamahata@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com, kai.huang@intel.com
Subject: [PATCH 10/10] x86/virt/tdx: Allow SEAMCALL to handle #UD and #GP
Date: Wed, 12 Jul 2023 20:55:24 +1200
Message-ID: <d08c5e27dd7377564d69648f3eb7b56d3c95b84b.1689151537.git.kai.huang@intel.com>
In-Reply-To: <cover.1689151537.git.kai.huang@intel.com>
References: <cover.1689151537.git.kai.huang@intel.com> |
Series |
Unify TDCALL/SEAMCALL and TDVMCALL assembly
|
|
Commit Message
Kai Huang
July 12, 2023, 8:55 a.m. UTC
On platforms with the "partial write machine check" erratum, a kernel
partial write to TDX private memory may cause an unexpected machine
check. It would be nice if the #MC handler could print additional
information showing that the #MC was a TDX private memory error due to
a possible kernel bug.
To do that, the machine check handler needs to use SEAMCALL to query
the page type of the faulting memory from the TDX module, because there
is no existing infrastructure to track TDX private pages.
The SEAMCALL instruction causes a #UD if the CPU isn't in VMX
operation, and in the #MC handler it is legal for the CPU not to be in
VMX operation when making this SEAMCALL. Extend the TDX_MODULE_CALL
macro to handle #UD so the SEAMCALL can return an error code instead of
Oopsing in the #MC handler. Opportunistically handle #GP too, since
the two share the same code.
A bonus is that when the kernel mistakenly calls SEAMCALL while the CPU
isn't in VMX operation, or when TDX isn't enabled by the BIOS, or when
the BIOS is buggy, the kernel gets a nicer error message rather than a
less understandable Oops.
This is basically based on Peter's code.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
arch/x86/include/asm/tdx.h | 5 +++++
arch/x86/virt/vmx/tdx/tdxcall.S | 20 ++++++++++++++++++++
2 files changed, 25 insertions(+)
Comments
On Wed, Jul 12, 2023 at 08:55:24PM +1200, Kai Huang wrote:
> @@ -85,6 +86,7 @@
>  	.endif	/* \saved */
> 
>  	.if \host
> +1:
>  	seamcall
>  	/*
>  	 * SEAMCALL instruction is essentially a VMExit from VMX root
> @@ -99,6 +101,7 @@
>  	 */
>  	mov $TDX_SEAMCALL_VMFAILINVALID, %rdi
>  	cmovc %rdi, %rax
> +2:
>  	.else
>  	tdcall
>  	.endif

This is just wrong, if the thing traps you should not do the return
registers. And at this point the mov/cmovc thing doesn't make much sense
either.

> @@ -185,4 +188,21 @@
> 
>  	FRAME_END
>  	RET
> +
> +	.if \host
> +3:
> +	/*
> +	 * SEAMCALL caused #GP or #UD.  By reaching here %eax contains
> +	 * the trap number.  Convert the trap number to the TDX error
> +	 * code by setting TDX_SW_ERROR to the high 32-bits of %rax.
> +	 *
> +	 * Note cannot OR TDX_SW_ERROR directly to %rax as OR instruction
> +	 * only accepts 32-bit immediate at most.
> +	 */
> +	movq $TDX_SW_ERROR, %r12
> +	orq %r12, %rax
> +	jmp 2b
> +
> +	_ASM_EXTABLE_FAULT(1b, 3b)
> +	.endif	/* \host */
> .endm

Also, please use named labels where possible and *please* keep asm
directives unindented.

--- a/arch/x86/virt/vmx/tdx/tdxcall.S
+++ b/arch/x86/virt/vmx/tdx/tdxcall.S
@@ -56,7 +56,7 @@
 	movq	TDX_MODULE_r10(%rsi), %r10
 	movq	TDX_MODULE_r11(%rsi), %r11
 
-	.if \saved
+.if \saved
 	/*
 	 * Move additional input regs from the structure. For simplicity
 	 * assume that anything needs the callee-saved regs also tramples
@@ -75,18 +75,18 @@
 	movq	TDX_MODULE_r15(%rsi), %r15
 	movq	TDX_MODULE_rbx(%rsi), %rbx
 
-	.if \ret
+.if \ret
 	/* Save the structure pointer as %rsi is about to be clobbered */
 	pushq	%rsi
-	.endif
+.endif
 
 	movq	TDX_MODULE_rdi(%rsi), %rdi
 	/* %rsi needs to be done at last */
 	movq	TDX_MODULE_rsi(%rsi), %rsi
-	.endif	/* \saved */
+.endif /* \saved */
 
-	.if \host
-1:
+.if \host
+.Lseamcall:
 	seamcall
 	/*
 	 * SEAMCALL instruction is essentially a VMExit from VMX root
@@ -99,15 +99,13 @@
 	 * This value will never be used as actual SEAMCALL error code as
 	 * it is from the Reserved status code class.
 	 */
-	mov $TDX_SEAMCALL_VMFAILINVALID, %rdi
-	cmovc %rdi, %rax
-2:
-	.else
+	jc .Lseamfail
+.else
 	tdcall
-	.endif
+.endif
 
-	.if \ret
-	.if \saved
+.if \ret
+.if \saved
 	/*
 	 * Restore the structure from stack to saved the output registers
 	 *
@@ -136,7 +134,7 @@
 	movq	%r15, TDX_MODULE_r15(%rsi)
 	movq	%rbx, TDX_MODULE_rbx(%rsi)
 	movq	%rdi, TDX_MODULE_rdi(%rsi)
-	.endif	/* \saved */
+.endif /* \saved */
 
 	/* Copy output regs to the structure */
 	movq	%rcx, TDX_MODULE_rcx(%rsi)
@@ -145,10 +143,11 @@
 	movq	%r9, TDX_MODULE_r9(%rsi)
 	movq	%r10, TDX_MODULE_r10(%rsi)
 	movq	%r11, TDX_MODULE_r11(%rsi)
-	.endif	/* \ret */
+.endif /* \ret */
 
-	.if \saved
-	.if \ret
+.Lout:
+.if \saved
+.if \ret
 	/*
 	 * Clear registers shared by guest for VP.ENTER and VP.VMCALL to
 	 * prevent speculative use of values from guest/VMM, including
@@ -170,13 +169,8 @@
 	xorq %r9, %r9
 	xorq %r10, %r10
 	xorq %r11, %r11
-	xorq %r12, %r12
-	xorq %r13, %r13
-	xorq %r14, %r14
-	xorq %r15, %r15
-	xorq %rbx, %rbx
 	xorq %rdi, %rdi
-	.endif	/* \ret */
+.endif /* \ret */
 
 	/* Restore callee-saved GPRs as mandated by the x86_64 ABI */
 	popq %r15
@@ -184,13 +178,17 @@
 	popq %r13
 	popq %r12
 	popq %rbx
-	.endif	/* \saved */
+.endif /* \saved */
 
 	FRAME_END
 	RET
 
-	.if \host
-3:
+.if \host
+.Lseamfail:
+	mov $TDX_SEAMCALL_VMFAILINVALID, %rax
+	jmp .Lout
+
+.Lseamtrap:
 	/*
 	 * SEAMCALL caused #GP or #UD.  By reaching here %eax contains
 	 * the trap number.  Convert the trap number to the TDX error
@@ -201,8 +199,8 @@
 	 */
 	movq $TDX_SW_ERROR, %r12
 	orq %r12, %rax
-	jmp 2b
+	jmp .Lout
 
-	_ASM_EXTABLE_FAULT(1b, 3b)
-	.endif	/* \host */
+	_ASM_EXTABLE_FAULT(.Lseamcall, .Lseamtrap)
+.endif /* \host */
 .endm
On Thu, 2023-07-13 at 10:07 +0200, Peter Zijlstra wrote:
> On Wed, Jul 12, 2023 at 08:55:24PM +1200, Kai Huang wrote:
> > @@ -85,6 +86,7 @@
> >  	.endif	/* \saved */
> > 
> >  	.if \host
> > +1:
> >  	seamcall
> >  	/*
> >  	 * SEAMCALL instruction is essentially a VMExit from VMX root
> > @@ -99,6 +101,7 @@
> >  	 */
> >  	mov $TDX_SEAMCALL_VMFAILINVALID, %rdi
> >  	cmovc %rdi, %rax
> > +2:
> >  	.else
> >  	tdcall
> >  	.endif
> 
> This is just wrong, if the thing traps you should not do the return
> registers. And at this point the mov/cmovc thing doesn't make much sense
> either.

OK will do.

Yes, "doing the return registers" isn't necessary. I thought that to
keep the code simple we could just do it; the trap/VMFAILINVALID code
path isn't a performance path anyway.

This is a problem in the current upstream code too. I'll fix it first
in a separate patch.

> > @@ -185,4 +188,21 @@
> > 
> >  	FRAME_END
> >  	RET
> > +
> > +	.if \host
> > +3:
> > +	/*
> > +	 * SEAMCALL caused #GP or #UD.  By reaching here %eax contains
> > +	 * the trap number.  Convert the trap number to the TDX error
> > +	 * code by setting TDX_SW_ERROR to the high 32-bits of %rax.
> > +	 *
> > +	 * Note cannot OR TDX_SW_ERROR directly to %rax as OR instruction
> > +	 * only accepts 32-bit immediate at most.
> > +	 */
> > +	movq $TDX_SW_ERROR, %r12
> > +	orq %r12, %rax
> > +	jmp 2b
> > +
> > +	_ASM_EXTABLE_FAULT(1b, 3b)
> > +	.endif	/* \host */
> > .endm
> 
> Also, please use named labels where possible and *please* keep asm
> directives unindented.

Yes will do.

> --- a/arch/x86/virt/vmx/tdx/tdxcall.S
> +++ b/arch/x86/virt/vmx/tdx/tdxcall.S
[...]
> @@ -201,8 +199,8 @@
>  	 */
>  	movq $TDX_SW_ERROR, %r12
>  	orq %r12, %rax
> -	jmp 2b
> +	jmp .Lout

Thanks for the code. There might be a stack balancing issue here; I'll
double check when updating this patch. Thanks!

> 
> -	_ASM_EXTABLE_FAULT(1b, 3b)
> -	.endif	/* \host */
> +	_ASM_EXTABLE_FAULT(.Lseamcall, .Lseamtrap)
> +.endif /* \host */
> .endm
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index a82e5249d079..feb85316346e 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -8,6 +8,8 @@
 #include <asm/ptrace.h>
 #include <asm/shared/tdx.h>
 
+#include <asm/trapnr.h>
+
 /*
  * SW-defined error codes.
  *
@@ -18,6 +20,9 @@
 #define TDX_SW_ERROR			(TDX_ERROR | GENMASK_ULL(47, 40))
 #define TDX_SEAMCALL_VMFAILINVALID	(TDX_SW_ERROR | _UL(0xFFFF0000))
 
+#define TDX_SEAMCALL_GP			(TDX_SW_ERROR | X86_TRAP_GP)
+#define TDX_SEAMCALL_UD			(TDX_SW_ERROR | X86_TRAP_UD)
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/x86/virt/vmx/tdx/tdxcall.S b/arch/x86/virt/vmx/tdx/tdxcall.S
index e4e90ebf5dad..04b0c466f38c 100644
--- a/arch/x86/virt/vmx/tdx/tdxcall.S
+++ b/arch/x86/virt/vmx/tdx/tdxcall.S
@@ -2,6 +2,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/frame.h>
 #include <asm/tdx.h>
+#include <asm/asm.h>
 
 /*
  * TDCALL and SEAMCALL are supported in Binutils >= 2.36.
@@ -85,6 +86,7 @@
 	.endif	/* \saved */
 
 	.if \host
+1:
 	seamcall
 	/*
 	 * SEAMCALL instruction is essentially a VMExit from VMX root
@@ -99,6 +101,7 @@
 	 */
 	mov $TDX_SEAMCALL_VMFAILINVALID, %rdi
 	cmovc %rdi, %rax
+2:
 	.else
 	tdcall
 	.endif
@@ -185,4 +188,21 @@
 
 	FRAME_END
 	RET
+
+	.if \host
+3:
+	/*
+	 * SEAMCALL caused #GP or #UD.  By reaching here %eax contains
+	 * the trap number.  Convert the trap number to the TDX error
+	 * code by setting TDX_SW_ERROR to the high 32-bits of %rax.
+	 *
+	 * Note cannot OR TDX_SW_ERROR directly to %rax as OR instruction
+	 * only accepts 32-bit immediate at most.
+	 */
+	movq $TDX_SW_ERROR, %r12
+	orq %r12, %rax
+	jmp 2b
+
+	_ASM_EXTABLE_FAULT(1b, 3b)
+	.endif	/* \host */
 .endm