From patchwork Tue Oct 3 06:24:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Xin3" X-Patchwork-Id: 147668 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2a8e:b0:403:3b70:6f57 with SMTP id in14csp1899294vqb; Mon, 2 Oct 2023 23:57:09 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEQ/+TGkNVhy1/XIQupk+T8ktkzyIhy2T8tNWuS77fsdmpRyZdS/GMIhrjBs4TxG9cNRn5+ X-Received: by 2002:a05:6a00:478f:b0:68e:2c2a:aa1d with SMTP id dh15-20020a056a00478f00b0068e2c2aaa1dmr2176478pfb.11.1696316229410; Mon, 02 Oct 2023 23:57:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696316229; cv=none; d=google.com; s=arc-20160816; b=kUMbbnXYOSO3mMxbucy0OlT/gvqyMWUKHw3OKgcL7hRs9jMU/NwbQQfunV3dG1Wt2l niKk759eCKSKVqIPA3bS2w92m/YeUFHNgwmrPkevaC5S70mhB2hApK6GaaSmJ4X10qG9 RQPyFbgLV4SaKQK4zMMzqOcK6SStHq09fPIuTVCuUS/OpPv+rgMLSEXQQOa+TJdvpj/V nFUnXX1rVU7C4A7M8o+2xn4wU5N0kZG58ldxX6+ebP+QeVqW6oQeoHgRS0549M2xYksw FBWSQ7eMyZLpwQj8gdo/+mZJSU3VOf0qYBs0Jf8/jfYfOyctgRK/PxjpkkTk6Fyl3It8 tgew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=B8dhvvLwVto7H04qMb5wWdnQ6PcXHI4Uph7ke+BlMEw=; fh=Tb1p8S3iOxe/kX/WSNC3xEJzSVgGJ5DaxSrKb+pzH+4=; b=ya6+U1vD4ClVQ+g2pHjuuMEFPL3VJJPdZwOHFeF0PwFKX1Iyu9QJhLB5MnwxYVBlkP ndBMwL4wu+Hk7aa/X8ge3iFZESsqX4qMZMYLRopYGzmVachOtQpDZOR4wc8U7MVol5vd 9jYUUpnZt2pACdKU1g5qmKx9uZBPZ9Wu3brcxXXvklv9OVX7R71NbxJB39HAXftUvJ/W l0DCY0MF3qPJqg43tb1GPk+pvrjLs6EFb2Mzxxrbzl2a6Fee3WN7cbkylQnjg4urElO4 oyaY8r5sjGewclVF45cXyCjIqHE4hhIJTkB8p4PW2Fy/wwqg4Mrbdq8jkjqtmNISFMfw LmpA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=F1e+pEta; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from howler.vger.email (howler.vger.email. [2620:137:e000::3:4]) by mx.google.com with ESMTPS id f20-20020a637554000000b00578daf6f4e2si808846pgn.191.2023.10.02.23.57.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 23:57:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) client-ip=2620:137:e000::3:4; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=F1e+pEta; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id 52F6E80B9518; Mon, 2 Oct 2023 23:57:08 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239319AbjJCG5E (ORCPT + 18 others); Tue, 3 Oct 2023 02:57:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57352 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231184AbjJCG4c (ORCPT ); Tue, 3 Oct 2023 02:56:32 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A053426BD; Mon, 2 Oct 2023 23:55:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1696316144; x=1727852144; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=t7LG5n12hETRnrxNhjk5swu4X1wUVGmoU9KoK3uS4dI=; b=F1e+pEtaxnJ4Oy4HH1bj8o0e/1Fxr8DEvXvgTJG0d9XdFrJvzDDuQoUI rXqrX9USAKXOrfNzgdND4gdinckRyOMB7DP7hUAkeuuqUuwQy3CDWnaSb Rjzx7aIHy4Xv+7g5DakwGQ/ChRtl7z9I7nL7209/AESE7pf8ES3Sw3FGg 2ExgD6tMY6LGGu2qrRG5ZRE0UUDHoZ//PSeT48uAyTYLc/HV4SZeb4MF6 Jz8uVNi3ILQ7pvrAQg0kZTi1719yqzW6A+nZf+aCKQLYLgB9JNJaFzRxi VHETW7dUHbjCZU45i8PJOCUhtmwPihjKe8FFSm+i7DHDnVWe1HfouAyPv w==; X-IronPort-AV: E=McAfee;i="6600,9927,10851"; a="367858294" X-IronPort-AV: E=Sophos;i="6.03,196,1694761200"; d="scan'208";a="367858294" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Oct 2023 23:54:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10851"; a="1081901001" X-IronPort-AV: E=Sophos;i="6.03,196,1694761200"; d="scan'208";a="1081901001" Received: from unknown (HELO fred..) ([172.25.112.68]) by fmsmga005.fm.intel.com with ESMTP; 02 Oct 2023 23:54:50 -0700 From: Xin Li To: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org, kvm@vger.kernel.org, xen-devel@lists.xenproject.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, luto@kernel.org, pbonzini@redhat.com, seanjc@google.com, peterz@infradead.org, jgross@suse.com, ravi.v.shankar@intel.com, mhiramat@kernel.org, andrew.cooper3@citrix.com, jiangshanlai@gmail.com, nik.borisov@suse.com Subject: [PATCH v12 31/37] x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user Date: Mon, 2 Oct 2023 23:24:52 -0700 Message-Id: <20231003062458.23552-32-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231003062458.23552-1-xin3.li@intel.com> References: <20231003062458.23552-1-xin3.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Mon, 02 Oct 2023 23:57:08 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778716486554141302 X-GMAIL-MSGID: 1778716486554141302 If the stack frame contains an invalid user context (e.g. due to invalid SS, a non-canonical RIP, etc.) the ERETU instruction will trap (#SS or #GP). From a Linux point of view, this really should be considered a user space failure, so use the standard fault fixup mechanism to intercept the fault, fix up the exception frame, and redirect execution to fred_entrypoint_user. The end result is that it appears just as if the hardware had taken the exception immediately after completing the transition to user space. Suggested-by: H. Peter Anvin (Intel) Tested-by: Shan Kang Signed-off-by: Xin Li --- Changes since v8: * Reflect the FRED spec 5.0 change that ERETS and ERETU add 8 to %rsp before popping the return context from the stack. Changes since v6: * Add a comment to explain why it is safe to write to the previous FRED stack frame. (Lai Jiangshan). Changes since v5: * Move the NMI bit from an invalid stack frame, which caused ERETU to fault, to the fault handler's stack frame, thus to unblock NMI ASAP if NMI is blocked (Lai Jiangshan). --- arch/x86/entry/entry_64_fred.S | 5 +- arch/x86/include/asm/extable_fixup_types.h | 4 +- arch/x86/mm/extable.c | 79 ++++++++++++++++++++++ 3 files changed, 86 insertions(+), 2 deletions(-) diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S index 5781c3411b44..d1c2fc4af8ae 100644 --- a/arch/x86/entry/entry_64_fred.S +++ b/arch/x86/entry/entry_64_fred.S @@ -3,6 +3,7 @@ * The actual FRED entry points. */ +#include #include #include "calling.h" @@ -34,7 +35,9 @@ SYM_CODE_START_NOALIGN(asm_fred_entrypoint_user) call fred_entry_from_user SYM_INNER_LABEL(asm_fred_exit_user, SYM_L_GLOBAL) FRED_EXIT - ERETU +1: ERETU + + _ASM_EXTABLE_TYPE(1b, asm_fred_entrypoint_user, EX_TYPE_ERETU) SYM_CODE_END(asm_fred_entrypoint_user) .fill asm_fred_entrypoint_kernel - ., 1, 0xcc diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h index 991e31cfde94..1585c798a02f 100644 --- a/arch/x86/include/asm/extable_fixup_types.h +++ b/arch/x86/include/asm/extable_fixup_types.h @@ -64,6 +64,8 @@ #define EX_TYPE_UCOPY_LEN4 (EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4)) #define EX_TYPE_UCOPY_LEN8 (EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8)) -#define EX_TYPE_ZEROPAD 20 /* longword load with zeropad on fault */ +#define EX_TYPE_ZEROPAD 20 /* longword load with zeropad on fault */ + +#define EX_TYPE_ERETU 21 #endif diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index 271dcb2deabc..fc40a4e12f3a 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -6,6 +6,7 @@ #include #include +#include #include #include #include @@ -223,6 +224,80 @@ static bool ex_handler_ucopy_len(const struct exception_table_entry *fixup, return ex_handler_uaccess(fixup, regs, trapnr, fault_address); } +#ifdef CONFIG_X86_FRED +static bool ex_handler_eretu(const struct exception_table_entry *fixup, + struct pt_regs *regs, unsigned long error_code) +{ + struct pt_regs *uregs = (struct pt_regs *) + (regs->sp - offsetof(struct pt_regs, orig_ax)); + unsigned short ss = uregs->ss; + unsigned short cs = uregs->cs; + + /* + * Move the NMI bit from the invalid stack frame, which caused ERETU + * to fault, to the fault handler's stack frame, thus to unblock NMI + * with the fault handler's ERETS instruction ASAP if NMI is blocked. + */ + regs->fred_ss.nmi = uregs->fred_ss.nmi; + + /* + * Sync event information to uregs, i.e., the ERETU return frame, but + * is it safe to write to the ERETU return frame which is just above + * current event stack frame? + * + * The RSP used by FRED to push a stack frame is not the value in %rsp, + * it is calculated from %rsp with the following 2 steps: + * 1) RSP = %rsp - (IA32_FRED_CONFIG & 0x1c0) // Reserve N*64 bytes + * 2) RSP = RSP & ~0x3f // Align to a 64-byte cache line + * when an event delivery doesn't trigger a stack level change. + * + * Here is an example with N*64 (N=1) bytes reserved: + * + * 64-byte cache line ==> ______________ + * |___Reserved___| + * |__Event_data__| + * |_____SS_______| + * |_____RSP______| + * |_____FLAGS____| + * |_____CS_______| + * |_____IP_______| + * 64-byte cache line ==> |__Error_code__| <== ERETU return frame + * |______________| + * |______________| + * |______________| + * |______________| + * |______________| + * |______________| + * |______________| + * 64-byte cache line ==> |______________| <== RSP after step 1) and 2) + * |___Reserved___| + * |__Event_data__| + * |_____SS_______| + * |_____RSP______| + * |_____FLAGS____| + * |_____CS_______| + * |_____IP_______| + * 64-byte cache line ==> |__Error_code__| <== ERETS return frame + * + * Thus a new FRED stack frame will always be pushed below a previous + * FRED stack frame ((N*64) bytes may be reserved between), and it is + * safe to write to a previous FRED stack frame as they never overlap. + */ + fred_info(uregs)->edata = fred_event_data(regs); + uregs->ssx = regs->ssx; + uregs->fred_ss.ss = ss; + /* The NMI bit was moved away above */ + uregs->fred_ss.nmi = 0; + uregs->csx = regs->csx; + uregs->fred_cs.sl = 0; + uregs->fred_cs.wfe = 0; + uregs->cs = cs; + uregs->orig_ax = error_code; + + return ex_handler_default(fixup, regs); +} +#endif + int ex_get_fixup_type(unsigned long ip) { const struct exception_table_entry *e = search_exception_tables(ip); @@ -300,6 +375,10 @@ int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code, return ex_handler_ucopy_len(e, regs, trapnr, fault_addr, reg, imm); case EX_TYPE_ZEROPAD: return ex_handler_zeropad(e, regs, fault_addr); +#ifdef CONFIG_X86_FRED + case EX_TYPE_ERETU: + return ex_handler_eretu(e, regs, error_code); +#endif } BUG(); }