From patchwork Wed Jul 19 22:47:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: tip-bot2 for Thomas Gleixner
X-Patchwork-Id: 122875
Date: Wed, 19 Jul 2023 22:47:37 -0000
From: "tip-bot2 for Rick Edgecombe"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/shstk] x86/mm: Check shadow stack page fault errors
Cc: "Yu-cheng Yu", Rick Edgecombe, Dave Hansen, "Borislav Petkov (AMD)",
 Kees Cook, "Mike Rapoport (IBM)", Pengfei Xu, John Allen, x86@kernel.org,
 linux-kernel@vger.kernel.org
Message-ID: <168980685764.28540.16475557765891889796.tip-bot2@tip-bot2>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the x86/shstk branch of tip:

Commit-ID:     fd5439e0c97bbc469d0a263ce343241fc48200ea
Gitweb:        https://git.kernel.org/tip/fd5439e0c97bbc469d0a263ce343241fc48200ea
Author:        Rick Edgecombe
AuthorDate:    Mon, 12 Jun 2023 17:10:41 -07:00
Committer:     Rick Edgecombe
CommitterDate: Tue, 11 Jul 2023 14:12:19 -07:00

x86/mm: Check shadow stack page fault errors

The CPU performs "shadow stack accesses" when it expects to encounter
shadow stack mappings. These accesses can be implicit (via CALL/RET
instructions) or explicit (instructions like WRSS).

Shadow stack accesses to shadow-stack mappings can result in faults in
normal, valid operation just like regular accesses to regular mappings.
Shadow stacks need some of the same features like delayed allocation, swap
and copy-on-write. The kernel needs to use faults to implement those
features.

The architecture has concepts of both shadow stack reads and shadow stack
writes. Any shadow stack access to non-shadow stack memory will generate
a fault with the shadow stack error code bit set.

This means that, unlike normal write protection, the fault handler needs
to create a type of memory that can be written to (with instructions that
generate shadow stack writes), even to fulfill a read access. So in the
case of COW memory, the COW needs to take place even with a shadow stack
read. Otherwise the page will be left (shadow stack) writable in
userspace. So to trigger the appropriate behavior, set FAULT_FLAG_WRITE
for shadow stack accesses, even if the access was a shadow stack read.

For the purpose of making this clearer, consider the following example.
If a process has a shadow stack, and forks, the shadow stack PTEs will
become read-only due to COW. If the CPU in one process performs a shadow
stack read access to the shadow stack, for example executing a RET and
causing the CPU to read the shadow stack copy of the return address, then
in order for the fault to be resolved the PTE will need to be set with
shadow stack permissions. But then the memory would be changeable from
userspace (from CALL, RET, WRSS, etc).
So this scenario needs to trigger COW, otherwise the shared page would be
changeable from both processes.

Shadow stack accesses can also result in errors, such as when a shadow
stack overflows, or if a shadow stack access occurs to a non-shadow-stack
mapping. Also, generate the errors for invalid shadow stack accesses.

Co-developed-by: Yu-cheng Yu
Signed-off-by: Yu-cheng Yu
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
Reviewed-by: Kees Cook
Acked-by: Mike Rapoport (IBM)
Tested-by: Pengfei Xu
Tested-by: John Allen
Tested-by: Kees Cook
Link: https://lore.kernel.org/all/20230613001108.3040476-16-rick.p.edgecombe%40intel.com
---
 arch/x86/include/asm/trap_pf.h |  2 ++
 arch/x86/mm/fault.c            | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/arch/x86/include/asm/trap_pf.h b/arch/x86/include/asm/trap_pf.h
index 10b1de5..afa5243 100644
--- a/arch/x86/include/asm/trap_pf.h
+++ b/arch/x86/include/asm/trap_pf.h
@@ -11,6 +11,7 @@
  * bit 3 == 1: use of reserved bit detected
  * bit 4 == 1: fault was an instruction fetch
  * bit 5 == 1: protection keys block access
+ * bit 6 == 1: shadow stack access fault
  * bit 15 == 1: SGX MMU page-fault
  */
 enum x86_pf_error_code {
@@ -20,6 +21,7 @@ enum x86_pf_error_code {
 	X86_PF_RSVD = 1 << 3,
 	X86_PF_INSTR = 1 << 4,
 	X86_PF_PK = 1 << 5,
+	X86_PF_SHSTK = 1 << 6,
 	X86_PF_SGX = 1 << 15,
 };
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e8711b2..8dff37e 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1112,8 +1112,22 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
 				       (error_code & X86_PF_INSTR), foreign))
 		return 1;
 
+	/*
+	 * Shadow stack accesses (PF_SHSTK=1) are only permitted to
+	 * shadow stack VMAs. All other accesses result in an error.
+	 */
+	if (error_code & X86_PF_SHSTK) {
+		if (unlikely(!(vma->vm_flags & VM_SHADOW_STACK)))
+			return 1;
+		if (unlikely(!(vma->vm_flags & VM_WRITE)))
+			return 1;
+		return 0;
+	}
+
 	if (error_code & X86_PF_WRITE) {
 		/* write, present and write, not present: */
+		if (unlikely(vma->vm_flags & VM_SHADOW_STACK))
+			return 1;
 		if (unlikely(!(vma->vm_flags & VM_WRITE)))
 			return 1;
 		return 0;
@@ -1305,6 +1319,14 @@ void do_user_addr_fault(struct pt_regs *regs,
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
+	/*
+	 * Read-only permissions can not be expressed in shadow stack PTEs.
+	 * Treat all shadow stack accesses as WRITE faults. This ensures
+	 * that the MM will prepare everything (e.g., break COW) such that
+	 * maybe_mkwrite() can create a proper shadow stack PTE.
+	 */
+	if (error_code & X86_PF_SHSTK)
+		flags |= FAULT_FLAG_WRITE;
 	if (error_code & X86_PF_WRITE)
 		flags |= FAULT_FLAG_WRITE;
 	if (error_code & X86_PF_INSTR)
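
Illustration only, not part of the commit: the two decisions this patch
adds can be modelled in user space to see which combinations are rejected
and which are promoted to write faults. The sketch below is an assumption-
laden model, not kernel code. X86_PF_WRITE and X86_PF_SHSTK use the
error-code bit positions from trap_pf.h, while the MODEL_* flag values and
the model_*() function names are hypothetical and do not match the
kernel's real VM_* or FAULT_FLAG_* definitions. The read and
instruction-fetch checks of the real access_error() are left out.

```c
/* User-space model of the checks added by this patch; not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Page-fault error-code bits, as defined in trap_pf.h. */
#define X86_PF_WRITE		(1UL << 1)
#define X86_PF_SHSTK		(1UL << 6)

/* Hypothetical values for this model only; the kernel's values differ. */
#define MODEL_VM_WRITE		(1UL << 0)
#define MODEL_VM_SHADOW_STACK	(1UL << 1)
#define MODEL_FAULT_FLAG_WRITE	(1U << 0)

/*
 * Mirrors the two branches the patch touches in access_error().
 * Returns true when the access must be treated as an error.
 */
static bool model_access_error(unsigned long error_code,
			       unsigned long vm_flags)
{
	if (error_code & X86_PF_SHSTK) {
		/* Shadow stack accesses are only valid on shadow stack VMAs. */
		if (!(vm_flags & MODEL_VM_SHADOW_STACK))
			return true;
		if (!(vm_flags & MODEL_VM_WRITE))
			return true;
		return false;
	}

	if (error_code & X86_PF_WRITE) {
		/* Ordinary writes to a shadow stack VMA are rejected. */
		if (vm_flags & MODEL_VM_SHADOW_STACK)
			return true;
		if (!(vm_flags & MODEL_VM_WRITE))
			return true;
		return false;
	}

	/* Read and exec checks of the real function are omitted here. */
	return false;
}

/*
 * Mirrors the flag setup in do_user_addr_fault(): every shadow stack
 * access, read or write, is reported to the MM as a write fault.
 */
static unsigned int model_fault_flags(unsigned long error_code)
{
	unsigned int flags = 0;

	if (error_code & X86_PF_SHSTK)
		flags |= MODEL_FAULT_FLAG_WRITE;
	if (error_code & X86_PF_WRITE)
		flags |= MODEL_FAULT_FLAG_WRITE;
	return flags;
}
```

In this model, a shadow stack read (X86_PF_SHSTK set, X86_PF_WRITE clear)
on a VMA with both MODEL_VM_SHADOW_STACK and MODEL_VM_WRITE is accepted
and still yields MODEL_FAULT_FLAG_WRITE, which corresponds to the
fork/RET example in the commit message where COW has to be broken even
for a read.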