From patchwork Mon Mar 20 16:39:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: tip-bot2 for Thomas Gleixner X-Patchwork-Id: 72307 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1331070wrt; Mon, 20 Mar 2023 10:11:27 -0700 (PDT) X-Google-Smtp-Source: AK7set/PCRthP24ISOfPsfFV4JE6igOzRB21wwyKJkKt3l/x/7gYAhH54PxG3K3vckA4WnLJMIom X-Received: by 2002:a05:6a20:7d90:b0:d9:e7e5:71d9 with SMTP id v16-20020a056a207d9000b000d9e7e571d9mr3393776pzj.29.1679332287311; Mon, 20 Mar 2023 10:11:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679332287; cv=none; d=google.com; s=arc-20160816; b=BWTj9xCGQVQeBn8X7tYse/PE3K2Zs3/WuflXG0EFgHB1PJ8wyyQCT0h1dfEpbNiIRs 06VGUlkU3dbTOsUxRtd93CNGHM+Im1cMfpyZcoRFl46jXKrrN7XobU9IinmRbDdTkdv7 opAKKFnLmh/3p6fQTUiS4oHAm/zoaErN8fsM6lAX46GrCMlh6WyXdhIwo6hDiw9L+B6k nTxqm9foPeK5+dGYcuD7PTyYK+brKyJb4lOwZ5DhuH7Nhi2YP8r7W0MbJUi/IEsQPY63 oEzfiYtg+GfY6+U/21lYSljWIAU73MaLyl24vT1l8FlF9eJyoE6qo+KU46nfPa2MSLLJ Q1lQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:robot-unsubscribe :robot-id:message-id:mime-version:cc:subject:to:reply-to:sender:from :dkim-signature:dkim-signature:date; bh=ueWkHzvql+LXOcl6hYS9xKZyB0gscHvFcSm1cPOo2u4=; b=fxyS/gnvplnt1X/GiMbLjT1xm1wdzgp0kYTt8FRzFG6csJMrdSq/pEeQPj3vqHJKM3 oW51h9uvCx/2OOdrfGbYB+KiuXl+mClNvoV1iWi0vWE8nO+E1ZV9fUWCFh1WtEa2HbL6 jSresVE50caTA27sUQ0dOGl/wimOfIOf9zAUinMbFYFY/DBBnAeQlaTgjvuplan0tO4i JZLL9ujv3Zeqf5IH2UR0rie0PlfscezlJwPiL8VDKQUdXvF7RZ+45kJR1dnX1tEekiNg G1Cxltp52mJb/hNluJtaaYsUToNVZc4CSFEBRb78Vz/9nblCCb/KQXswzwLf8xzmweu9 uVZQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=dWXN3gp+; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id m16-20020a637110000000b0050f92223f48si1319413pgc.161.2023.03.20.10.11.12; Mon, 20 Mar 2023 10:11:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=dWXN3gp+; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232406AbjCTQsK (ORCPT + 99 others); Mon, 20 Mar 2023 12:48:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45490 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232527AbjCTQri (ORCPT ); Mon, 20 Mar 2023 12:47:38 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E732F27D5F; Mon, 20 Mar 2023 09:41:07 -0700 (PDT) Date: Mon, 20 Mar 2023 16:39:26 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1679330367; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=ueWkHzvql+LXOcl6hYS9xKZyB0gscHvFcSm1cPOo2u4=; b=dWXN3gp+aaReGtdLrY2M8ERuIcFWveCflzJJXzILCZihXoCgDMBOSv3g6o2B+UPvGXTnpy IYQwtaZNxVT02EkeZhL1NPo3TO66+0fgMCvVuK1Dph3BXaXiepFSAxoV3Ir2y075jqjorV Vi06BIHU+VsBlO3Uyd6xaYcQngHIYKRMh2+S3RdpWxpyF+1MgAV9Op70PNdmD6YlTUSxID Jh+30WDMP2h3bEXsm+GM9K1UWF62mmkJh9/0BsK/08RM6zcCPbofjlA8pSj8EvFxzvlO10 PzSpANa2M/3qGe/CLp/kEXqkieb6QK9oXKNdwa8Jqcvj7OrxMaispO1l6rxdiA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1679330367; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=ueWkHzvql+LXOcl6hYS9xKZyB0gscHvFcSm1cPOo2u4=; b=WzMWbko0rAWXlXLuff5+eEDr9PJ5XfHI4OJd39wvXp7dFTC7G7O5rPhPqdm/6UiYy5HGD9 2CcP2aQ15sreeHAw== From: "tip-bot2 for Rick Edgecombe" Sender: tip-bot2@linutronix.de Reply-to: linux-kernel@vger.kernel.org To: linux-tip-commits@vger.kernel.org Subject: [tip: x86/shstk] x86/mm: Start actually marking _PAGE_SAVED_DIRTY Cc: "Yu-cheng Yu" , Rick Edgecombe , Dave Hansen , "Borislav Petkov (AMD)" , Kees Cook , "Mike Rapoport (IBM)" , Pengfei Xu , John Allen , x86@kernel.org, linux-kernel@vger.kernel.org MIME-Version: 1.0 Message-ID: <167933036694.5837.3655626911490297516.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760907532450480750?= X-GMAIL-MSGID: =?utf-8?q?1760907532450480750?= The following commit has been merged into the x86/shstk branch of tip: Commit-ID: b349068fa10c809ee6fcce6fe212abf48da0034b Gitweb: https://git.kernel.org/tip/b349068fa10c809ee6fcce6fe212abf48da0034b Author: Rick Edgecombe AuthorDate: Sat, 18 Mar 2023 17:15:11 -07:00 Committer: Dave Hansen CommitterDate: Mon, 20 Mar 2023 09:01:09 -07:00 x86/mm: Start actually marking _PAGE_SAVED_DIRTY The recently introduced _PAGE_SAVED_DIRTY should be used instead of the HW Dirty bit whenever a PTE is Write=0, in order to not inadvertently create shadow stack PTEs. Update pte_mk*() helpers to do this, and apply the same changes to pmd and pud. For pte_modify() this is a bit trickier. It takes a "raw" pgprot_t which was not necessarily created with any of the existing PTE bit helpers. That means that it can return a pte_t with Write=0,Dirty=1, a shadow stack PTE, when it did not intend to create one. Modify it to also move _PAGE_DIRTY to _PAGE_SAVED_DIRTY. To avoid creating Write=0,Dirty=1 PTEs, pte_modify() needs to avoid: 1. Marking Write=0 PTEs Dirty=1 2. Marking Dirty=1 PTEs Write=0 The first case cannot happen as the existing behavior of pte_modify() is to filter out any Dirty bit passed in newprot. Handle the second case by shifting _PAGE_DIRTY=1 to _PAGE_SAVED_DIRTY=1 if the PTE was write protected by the pte_modify() call. Apply the same changes to pmd_modify(). Co-developed-by: Yu-cheng Yu Signed-off-by: Yu-cheng Yu Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230319001535.23210-17-rick.p.edgecombe%40intel.com --- arch/x86/include/asm/pgtable.h | 168 +++++++++++++++++++++++++++----- 1 file changed, 145 insertions(+), 23 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 349fcab..05dfdbd 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -124,9 +124,17 @@ extern pmdval_t early_pmd_flags; * The following only work if pte_present() is true. * Undefined behaviour if not.. */ -static inline int pte_dirty(pte_t pte) +static inline bool pte_dirty(pte_t pte) { - return pte_flags(pte) & _PAGE_DIRTY; + return pte_flags(pte) & _PAGE_DIRTY_BITS; +} + +static inline bool pte_shstk(pte_t pte) +{ + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) + return false; + + return (pte_flags(pte) & (_PAGE_RW | _PAGE_DIRTY)) == _PAGE_DIRTY; } static inline int pte_young(pte_t pte) @@ -134,9 +142,18 @@ static inline int pte_young(pte_t pte) return pte_flags(pte) & _PAGE_ACCESSED; } -static inline int pmd_dirty(pmd_t pmd) +static inline bool pmd_dirty(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_DIRTY; + return pmd_flags(pmd) & _PAGE_DIRTY_BITS; +} + +static inline bool pmd_shstk(pmd_t pmd) +{ + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) + return false; + + return (pmd_flags(pmd) & (_PAGE_RW | _PAGE_DIRTY | _PAGE_PSE)) == + (_PAGE_DIRTY | _PAGE_PSE); } #define pmd_young pmd_young @@ -145,9 +162,9 @@ static inline int pmd_young(pmd_t pmd) return pmd_flags(pmd) & _PAGE_ACCESSED; } -static inline int pud_dirty(pud_t pud) +static inline bool pud_dirty(pud_t pud) { - return pud_flags(pud) & _PAGE_DIRTY; + return pud_flags(pud) & _PAGE_DIRTY_BITS; } static inline int pud_young(pud_t pud) @@ -157,13 +174,21 @@ static inline int pud_young(pud_t pud) static inline int pte_write(pte_t pte) { - return pte_flags(pte) & _PAGE_RW; + /* + * Shadow stack pages are logically writable, but do not have + * _PAGE_RW. Check for them separately from _PAGE_RW itself. + */ + return (pte_flags(pte) & _PAGE_RW) || pte_shstk(pte); } #define pmd_write pmd_write static inline int pmd_write(pmd_t pmd) { - return pmd_flags(pmd) & _PAGE_RW; + /* + * Shadow stack pages are logically writable, but do not have + * _PAGE_RW. Check for them separately from _PAGE_RW itself. + */ + return (pmd_flags(pmd) & _PAGE_RW) || pmd_shstk(pmd); } #define pud_write pud_write @@ -342,7 +367,16 @@ static inline pte_t pte_clear_saveddirty(pte_t pte) static inline pte_t pte_wrprotect(pte_t pte) { - return pte_clear_flags(pte, _PAGE_RW); + pte = pte_clear_flags(pte, _PAGE_RW); + + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PTE (Write=0,Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (pte_dirty(pte)) + pte = pte_mksaveddirty(pte); + return pte; } #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP @@ -380,7 +414,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte) static inline pte_t pte_mkclean(pte_t pte) { - return pte_clear_flags(pte, _PAGE_DIRTY); + return pte_clear_flags(pte, _PAGE_DIRTY_BITS); } static inline pte_t pte_mkold(pte_t pte) @@ -395,7 +429,19 @@ static inline pte_t pte_mkexec(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return pte_set_flags(pte, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + pteval_t dirty = _PAGE_DIRTY; + + /* Avoid creating Dirty=1,Write=0 PTEs */ + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && !pte_write(pte)) + dirty = _PAGE_SAVED_DIRTY; + + return pte_set_flags(pte, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mkwrite_shstk(pte_t pte) +{ + /* pte_clear_saveddirty() also sets Dirty=1 */ + return pte_clear_saveddirty(pte); } static inline pte_t pte_mkyoung(pte_t pte) @@ -412,7 +458,12 @@ struct vm_area_struct; static inline pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma) { - return pte_mkwrite_kernel(pte); + pte = pte_mkwrite_kernel(pte); + + if (pte_dirty(pte)) + pte = pte_clear_saveddirty(pte); + + return pte; } static inline pte_t pte_mkhuge(pte_t pte) @@ -481,7 +532,15 @@ static inline pmd_t pmd_clear_saveddirty(pmd_t pmd) static inline pmd_t pmd_wrprotect(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_RW); + pmd = pmd_clear_flags(pmd, _PAGE_RW); + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PMD (RW=0, Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (pmd_dirty(pmd)) + pmd = pmd_mksaveddirty(pmd); + return pmd; } #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP @@ -508,12 +567,23 @@ static inline pmd_t pmd_mkold(pmd_t pmd) static inline pmd_t pmd_mkclean(pmd_t pmd) { - return pmd_clear_flags(pmd, _PAGE_DIRTY); + return pmd_clear_flags(pmd, _PAGE_DIRTY_BITS); } static inline pmd_t pmd_mkdirty(pmd_t pmd) { - return pmd_set_flags(pmd, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + pmdval_t dirty = _PAGE_DIRTY; + + /* Avoid creating (HW)Dirty=1, Write=0 PMDs */ + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && !pmd_write(pmd)) + dirty = _PAGE_SAVED_DIRTY; + + return pmd_set_flags(pmd, dirty | _PAGE_SOFT_DIRTY); +} + +static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd) +{ + return pmd_clear_saveddirty(pmd); } static inline pmd_t pmd_mkdevmap(pmd_t pmd) @@ -533,7 +603,12 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd) static inline pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) { - return pmd_set_flags(pmd, _PAGE_RW); + pmd = pmd_set_flags(pmd, _PAGE_RW); + + if (pmd_dirty(pmd)) + pmd = pmd_clear_saveddirty(pmd); + + return pmd; } static inline pud_t pud_set_flags(pud_t pud, pudval_t set) @@ -577,17 +652,32 @@ static inline pud_t pud_mkold(pud_t pud) static inline pud_t pud_mkclean(pud_t pud) { - return pud_clear_flags(pud, _PAGE_DIRTY); + return pud_clear_flags(pud, _PAGE_DIRTY_BITS); } static inline pud_t pud_wrprotect(pud_t pud) { - return pud_clear_flags(pud, _PAGE_RW); + pud = pud_clear_flags(pud, _PAGE_RW); + + /* + * Blindly clearing _PAGE_RW might accidentally create + * a shadow stack PUD (RW=0, Dirty=1). Move the hardware + * dirty value to the software bit. + */ + if (pud_dirty(pud)) + pud = pud_mksaveddirty(pud); + return pud; } static inline pud_t pud_mkdirty(pud_t pud) { - return pud_set_flags(pud, _PAGE_DIRTY | _PAGE_SOFT_DIRTY); + pudval_t dirty = _PAGE_DIRTY; + + /* Avoid creating (HW)Dirty=1, Write=0 PUDs */ + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && !pud_write(pud)) + dirty = _PAGE_SAVED_DIRTY; + + return pud_set_flags(pud, dirty | _PAGE_SOFT_DIRTY); } static inline pud_t pud_mkdevmap(pud_t pud) @@ -607,7 +697,11 @@ static inline pud_t pud_mkyoung(pud_t pud) static inline pud_t pud_mkwrite(pud_t pud) { - return pud_set_flags(pud, _PAGE_RW); + pud = pud_set_flags(pud, _PAGE_RW); + + if (pud_dirty(pud)) + pud = pud_clear_saveddirty(pud); + return pud; } #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY @@ -724,6 +818,8 @@ static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask); static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) { pteval_t val = pte_val(pte), oldval = val; + bool wr_protected; + pte_t pte_result; /* * Chop off the NX bit (if present), and add the NX portion of @@ -732,17 +828,43 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) val &= _PAGE_CHG_MASK; val |= check_pgprot(newprot) & ~_PAGE_CHG_MASK; val = flip_protnone_guard(oldval, val, PTE_PFN_MASK); - return __pte(val); + + pte_result = __pte(val); + + /* + * Do the saveddirty fixup if the PTE was just write protected and + * it's dirty. + */ + wr_protected = (oldval & _PAGE_RW) && !(val & _PAGE_RW); + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && wr_protected && + (val & _PAGE_DIRTY)) + pte_result = pte_mksaveddirty(pte_result); + + return pte_result; } static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) { pmdval_t val = pmd_val(pmd), oldval = val; + bool wr_protected; + pmd_t pmd_result; - val &= _HPAGE_CHG_MASK; + val &= (_HPAGE_CHG_MASK & ~_PAGE_DIRTY); val |= check_pgprot(newprot) & ~_HPAGE_CHG_MASK; val = flip_protnone_guard(oldval, val, PHYSICAL_PMD_PAGE_MASK); - return __pmd(val); + + pmd_result = __pmd(val); + + /* + * Do the saveddirty fixup if the PMD was just write protected and + * it's dirty. + */ + wr_protected = (oldval & _PAGE_RW) && !(val & _PAGE_RW); + if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && wr_protected && + (val & _PAGE_DIRTY)) + pmd_result = pmd_mksaveddirty(pmd_result); + + return pmd_result; } /*