From patchwork Tue Dec 12 23:16:55 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 177660
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
 sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org,
 torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com,
 linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 01/11] mseal: Add mseal syscall.
Date: Tue, 12 Dec 2023 23:16:55 +0000
Message-ID: <20231212231706.2680890-2-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

The new mseal() is an architecture-independent syscall with the following
signature:

mseal(void *addr, size_t len, unsigned long types, unsigned long flags)

addr/len: the memory range. It must be contiguous, allocated memory;
otherwise mseal() fails and no VMA is updated.
For details on acceptable arguments, please refer to the comments in
mseal.c. Those are also covered by the selftest.

This CL adds three sealing types:

MM_SEAL_BASE
MM_SEAL_PROT_PKEY
MM_SEAL_SEAL

MM_SEAL_BASE:
The base package includes the features common to all VMA sealing types.
It prevents a sealed VMA from:

1> Being unmapped, moved to another location, or shrunk via munmap() and
   mremap(); these can leave an empty space that could then be replaced
   by a VMA with a new set of attributes.
2> Having a different VMA moved or expanded into the current location,
   via mremap().
3> Being modified via mmap(MAP_FIXED).
4> Size expansion via mremap(). This does not appear to pose any
   specific risk to sealed VMAs, but it is included anyway because the
   use case is unclear. In any case, users can rely on merging to expand
   a sealed VMA.

We consider MM_SEAL_BASE the feature on which other sealing features
depend. For instance, it probably does not make sense to seal PROT_PKEY
without sealing the BASE, so the kernel implicitly adds SEAL_BASE for
SEAL_PROT_PKEY. (If an application wants to relax this in the future,
the "flags" field of mseal() could be used to override the implicit
addition of SEAL_BASE.)

MM_SEAL_PROT_PKEY:
Seals the PROT and PKEY of the address range; in other words, mprotect()
and pkey_mprotect() will be denied if the memory is sealed with
MM_SEAL_PROT_PKEY.

MM_SEAL_SEAL:
MM_SEAL_SEAL denies adding a new seal to a VMA.

The kernel remembers which seal types are applied, so the application
does not need to repeat all existing seal types in the next mseal().
Once a seal type is applied, it cannot be unsealed. Calling mseal() with
an already-applied seal type is a no-op, not a failure.

Data structure:
Internally, vm_area_struct gains a new field, vm_seals, to store the
seal bit masks. The vm_seals field is added because the existing
vm_flags field is full on 32-bit CPUs.
The vm_seals field can be merged into vm_flags in the future if the size
of vm_flags is ever expanded.

TODO: A sealed VMA won't merge with other VMAs in this patch; merging
support will be added in a later patch.

Signed-off-by: Jeff Xu
---
 include/linux/mm.h        |  45 ++++++-
 include/linux/mm_types.h  |   7 ++
 include/linux/syscalls.h  |   2 +
 include/uapi/linux/mman.h |   4 +
 kernel/sys_ni.c           |   1 +
 mm/Kconfig                |   9 ++
 mm/Makefile               |   1 +
 mm/mmap.c                 |   3 +
 mm/mseal.c                | 257 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 328 insertions(+), 1 deletion(-)
 create mode 100644 mm/mseal.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 19fc73b02c9f..3d1120570de5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include

 struct mempolicy;
 struct anon_vma;
@@ -257,9 +258,17 @@ extern struct rw_semaphore nommu_region_sem;
 extern unsigned int kobjsize(const void *objp);
 #endif

+/*
+ * MM_SEAL_ALL is all supported flags in mseal().
+ */
+#define MM_SEAL_ALL ( \
+	MM_SEAL_SEAL | \
+	MM_SEAL_BASE | \
+	MM_SEAL_PROT_PKEY)
+
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
- * When changing, update also include/trace/events/mmflags.h
+ * When changing, update also include/trace/events/mmflags.h.
  */
 #define VM_NONE		0x00000000

@@ -3308,6 +3317,40 @@ static inline void mm_populate(unsigned long addr, unsigned long len)
 static inline void mm_populate(unsigned long addr, unsigned long len) {}
 #endif

+#ifdef CONFIG_MSEAL
+static inline bool check_vma_seals_mergeable(unsigned long vm_seals)
+{
+	/*
+	 * Set sealed VMA not mergeable with another VMA for now.
+	 * This will be changed in later commit to make sealed
+	 * VMA also mergeable.
+	 */
+	if (vm_seals & MM_SEAL_ALL)
+		return false;
+
+	return true;
+}
+
+/*
+ * return the valid sealing (after mask).
+ */
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
+{
+	return (vma->vm_seals & MM_SEAL_ALL);
+}
+
+#else
+static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
+{
+	return true;
+}
+
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
+{
+	return 0;
+}
+#endif
+
 /* These take the mm semaphore themselves */
 extern int __must_check vm_brk(unsigned long, unsigned long);
 extern int __must_check vm_brk_flags(unsigned long, unsigned long, unsigned long);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 589f31ef2e84..052799173c86 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -687,6 +687,13 @@ struct vm_area_struct {
 	struct vma_numab_state *numab_state;	/* NUMA Balancing state */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+#ifdef CONFIG_MSEAL
+	/*
+	 * bit masks for seal.
+	 * need this since vm_flags is full.
+	 */
+	unsigned long vm_seals;		/* seal flags, see mm.h. */
+#endif
 } __randomize_layout;

 #ifdef CONFIG_SCHED_MM_CID
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 0901af60d971..b1c766b74765 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -812,6 +812,8 @@ asmlinkage long sys_process_mrelease(int pidfd, unsigned int flags);
 asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
 			unsigned long prot, unsigned long pgoff,
 			unsigned long flags);
+asmlinkage long sys_mseal(unsigned long start, size_t len, unsigned long types,
+			unsigned long flags);
 asmlinkage long sys_mbind(unsigned long start, unsigned long len,
 			unsigned long mode,
 			const unsigned long __user *nmask,
diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
index a246e11988d5..f561652886c4 100644
--- a/include/uapi/linux/mman.h
+++ b/include/uapi/linux/mman.h
@@ -55,4 +55,8 @@ struct cachestat {
 	__u64 nr_recently_evicted;
 };

+#define MM_SEAL_SEAL		_BITUL(0)
+#define MM_SEAL_BASE		_BITUL(1)
+#define MM_SEAL_PROT_PKEY	_BITUL(2)
+
 #endif /* _UAPI_LINUX_MMAN_H */
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 9db51ea373b0..716d64df522d 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -195,6 +195,7 @@ COND_SYSCALL(migrate_pages);
 COND_SYSCALL(move_pages);
 COND_SYSCALL(set_mempolicy_home_node);
 COND_SYSCALL(cachestat);
+COND_SYSCALL(mseal);

 COND_SYSCALL(perf_event_open);
 COND_SYSCALL(accept4);
diff --git a/mm/Kconfig b/mm/Kconfig
index 264a2df5ecf5..63972d476d19 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1258,6 +1258,15 @@ config LOCK_MM_AND_FIND_VMA
 	bool
 	depends on !STACK_GROWSUP

+config MSEAL
+	default n
+	bool "Enable mseal() system call"
+	depends on MMU
+	help
+	  Enable the virtual memory sealing.
+	  This feature allows sealing each virtual memory area separately with
+	  multiple sealing types.
+
 source "mm/damon/Kconfig"

 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index ec65984e2ade..643d8518dac0 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -120,6 +120,7 @@ obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o
 obj-$(CONFIG_PAGE_TABLE_CHECK) += page_table_check.o
 obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o
 obj-$(CONFIG_SECRETMEM) += secretmem.o
+obj-$(CONFIG_MSEAL) += mseal.o
 obj-$(CONFIG_CMA_SYSFS) += cma_sysfs.o
 obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
 obj-$(CONFIG_IDLE_PAGE_TRACKING) += page_idle.o
diff --git a/mm/mmap.c b/mm/mmap.c
index 9e018d8dd7d6..42462c2a0c35 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -740,6 +740,9 @@ static inline bool is_mergeable_vma(struct vm_area_struct *vma,
 		return false;
 	if (!anon_vma_name_eq(anon_vma_name(vma), anon_name))
 		return false;
+	if (!check_vma_seals_mergeable(vma_seals(vma)))
+		return false;
+
 	return true;
 }

diff --git a/mm/mseal.c b/mm/mseal.c
new file mode 100644
index 000000000000..13bbe9ef5883
--- /dev/null
+++ b/mm/mseal.c
@@ -0,0 +1,257 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Implement mseal() syscall.
+ *
+ * Copyright (c) 2023 Google, Inc.
+ *
+ * Author: Jeff Xu
+ */
+
+#include
+#include
+#include
+#include
+#include "internal.h"
+
+static bool can_do_mseal(unsigned long types, unsigned long flags)
+{
+	/* check types is a valid bitmap. */
+	if (types & ~MM_SEAL_ALL)
+		return false;
+
+	/* flags isn't used for now. */
+	if (flags)
+		return false;
+
+	return true;
+}
+
+/*
+ * Check if a seal type can be added to VMA.
+ */
+static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals)
+{
+	/* When SEAL_MSEAL is set, reject if a new type of seal is added. */
+	if ((vma->vm_seals & MM_SEAL_SEAL) &&
+	    (newSeals & ~(vma_seals(vma))))
+		return false;
+
+	return true;
+}
+
+static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
+		struct vm_area_struct **prev, unsigned long start,
+		unsigned long end, unsigned long addtypes)
+{
+	int ret = 0;
+
+	if (addtypes & ~(vma_seals(vma))) {
+		/*
+		 * Handle split at start and end.
+		 * For now sealed VMA doesn't merge with other VMAs.
+		 * This will be updated in later commit to make
+		 * sealed VMA also mergeable.
+		 */
+		if (start != vma->vm_start) {
+			ret = split_vma(vmi, vma, start, 1);
+			if (ret)
+				goto out;
+		}
+
+		if (end != vma->vm_end) {
+			ret = split_vma(vmi, vma, end, 0);
+			if (ret)
+				goto out;
+		}
+
+		vma->vm_seals |= addtypes;
+	}
+
+out:
+	*prev = vma;
+	return ret;
+}
+
+/*
+ * Check for do_mseal:
+ * 1> start is part of a valid vma.
+ * 2> end is part of a valid vma.
+ * 3> No gap (unallocated address) between start and end.
+ * 4> requested seal type can be added in given address range.
+ */
+static int check_mm_seal(unsigned long start, unsigned long end,
+		unsigned long newtypes)
+{
+	struct vm_area_struct *vma;
+	unsigned long nstart = start;
+
+	VMA_ITERATOR(vmi, current->mm, start);
+
+	/* going through each vma to check. */
+	for_each_vma_range(vmi, vma, end) {
+		if (vma->vm_start > nstart)
+			/* unallocated memory found. */
+			return -ENOMEM;
+
+		if (!can_add_vma_seals(vma, newtypes))
+			return -EACCES;
+
+		if (vma->vm_end >= end)
+			return 0;
+
+		nstart = vma->vm_end;
+	}
+
+	return -ENOMEM;
+}
+
+/*
+ * Apply sealing.
+ */
+static int apply_mm_seal(unsigned long start, unsigned long end,
+		unsigned long newtypes)
+{
+	unsigned long nstart, nend;
+	struct vm_area_struct *vma, *prev = NULL;
+	struct vma_iterator vmi;
+	int error = 0;
+
+	vma_iter_init(&vmi, current->mm, start);
+	vma = vma_find(&vmi, end);
+
+	prev = vma_prev(&vmi);
+	if (start > vma->vm_start)
+		prev = vma;
+
+	nstart = start;
+
+	/* going through each vma to update. */
+	for_each_vma_range(vmi, vma, end) {
+		nend = vma->vm_end;
+		if (nend > end)
+			nend = end;
+
+		error = mseal_fixup(&vmi, vma, &prev, nstart, nend, newtypes);
+		if (error)
+			break;
+
+		nstart = vma->vm_end;
+	}
+
+	return error;
+}
+
+/*
+ * mseal(2) seals the VM's meta data from
+ * selected syscalls.
+ *
+ * addr/len: VM address range.
+ *
+ * The address range by addr/len must meet:
+ * start (addr) must be in a valid VMA.
+ * end (addr + len) must be in a valid VMA.
+ * no gap (unallocated memory) between start and end.
+ * start (addr) must be page aligned.
+ *
+ * len: len will be page aligned implicitly.
+ *
+ * types: bit mask for sealed syscalls.
+ * MM_SEAL_BASE: prevent VMA from:
+ * 1> Being unmapped, moved, or shrunk via munmap() and mremap();
+ *    these can leave an empty space, which could then be replaced
+ *    by a VMA with a new set of attributes.
+ * 2> Having a different vma moved or expanded into the current
+ *    location, via mremap().
+ * 3> Being modified via mmap(MAP_FIXED).
+ * 4> Size expansion, via mremap(), does not appear to pose any
+ *    specific risks to sealed VMAs. It is included anyway because
+ *    the use case is unclear. In any case, users can rely on
+ *    merging to expand a sealed VMA.
+ *
+ * MM_SEAL_PROT_PKEY:
+ * Seal PROT and PKEY of the address range; in other words,
+ * mprotect() and pkey_mprotect() will be denied if the memory is
+ * sealed with MM_SEAL_PROT_PKEY.
+ *
+ * MM_SEAL_SEAL:
+ * MM_SEAL_SEAL denies adding a new seal to a VMA.
+ *
+ * The kernel will remember which seal types are applied, and the
+ * application doesn't need to repeat all existing seal types in
+ * the next mseal(). Once a seal type is applied, it can't be
+ * unsealed. Calling mseal() with an existing seal type is a no-op,
+ * not a failure.
+ *
+ * flags: reserved.
+ *
+ * return values:
+ * zero: success.
+ * -EINVAL:
+ *    invalid seal type.
+ *    invalid input flags.
+ *    addr is not page aligned.
+ *    addr + len overflow.
+ * -ENOMEM:
+ *    addr is not a valid address (not allocated).
+ *    end (addr + len) is not a valid address.
+ *    a gap (unallocated memory) between start and end.
+ * -EACCES:
+ *    MM_SEAL_SEAL is set, adding a new seal is rejected.
+ *
+ * Note:
+ * user can call mseal(2) multiple times to add new seal types.
+ * adding an already added seal type is a no-op (no error).
+ * adding a new seal type after MM_SEAL_SEAL will be rejected.
+ * unseal() or removing a seal type is not supported.
+ */
+static int do_mseal(unsigned long start, size_t len_in, unsigned long types,
+		unsigned long flags)
+{
+	int ret = 0;
+	unsigned long end;
+	struct mm_struct *mm = current->mm;
+	size_t len;
+
+	/* MM_SEAL_BASE is set when other seal types are set. */
+	if (types & MM_SEAL_PROT_PKEY)
+		types |= MM_SEAL_BASE;
+
+	if (!can_do_mseal(types, flags))
+		return -EINVAL;
+
+	start = untagged_addr(start);
+	if (!PAGE_ALIGNED(start))
+		return -EINVAL;
+
+	len = PAGE_ALIGN(len_in);
+	/* Check to see whether len was rounded up from small -ve to zero. */
+	if (len_in && !len)
+		return -EINVAL;
+
+	end = start + len;
+	if (end < start)
+		return -EINVAL;
+
+	if (end == start)
+		return 0;
+
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
+
+	ret = check_mm_seal(start, end, types);
+	if (ret)
+		goto out;
+
+	ret = apply_mm_seal(start, end, types);
+
+out:
+	mmap_write_unlock(current->mm);
+	return ret;
+}
+
+SYSCALL_DEFINE4(mseal, unsigned long, start, size_t, len, unsigned long, types,
+		unsigned long, flags)
+{
+	return do_mseal(start, len, types, flags);
+}

From patchwork Tue Dec 12 23:16:56 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 177659
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
 sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org,
 torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com,
 linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 02/11] mseal: Wire up mseal syscall
Date: Tue, 12 Dec 2023 23:16:56 +0000
Message-ID: <20231212231706.2680890-3-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Wire up mseal syscall for all architectures.
Signed-off-by: Jeff Xu
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
 arch/arm/tools/syscall.tbl                  | 1 +
 arch/arm64/include/asm/unistd.h             | 2 +-
 arch/arm64/include/asm/unistd32.h           | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl     | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
 arch/s390/kernel/syscalls/syscall.tbl       | 1 +
 arch/sh/kernel/syscalls/syscall.tbl         | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl      | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
 include/uapi/asm-generic/unistd.h           | 5 ++++-
 19 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index b68f1f56b836..4de33b969009 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -496,3 +496,4 @@
 564	common	futex_wake		sys_futex_wake
 565	common	futex_wait		sys_futex_wait
 566	common	futex_requeue		sys_futex_requeue
+567	common	mseal			sys_mseal
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 93d0d46cbb15..dacea023bb88 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -469,3 +469,4 @@
 454	common	futex_wake		sys_futex_wake
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
+457	common	mseal			sys_mseal
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 531effca5f1f..298313d2e0af 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
 #define __ARM_NR_compat_set_tls	(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls		457
+#define __NR_compat_syscalls		458
 #endif

 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index c453291154fd..015c80b14206 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -917,6 +917,8 @@ __SYSCALL(__NR_futex_wake, sys_futex_wake)
 __SYSCALL(__NR_futex_wait, sys_futex_wait)
 #define __NR_futex_requeue 456
 __SYSCALL(__NR_futex_requeue, sys_futex_requeue)
+#define __NR_mseal 457
+__SYSCALL(__NR_mseal, sys_mseal)

 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 81375ea78288..e8b40451693d 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -376,3 +376,4 @@
 454	common	futex_wake		sys_futex_wake
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
+457	common	mseal			sys_mseal
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index f7f997a88bab..0da4a4dc1737 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -455,3 +455,4 @@
 454	common	futex_wake		sys_futex_wake
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
+457	common	mseal			sys_mseal
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 2967ec26b978..ca8572222783 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -461,3 +461,4 @@
 454	common	futex_wake		sys_futex_wake
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
+457	common	mseal			sys_mseal
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 383abb1713f4..4fd33623b7e8 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -394,3 +394,4 @@
 454	n32	futex_wake		sys_futex_wake
 455	n32	futex_wait		sys_futex_wait
 456	n32	futex_requeue		sys_futex_requeue
+457	n32	mseal			sys_mseal
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index c9bd09ba905f..aaa6382781e0 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -370,3 +370,4 @@
 454	n64	futex_wake		sys_futex_wake
 455	n64	futex_wait		sys_futex_wait
 456	n64	futex_requeue		sys_futex_requeue
+457	n64	mseal			sys_mseal
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index ba5ef6cea97a..bbdd6f151224 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -443,3 +443,4 @@
 454	o32	futex_wake		sys_futex_wake
 455	o32	futex_wait		sys_futex_wait
 456	o32	futex_requeue		sys_futex_requeue
+457	o32	mseal			sys_mseal
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 9f0f6df55361..8dda80555c7c 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -454,3 +454,4 @@
 454	common	futex_wake		sys_futex_wake
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
+457	common	mseal			sys_mseal
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 26fc41904266..d0aa97a669bc 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -542,3 +542,4 @@
 454	common	futex_wake		sys_futex_wake
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
+457	common	mseal			sys_mseal
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 31be90b241f7..228f100f8565 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -458,3 +458,4 @@
454 common futex_wake sys_futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue sys_futex_requeue +457 common mseal sys_mseal sys_mseal diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl index 4bc5d488ab17..cf08ea4a7539 100644 --- a/arch/sh/kernel/syscalls/syscall.tbl +++ b/arch/sh/kernel/syscalls/syscall.tbl @@ -458,3 +458,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl index 8404c8e50394..30796f78bdc2 100644 --- a/arch/sparc/kernel/syscalls/syscall.tbl +++ b/arch/sparc/kernel/syscalls/syscall.tbl @@ -501,3 +501,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 31c48bc2c3d8..c4163b904714 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -460,3 +460,4 @@ 454 i386 futex_wake sys_futex_wake 455 i386 futex_wait sys_futex_wait 456 i386 futex_requeue sys_futex_requeue +457 i386 mseal sys_mseal diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index a577bb27c16d..47fbc6ac0267 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -378,6 +378,7 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal # # Due to a historical design error, certain syscalls are numbered differently diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl index dd71ecce8b86..fe5f562f6493 100644 --- a/arch/xtensa/kernel/syscalls/syscall.tbl +++ 
b/arch/xtensa/kernel/syscalls/syscall.tbl @@ -426,3 +426,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index d9e9cd13e577..1678245d8a2b 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -829,8 +829,11 @@ __SYSCALL(__NR_futex_wait, sys_futex_wait) #define __NR_futex_requeue 456 __SYSCALL(__NR_futex_requeue, sys_futex_requeue) +#define __NR_mseal 457 +__SYSCALL(__NR_mseal, sys_mseal) + #undef __NR_syscalls -#define __NR_syscalls 457 +#define __NR_syscalls 458 /* * 32 bit systems traditionally used different From patchwork Tue Dec 12 23:16:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Xu X-Patchwork-Id: 177657 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp8060641vqy; Tue, 12 Dec 2023 15:17:22 -0800 (PST) X-Google-Smtp-Source: AGHT+IHxN94ri+PY4fhNzW6QU9n57w5khyWkVYDpkp/FdHsLxn/NXtPPcWKNS+TPsPbGTjdwfGc1 X-Received: by 2002:a05:6a20:4310:b0:18f:4779:6781 with SMTP id h16-20020a056a20431000b0018f47796781mr9322297pzk.105.1702423042035; Tue, 12 Dec 2023 15:17:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702423042; cv=none; d=google.com; s=arc-20160816; b=B5GMynijznDAqWRNZcbxW6nzjP4XWrDAgN0v1I36ZasSx20DU+5naZBk+PEe1F48d/ CvupWMaMeOI/LAQGxVwnOqj8c5ZEuPzVcaKaXYShcoyk8LpHF1f5+/CgEzI0MpNt0iwO 3B+F+xxdf3PhGgqMDgGJzauOF4DfX+8Xb5y/b0BjBgRYnOk8xmXeE3ReV6vLspF7ApAT YtlhoYyCVWG5x6alqdxS7GWT17z3tyzA1PdpU0OHNxYpCWEn1/skM4RDqU9MxQclm0kK EeuCrHkWB4iCWj0h7g9YWZV0VrZTqa8CWVG9yVqx4u+xuv4lTqJah6qQDxTwRDS/+78G iMWw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from 
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 03/11] mseal: add can_modify_mm and can_modify_vma
Date: Tue, 12 Dec 2023 23:16:57 +0000
Message-ID: <20231212231706.2680890-4-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Two utilities to be used later:

can_modify_mm: checks the sealing flags for a given memory range.
can_modify_vma: checks the sealing flags for a given vma.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
 include/linux/mm.h | 18 ++++++++++++++++++
 mm/mseal.c         | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3d1120570de5..2435acc1f44f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3339,6 +3339,12 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 	return (vma->vm_seals & MM_SEAL_ALL);
 }
 
+extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
+		unsigned long end, unsigned long checkSeals);
+
+extern bool can_modify_vma(struct vm_area_struct *vma,
+		unsigned long checkSeals);
+
 #else
 static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
 {
@@ -3349,6 +3355,18 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 {
 	return 0;
 }
+
+static inline bool can_modify_mm(struct mm_struct *mm, unsigned long start,
+		unsigned long end, unsigned long checkSeals)
+{
+	return true;
+}
+
+static inline bool can_modify_vma(struct vm_area_struct *vma,
+		unsigned long checkSeals)
+{
+	return true;
+}
 #endif
 
 /* These take the mm semaphore themselves */
diff --git a/mm/mseal.c b/mm/mseal.c
index 13bbe9ef5883..d12aa628ebdc 100644
--- a/mm/mseal.c
+++ b/mm/mseal.c
@@ -26,6 +26,44 @@ static bool can_do_mseal(unsigned long types, unsigned long flags)
 	return true;
 }
 
+/*
+ * Check if a vma is sealed for modification.
+ * Return true if modification is allowed.
+ */
+bool can_modify_vma(struct vm_area_struct *vma,
+		unsigned long checkSeals)
+{
+	if (checkSeals & vma_seals(vma))
+		return false;
+
+	return true;
+}
+
+/*
+ * Check if the vmas of a memory range are allowed to be modified.
+ * The memory range can have a gap (unallocated memory).
+ * Return true if it is allowed.
+ */
+bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end,
+		unsigned long checkSeals)
+{
+	struct vm_area_struct *vma;
+
+	VMA_ITERATOR(vmi, mm, start);
+
+	if (!checkSeals)
+		return true;
+
+	/* Going through each vma to check. */
+	for_each_vma_range(vmi, vma, end) {
+		if (!can_modify_vma(vma, checkSeals))
+			return false;
+	}
+
+	/* Allow by default. */
+	return true;
+}
+
 /*
  * Check if a seal type can be added to VMA.
  */

From patchwork Tue Dec 12 23:16:58 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 04/11] mseal: add MM_SEAL_BASE
Date: Tue, 12 Dec 2023 23:16:58 +0000
Message-ID: <20231212231706.2680890-5-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

The base package includes the features common to all VMA sealing types. It prevents a sealed VMA from:

1> Being unmapped, moved to another location, or shrunk via munmap() and mremap(): these operations can leave an empty space that can then be filled by a VMA with a new set of attributes.
2> Having a different vma moved or expanded into its current location, via mremap().

3> Being modified via mmap(MAP_FIXED).

4> Size expansion via mremap(): this does not appear to pose any specific risk to a sealed VMA, but it is included anyway because the use case is unclear. In any case, users can rely on merging to expand a sealed VMA.

We consider MM_SEAL_BASE the feature on which other sealing features will depend. For instance, it probably does not make sense to seal PROT_PKEY without sealing the BASE, so the kernel will implicitly add SEAL_BASE for SEAL_PROT_PKEY. (If applications want to relax this in the future, we could use the flags field in mseal() to override the behavior of implicitly adding SEAL_BASE.)

Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
 mm/mmap.c   | 23 +++++++++++++++++++++++
 mm/mremap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 42462c2a0c35..dbc557bd460c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1259,6 +1259,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		return -EEXIST;
 	}
 
+	/*
+	 * Check if the address range is sealed for do_mmap().
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, addr, addr + len, MM_SEAL_BASE))
+		return -EACCES;
+
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
@@ -2632,6 +2639,14 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 	if (end == start)
 		return -EINVAL;
 
+	/*
+	 * Check if memory is sealed before arch_unmap.
+	 * Prevent unmapping a sealed VMA.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, start, end, MM_SEAL_BASE))
+		return -EACCES;
+
 	/* arch_unmap() might do unmaps itself. */
 	arch_unmap(mm, start, end);
 
@@ -3053,6 +3068,14 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 
+	/*
+	 * Check if memory is sealed before arch_unmap.
+	 * Prevent unmapping a sealed VMA.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, start, end, MM_SEAL_BASE))
+		return -EACCES;
+
 	arch_unmap(mm, start, end);
 	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
 }
diff --git a/mm/mremap.c b/mm/mremap.c
index 382e81c33fc4..ff7429bfbbe1 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -835,7 +835,35 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 	if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
 		return -ENOMEM;
 
+	/*
+	 * In mremap_to(), which moves a VMA to another address,
+	 * check if the src address is sealed; if so, reject.
+	 * In other words, prevent a sealed VMA from being moved
+	 * to another address.
+	 *
+	 * can_modify_mm is placed here because mremap_to()
+	 * does its own checking for the address range, and we only
+	 * check the sealing after passing those checks.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_BASE))
+		return -EACCES;
+
 	if (flags & MREMAP_FIXED) {
+		/*
+		 * In mremap_to(), which moves a VMA to another address,
+		 * check if the dst address is sealed; if so, reject.
+		 * In other words, prevent moving a vma onto a sealed VMA.
+		 *
+		 * can_modify_mm is placed here because mremap_to() does its
+		 * own checking for the address, and we only check the sealing
+		 * after passing those checks.
+		 * can_modify_mm assumes we have acquired the lock on MM.
+		 */
+		if (!can_modify_mm(mm, new_addr, new_addr + new_len,
+				MM_SEAL_BASE))
+			return -EACCES;
+
 		ret = do_munmap(mm, new_addr, new_len, uf_unmap_early);
 		if (ret)
 			goto out;
@@ -994,6 +1022,20 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 		goto out;
 	}
 
+	/*
+	 * This is the shrink/expand case (not mremap_to()).
+	 * Check if the src address is sealed; if so, reject.
+	 * In other words, prevent shrinking or expanding a sealed VMA.
+	 *
+	 * can_modify_mm is placed here so we can keep the logic related to
+	 * shrink/expand together. Perhaps we can extract the below to be its
+	 * own function in the future.
+	 */
+	if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_BASE)) {
+		ret = -EACCES;
+		goto out;
+	}
+
 	/*
 	 * Always allow a shrinking remap: that just unmaps
 	 * the unnecessary pages..

From patchwork Tue Dec 12 23:16:59 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 05/11] mseal: add MM_SEAL_PROT_PKEY
Date: Tue, 12 Dec 2023 23:16:59 +0000
Message-ID: <20231212231706.2680890-6-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Seal the PROT and PKEY of an address range; in other words, mprotect() and pkey_mprotect() will be denied if the memory is sealed with MM_SEAL_PROT_PKEY.
Signed-off-by: Jeff Xu
---
 mm/mprotect.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index b94fbb45d5c7..1527188b1e92 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -753,6 +754,15 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 		}
 	}

+	/*
+	 * checking if PROT and PKEY is sealed.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(current->mm, start, end, MM_SEAL_PROT_PKEY)) {
+		error = -EACCES;
+		goto out;
+	}
+
 	prev = vma_prev(&vmi);
 	if (start > vma->vm_start)
 		prev = vma;

From patchwork Tue Dec 12 23:17:00 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 06/11] mseal: add sealing support for mmap
Date: Tue, 12 Dec 2023 23:17:00 +0000
Message-ID: <20231212231706.2680890-7-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Allow mmap() to set the sealing type when creating a mapping. This is
useful as an optimization because it avoids making two system calls:
one for mmap() and one for mseal().
With this change, mmap() can take an input that specifies the sealing
type, so only one system call is needed.

This patch uses the "prot" field of mmap() to set the sealing. Three
sealing types are added to match the MM_SEAL_xyz types in mseal():

PROT_SEAL_SEAL
PROT_SEAL_BASE
PROT_SEAL_PROT_PKEY

We also considered using MAP_SEAL_xyz in the "flags" field of mmap().
However, that field is more about the type of mapping, such as
MAP_FIXED_NOREPLACE, so the "prot" field seems more appropriate for
our case.

It is worth noting that even though the sealing type is set via the
"prot" field in mmap(), we don't require it to be set in the "prot"
field of a later mprotect() call. This is unlike the PROT_READ,
PROT_WRITE and PROT_EXEC bits: e.g. if PROT_WRITE is not set in
mprotect(), the region is not writable. In other words, if you set
PROT_SEAL_PROT_PKEY in mmap(), you don't need to set it again in
mprotect(). With the current approach, mseal() is used to set sealing
on an existing VMA.

Signed-off-by: Jeff Xu
Suggested-by: Pedro Falcato
---
 arch/mips/kernel/vdso.c                | 10 +++-
 include/linux/mm.h                     | 63 +++++++++++++++++++++++++-
 include/uapi/asm-generic/mman-common.h | 13 ++++++
 mm/mmap.c                              | 25 ++++++++--
 4 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index f6d40e43f108..6d1103d36af1 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -98,11 +98,17 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		return -EINTR;

 	if (IS_ENABLED(CONFIG_MIPS_FP_SUPPORT)) {
-		/* Map delay slot emulation page */
+		/*
+		 * Map delay slot emulation page.
+		 *
+		 * Note: passing vm_seals = 0
+		 * Don't support sealing for vdso for now.
+		 * This might change when we add sealing support for vdso.
+		 */
 		base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 				   VM_READ | VM_EXEC |
 				   VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
-				   0, NULL);
+				   0, NULL, 0);
 		if (IS_ERR_VALUE(base)) {
 			ret = base;
 			goto out;

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2435acc1f44f..5d3ee79f1438 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -266,6 +266,15 @@ extern unsigned int kobjsize(const void *objp);
 		MM_SEAL_BASE | \
 		MM_SEAL_PROT_PKEY)

+/*
+ * PROT_SEAL_ALL is all supported flags in mmap().
+ * See include/uapi/asm-generic/mman-common.h.
+ */
+#define PROT_SEAL_ALL ( \
+	PROT_SEAL_SEAL | \
+	PROT_SEAL_BASE | \
+	PROT_SEAL_PROT_PKEY)
+
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
  * When changing, update also include/trace/events/mmflags.h.
@@ -3290,7 +3299,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long vm_seals);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
@@ -3339,12 +3348,47 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 	return (vma->vm_seals & MM_SEAL_ALL);
 }

+static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals)
+{
+	vma->vm_seals |= vm_seals;
+}
+
 extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
 	unsigned long end, unsigned long checkSeals);

 extern bool can_modify_vma(struct vm_area_struct *vma,
 	unsigned long checkSeals);

+/*
+ * Convert prot field of mmap to vm_seals type.
+ */
+static inline unsigned long convert_mmap_seals(unsigned long prot)
+{
+	unsigned long seals = 0;
+
+	/*
+	 * set SEAL_PROT_PKEY implies SEAL_BASE.
+	 */
+	if (prot & PROT_SEAL_PROT_PKEY)
+		prot |= PROT_SEAL_BASE;
+
+	/*
+	 * The seal bits start from bit 26 of the "prot" field of mmap.
+	 * see comments in include/uapi/asm-generic/mman-common.h.
+	 */
+	seals = (prot & PROT_SEAL_ALL) >> PROT_SEAL_BIT_BEGIN;
+	return seals;
+}
+
+/*
+ * check input sealing type from the "prot" field of mmap().
+ * for CONFIG_MSEAL case, this always return 0 (successful).
+ */
+static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
+{
+	*vm_seals = convert_mmap_seals(prot);
+	return 0;
+}
 #else
 static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
 {
@@ -3367,6 +3411,23 @@ static inline bool can_modify_vma(struct vm_area_struct *vma,
 {
 	return true;
 }
+
+static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals)
+{
+}
+
+/*
+ * check input sealing type from the "prot" field of mmap().
+ * For not CONFIG_MSEAL, if SEAL flag is set, it will return failure.
+ */
+static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
+{
+	if (prot & PROT_SEAL_ALL)
+		return -EINVAL;
+
+	return 0;
+}
+
 #endif

 /* These take the mm semaphore themselves */

diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6ce1f1ceb432..f07ad9e70b3a 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -17,6 +17,19 @@
 #define PROT_GROWSDOWN	0x01000000	/* mprotect flag: extend change to start of growsdown vma */
 #define PROT_GROWSUP	0x02000000	/* mprotect flag: extend change to end of growsup vma */

+/*
+ * The PROT_SEAL_XX defines memory sealing flags in the prot argument
+ * of mmap(). The bits currently take consecutive bits and match
+ * the same sequence as the MM_SEAL_XX type; this allows convert_mmap_seals()
+ * to convert prot to MM_SEAL_XX type using bit operations.
+ * The include/uapi/linux/mman.h header file defines the MM_SEAL_XX type,
+ * which is used by the mseal() system call.
+ */
+#define PROT_SEAL_BIT_BEGIN	26
+#define PROT_SEAL_SEAL		_BITUL(PROT_SEAL_BIT_BEGIN)		/* 0x04000000 seal seal */
+#define PROT_SEAL_BASE		_BITUL(PROT_SEAL_BIT_BEGIN + 1)	/* 0x08000000 base for all sealing types */
+#define PROT_SEAL_PROT_PKEY	_BITUL(PROT_SEAL_BIT_BEGIN + 2)	/* 0x10000000 seal prot and pkey */
+
 /* 0x01 - 0x03 are defined in linux/mman.h */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */

diff --git a/mm/mmap.c b/mm/mmap.c
index dbc557bd460c..3e1bf5a131b0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1211,6 +1211,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	int pkey = 0;
+	unsigned long vm_seals = 0;

 	*populate = 0;

@@ -1231,6 +1232,9 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (flags & MAP_FIXED_NOREPLACE)
 		flags |= MAP_FIXED;

+	if (check_mmap_seals(prot, &vm_seals) < 0)
+		return -EINVAL;
+
 	if (!(flags & MAP_FIXED))
 		addr = round_hint_to_min(addr);

@@ -1381,7 +1385,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		vm_flags |= VM_NORESERVE;
 	}

-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, vm_seals);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -2679,7 +2683,7 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long vm_seals)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
@@ -2723,7 +2727,13 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	next = vma_next(&vmi);
 	prev = vma_prev(&vmi);

-	if (vm_flags & VM_SPECIAL) {
+	/*
+	 * For now, sealed VMA doesn't merge with other VMA,
+	 * Will change this in later commit when we make sealed VMA
+	 * also mergeable.
+	 */
+	if ((vm_flags & VM_SPECIAL) ||
+	    (vm_seals & MM_SEAL_ALL)) {
 		if (prev)
 			vma_iter_next_range(&vmi);
 		goto cannot_expand;
@@ -2781,6 +2791,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vma->vm_page_prot = vm_get_page_prot(vm_flags);
 	vma->vm_pgoff = pgoff;

+	update_vma_seals(vma, vm_seals);
+
 	if (file) {
 		if (vm_flags & VM_SHARED) {
 			error = mapping_map_writable(file->f_mapping);
@@ -2992,6 +3004,13 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (pgoff + (size >> PAGE_SHIFT) < pgoff)
 		return ret;

+	/*
+	 * Do not support sealing in remap_file_page.
+	 * sealing is set via mmap() and mseal().
+	 */
+	if (prot & PROT_SEAL_ALL)
+		return ret;
+
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;

From patchwork Tue Dec 12 23:17:01 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 07/11] mseal: make sealed VMA mergeable.
Date: Tue, 12 Dec 2023 23:17:01 +0000
Message-ID: <20231212231706.2680890-8-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Add merge/split handling for the mlock/madvise/mprotect/mmap cases.
Make a sealed VMA mergeable with adjacent VMAs. This is so that we
don't run out of VMAs: there is a maximum number of VMAs per process.
Signed-off-by: Jeff Xu
Suggested-by: Jann Horn
---
 fs/userfaultfd.c   |  8 +++++---
 include/linux/mm.h | 31 +++++++++++++------------------
 mm/madvise.c       |  2 +-
 mm/mempolicy.c     |  2 +-
 mm/mlock.c         |  2 +-
 mm/mmap.c          | 44 +++++++++++++++++++++-----------------------
 mm/mprotect.c      |  2 +-
 mm/mremap.c        |  2 +-
 mm/mseal.c         | 23 ++++++++++++++++++-----
 9 files changed, 62 insertions(+), 54 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 56eaae9dac1a..8ebee7c1c6cf 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -926,7 +926,8 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 				 new_flags, vma->anon_vma,
 				 vma->vm_file, vma->vm_pgoff,
 				 vma_policy(vma),
-				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
+				 NULL_VM_UFFD_CTX, anon_vma_name(vma),
+				 vma_seals(vma));
 		if (prev) {
 			vma = prev;
 		} else {
@@ -1483,7 +1484,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 					 vma->anon_vma, vma->vm_file, pgoff,
 					 vma_policy(vma),
 					 ((struct vm_userfaultfd_ctx){ ctx }),
-					 anon_vma_name(vma));
+					 anon_vma_name(vma), vma_seals(vma));
 		if (prev) {
 			/* vma_merge() invalidated the mas */
 			vma = prev;
@@ -1668,7 +1669,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
 				 vma->anon_vma, vma->vm_file, pgoff,
 				 vma_policy(vma),
-				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
+				 NULL_VM_UFFD_CTX, anon_vma_name(vma),
+				 vma_seals(vma));
 		if (prev) {
 			vma = prev;
 			goto next;

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5d3ee79f1438..1f162bb5b38d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3243,7 +3243,7 @@ extern struct vm_area_struct *vma_merge(struct vma_iterator *vmi,
 	struct mm_struct *, struct vm_area_struct *prev, unsigned long addr,
 	unsigned long end, unsigned long vm_flags, struct anon_vma *,
 	struct file *, pgoff_t, struct mempolicy *, struct vm_userfaultfd_ctx,
-	struct anon_vma_name *);
+	struct anon_vma_name *, unsigned long vm_seals);
 extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
 extern int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *,
 	unsigned long addr, int new_below);
@@ -3327,19 +3327,6 @@ static inline void mm_populate(unsigned long addr, unsigned long len) {}
 #endif

 #ifdef CONFIG_MSEAL
-static inline bool check_vma_seals_mergeable(unsigned long vm_seals)
-{
-	/*
-	 * Set sealed VMA not mergeable with another VMA for now.
-	 * This will be changed in later commit to make sealed
-	 * VMA also mergeable.
-	 */
-	if (vm_seals & MM_SEAL_ALL)
-		return false;
-
-	return true;
-}
-
 /*
  * return the valid sealing (after mask).
  */
@@ -3353,6 +3340,14 @@ static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm
 	vma->vm_seals |= vm_seals;
 }

+static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2)
+{
+	if ((vm_seals1 & MM_SEAL_ALL) != (vm_seals2 & MM_SEAL_ALL))
+		return false;
+
+	return true;
+}
+
 extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
 	unsigned long end, unsigned long checkSeals);
@@ -3390,14 +3385,14 @@ static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
 	return 0;
 }
 #else
-static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
 {
-	return true;
+	return 0;
 }

-static inline unsigned long vma_seals(struct vm_area_struct *vma)
+static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2)
 {
-	return 0;
+	return true;
 }

 static inline bool can_modify_mm(struct mm_struct *mm, unsigned long start,

diff --git a/mm/madvise.c b/mm/madvise.c
index 4dded5d27e7e..e2d219a4b6ef 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -152,7 +152,7 @@ static int madvise_update_vma(struct vm_area_struct *vma,
 	pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
 	*prev = vma_merge(&vmi, mm, *prev, start, end, new_flags,
 			  vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
-			  vma->vm_userfaultfd_ctx, anon_name);
+			  vma->vm_userfaultfd_ctx, anon_name, vma_seals(vma));
 	if (*prev) {
 		vma = *prev;
 		goto success;

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e52e3a0b8f2e..e70b69c64564 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -836,7 +836,7 @@ static int mbind_range(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	pgoff = vma->vm_pgoff + ((vmstart - vma->vm_start) >> PAGE_SHIFT);
 	merged = vma_merge(vmi, vma->vm_mm, *prev, vmstart, vmend, vma->vm_flags,
 			   vma->anon_vma, vma->vm_file, pgoff, new_pol,
-			   vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+			   vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma));
 	if (merged) {
 		*prev = merged;
 		return vma_replace_policy(merged, new_pol);

diff --git a/mm/mlock.c b/mm/mlock.c
index 06bdfab83b58..b537a2cbd337 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -428,7 +428,7 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
 	*prev = vma_merge(vmi, mm, *prev, start, end, newflags, vma->anon_vma,
 			  vma->vm_file, pgoff, vma_policy(vma),
-			  vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+			  vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma));
 	if (*prev) {
 		vma = *prev;
 		goto success;

diff --git a/mm/mmap.c b/mm/mmap.c
index 3e1bf5a131b0..6da8d83f2e66 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -720,7 +720,8 @@ int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
 static inline bool is_mergeable_vma(struct vm_area_struct *vma,
 		struct file *file, unsigned long vm_flags,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name, bool may_remove_vma)
+		struct anon_vma_name *anon_name, bool may_remove_vma,
+		unsigned long vm_seals)
 {
 	/*
 	 * VM_SOFTDIRTY should not prevent from VMA merging, if we
@@ -740,7 +741,7 @@ static inline bool is_mergeable_vma(struct vm_area_struct *vma,
 		return false;
 	if (!anon_vma_name_eq(anon_vma_name(vma), anon_name))
 		return false;
-	if (!check_vma_seals_mergeable(vma_seals(vma)))
+	if (!check_vma_seals_mergeable(vma_seals(vma), vm_seals))
 		return false;

 	return true;
@@ -776,9 +777,10 @@ static bool can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags,
 		struct anon_vma *anon_vma, struct file *file,
 		pgoff_t vm_pgoff,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name)
+		struct anon_vma_name *anon_name, unsigned long vm_seals)
 {
-	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, true) &&
+	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx,
+			     anon_name, true, vm_seals) &&
 	    is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) {
 		if (vma->vm_pgoff == vm_pgoff)
 			return true;
@@ -799,9 +801,10 @@ static bool can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
 		struct anon_vma *anon_vma, struct file *file,
 		pgoff_t vm_pgoff,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name)
+		struct anon_vma_name *anon_name, unsigned long vm_seals)
 {
-	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, false) &&
+	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx,
+			     anon_name, false, vm_seals) &&
 	    is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) {
 		pgoff_t vm_pglen;
 		vm_pglen = vma_pages(vma);
@@ -869,7 +872,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		struct anon_vma *anon_vma, struct file *file,
 		pgoff_t pgoff, struct mempolicy *policy,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name)
+		struct anon_vma_name *anon_name, unsigned long vm_seals)
 {
 	struct vm_area_struct *curr, *next, *res;
 	struct vm_area_struct *vma, *adjust, *remove, *remove2;
@@ -908,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	/* Can we merge the predecessor? */
 	if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
 	    && can_vma_merge_after(prev, vm_flags, anon_vma, file,
-				   pgoff, vm_userfaultfd_ctx, anon_name)) {
+				   pgoff, vm_userfaultfd_ctx, anon_name, vm_seals)) {
 		merge_prev = true;
 		vma_prev(vmi);
 	}
@@ -917,7 +920,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	/* Can we merge the successor? */
 	if (next && mpol_equal(policy, vma_policy(next)) &&
 	    can_vma_merge_before(next, vm_flags, anon_vma, file, pgoff+pglen,
-				 vm_userfaultfd_ctx, anon_name)) {
+				 vm_userfaultfd_ctx, anon_name, vm_seals)) {
 		merge_next = true;
 	}
@@ -2727,13 +2730,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	next = vma_next(&vmi);
 	prev = vma_prev(&vmi);
-	/*
-	 * For now, sealed VMA doesn't merge with other VMA,
-	 * Will change this in later commit when we make sealed VMA
-	 * also mergeable.
-	 */
-	if ((vm_flags & VM_SPECIAL) ||
-	    (vm_seals & MM_SEAL_ALL)) {
+
+	if (vm_flags & VM_SPECIAL) {
 		if (prev)
 			vma_iter_next_range(&vmi);
 		goto cannot_expand;
@@ -2743,7 +2741,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	/* Check next */
 	if (next && next->vm_start == end && !vma_policy(next) &&
 	    can_vma_merge_before(next, vm_flags, NULL, file, pgoff+pglen,
-				 NULL_VM_UFFD_CTX, NULL)) {
+				 NULL_VM_UFFD_CTX, NULL, vm_seals)) {
 		merge_end = next->vm_end;
 		vma = next;
 		vm_pgoff = next->vm_pgoff - pglen;
@@ -2752,9 +2750,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	/* Check prev */
 	if (prev && prev->vm_end == addr && !vma_policy(prev) &&
 	    (vma ? can_vma_merge_after(prev, vm_flags, vma->anon_vma, file,
-				       pgoff, vma->vm_userfaultfd_ctx, NULL) :
+				       pgoff, vma->vm_userfaultfd_ctx, NULL, vm_seals) :
 	     can_vma_merge_after(prev, vm_flags, NULL, file, pgoff,
-				 NULL_VM_UFFD_CTX, NULL))) {
+				 NULL_VM_UFFD_CTX, NULL, vm_seals))) {
 		merge_start = prev->vm_start;
 		vma = prev;
 		vm_pgoff = prev->vm_pgoff;
@@ -2822,7 +2820,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		merge = vma_merge(&vmi, mm, prev, vma->vm_start,
 			vma->vm_end, vma->vm_flags, NULL, vma->vm_file,
 			vma->vm_pgoff, NULL,
-			NULL_VM_UFFD_CTX, NULL);
+			NULL_VM_UFFD_CTX, NULL, vma_seals(vma));
 		if (merge) {
 			/*
 			 * ->mmap() can change vma->vm_file and fput
@@ -3130,14 +3128,14 @@ static int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT))
 		return -ENOMEM;
-
 	/*
 	 * Expand the existing vma if possible; Note that singular lists do not
 	 * occur after forking, so the expand will only happen on new VMAs.
 	 */
 	if (vma && vma->vm_end == addr && !vma_policy(vma) &&
 	    can_vma_merge_after(vma, flags, NULL, NULL,
-				addr >> PAGE_SHIFT, NULL_VM_UFFD_CTX, NULL)) {
+				addr >> PAGE_SHIFT, NULL_VM_UFFD_CTX, NULL,
+				vma_seals(vma))) {
 		vma_iter_config(vmi, vma->vm_start, addr + len);
 		if (vma_iter_prealloc(vmi, vma))
 			goto unacct_fail;
@@ -3380,7 +3378,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	new_vma = vma_merge(&vmi, mm, prev, addr, addr + len, vma->vm_flags,
 			    vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
-			    vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+			    vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma));
 	if (new_vma) {
 		/*
 		 * Source vma may have been merged into new_vma

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 1527188b1e92..a4c90e71607b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -632,7 +632,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 	pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
 	*pprev = vma_merge(vmi, mm,
*pprev, start, end, newflags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma)); if (*pprev) { vma = *pprev; VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); diff --git a/mm/mremap.c b/mm/mremap.c index ff7429bfbbe1..357efd6b48b9 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -1098,7 +1098,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, vma = vma_merge(&vmi, mm, vma, extension_start, extension_end, vma->vm_flags, vma->anon_vma, vma->vm_file, extension_pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma)); if (!vma) { vm_unacct_memory(pages); ret = -ENOMEM; diff --git a/mm/mseal.c b/mm/mseal.c index d12aa628ebdc..3b90dce7d20e 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -7,8 +7,10 @@ * Author: Jeff Xu */ +#include #include #include +#include #include #include #include "internal.h" @@ -81,14 +83,25 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, struct vm_area_struct **prev, unsigned long start, unsigned long end, unsigned long addtypes) { + pgoff_t pgoff; int ret = 0; + unsigned long newtypes = vma_seals(vma) | addtypes; + + if (newtypes != vma_seals(vma)) { + /* + * Attempt to merge with prev and next vma. + */ + pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); + *prev = vma_merge(vmi, vma->vm_mm, *prev, start, end, vma->vm_flags, + vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), + vma->vm_userfaultfd_ctx, anon_vma_name(vma), newtypes); + if (*prev) { + vma = *prev; + goto out; + } - if (addtypes & ~(vma_seals(vma))) { /* * Handle split at start and end. - * For now sealed VMA doesn't merge with other VMAs. - * This will be updated in later commit to make - * sealed VMA also mergeable. 
*/ if (start != vma->vm_start) { ret = split_vma(vmi, vma, start, 1); @@ -102,7 +115,7 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, goto out; } - vma->vm_seals |= addtypes; + vma->vm_seals = newtypes; } out:

From patchwork Tue Dec 12 23:17:02 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 177661
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 08/11] mseal: add MM_SEAL_DISCARD_RO_ANON
Date: Tue, 12 Dec 2023 23:17:02 +0000
Message-ID: <20231212231706.2680890-9-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Certain types of madvise() operations are destructive, such as MADV_DONTNEED, which can effectively alter region contents by discarding pages, especially when the memory is anonymous. This patch blocks such operations for anonymous memory that is not writable to the user.
The MM_SEAL_DISCARD_RO_ANON seal blocks such operations when the user does not have write access to the memory and the memory is anonymous. We do not think such sealing is useful for file-backed mappings, because the contents can be repopulated from the underlying mapped file. We also do not think it is useful when the user can write to the memory, because then the attacker can also write. Signed-off-by: Jeff Xu Suggested-by: Jann Horn Suggested-by: Stephen Röttger --- include/linux/mm.h | 19 +++++-- include/uapi/asm-generic/mman-common.h | 2 + include/uapi/linux/mman.h | 1 + mm/madvise.c | 12 +++++ mm/mseal.c | 73 ++++++++++++++++++++++++-- 5 files changed, 98 insertions(+), 9 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 1f162bb5b38d..50dda474acc2 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -264,7 +264,8 @@ extern unsigned int kobjsize(const void *objp); #define MM_SEAL_ALL ( \ MM_SEAL_SEAL | \ MM_SEAL_BASE | \ - MM_SEAL_PROT_PKEY) + MM_SEAL_PROT_PKEY | \ + MM_SEAL_DISCARD_RO_ANON) /* * PROT_SEAL_ALL is all supported flags in mmap(). @@ -273,7 +274,8 @@ extern unsigned int kobjsize(const void *objp); #define PROT_SEAL_ALL ( \ PROT_SEAL_SEAL | \ PROT_SEAL_BASE | \ - PROT_SEAL_PROT_PKEY) + PROT_SEAL_PROT_PKEY | \ + PROT_SEAL_DISCARD_RO_ANON) /* * vm_flags in vm_area_struct, see mm_types.h. @@ -3354,6 +3356,9 @@ extern bool can_modify_mm(struct mm_struct *mm, unsigned long start, extern bool can_modify_vma(struct vm_area_struct *vma, unsigned long checkSeals); +extern bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, + unsigned long end, int behavior); + /* * Convert prot field of mmap to vm_seals type. */ @@ -3362,9 +3367,9 @@ static inline unsigned long convert_mmap_seals(unsigned long prot) unsigned long seals = 0; /* - * set SEAL_PROT_PKEY implies SEAL_BASE. + * set SEAL_PROT_PKEY or SEAL_DISCARD_RO_ANON implies SEAL_BASE.
*/ - if (prot & PROT_SEAL_PROT_PKEY) + if (prot & (PROT_SEAL_PROT_PKEY | PROT_SEAL_DISCARD_RO_ANON)) prot |= PROT_SEAL_BASE; /* @@ -3407,6 +3412,12 @@ static inline bool can_modify_vma(struct vm_area_struct *vma, return true; } +static inline bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, + unsigned long end, int behavior) +{ + return true; +} + static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals) { } diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index f07ad9e70b3a..bf503962409a 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -29,6 +29,8 @@ #define PROT_SEAL_SEAL _BITUL(PROT_SEAL_BIT_BEGIN) /* 0x04000000 seal seal */ #define PROT_SEAL_BASE _BITUL(PROT_SEAL_BIT_BEGIN + 1) /* 0x08000000 base for all sealing types */ #define PROT_SEAL_PROT_PKEY _BITUL(PROT_SEAL_BIT_BEGIN + 2) /* 0x10000000 seal prot and pkey */ +/* seal destructive madvise for non-writeable anonymous memory. */ +#define PROT_SEAL_DISCARD_RO_ANON _BITUL(PROT_SEAL_BIT_BEGIN + 3) /* 0x20000000 */ /* 0x01 - 0x03 are defined in linux/mman.h */ #define MAP_TYPE 0x0f /* Mask for type of mapping */ diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h index f561652886c4..3872cc118c8a 100644 --- a/include/uapi/linux/mman.h +++ b/include/uapi/linux/mman.h @@ -58,5 +58,6 @@ struct cachestat { #define MM_SEAL_SEAL _BITUL(0) #define MM_SEAL_BASE _BITUL(1) #define MM_SEAL_PROT_PKEY _BITUL(2) +#define MM_SEAL_DISCARD_RO_ANON _BITUL(3) #endif /* _UAPI_LINUX_MMAN_H */ diff --git a/mm/madvise.c b/mm/madvise.c index e2d219a4b6ef..ff038e323779 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1403,6 +1403,7 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, * -EIO - an I/O error occurred while paging in data. * -EBADF - map exists, but area maps something that isn't a file. * -EAGAIN - a kernel resource was temporarily unavailable. 
+ * -EACCES - memory is sealed. */ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior) { @@ -1446,10 +1447,21 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh start = untagged_addr_remote(mm, start); end = start + len; + /* + * Check if the address range is sealed for do_madvise(). + * can_modify_mm_madv assumes we have acquired the lock on MM. + */ + if (!can_modify_mm_madv(mm, start, end, behavior)) { + error = -EACCES; + goto out; + } + blk_start_plug(&plug); error = madvise_walk_vmas(mm, start, end, behavior, madvise_vma_behavior); blk_finish_plug(&plug); + +out: if (write) mmap_write_unlock(mm); else diff --git a/mm/mseal.c b/mm/mseal.c index 3b90dce7d20e..294f48d33db6 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include "internal.h" @@ -66,6 +67,55 @@ bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end, return true; } +static bool is_madv_discard(int behavior) +{ + return behavior & + (MADV_FREE | MADV_DONTNEED | MADV_DONTNEED_LOCKED | + MADV_REMOVE | MADV_DONTFORK | MADV_WIPEONFORK); +} + +static bool is_ro_anon(struct vm_area_struct *vma) +{ + /* check anonymous mapping. */ + if (vma->vm_file || vma->vm_flags & VM_SHARED) + return false; + + /* + * check for non-writable: + * PROT=RO or PKRU is not writeable. + */ + if (!(vma->vm_flags & VM_WRITE) || + !arch_vma_access_permitted(vma, true, false, false)) + return true; + + return false; +} + +/* + * Check if the vmas of a memory range are allowed to be modified by madvise. + * The memory range can have a gap (unallocated memory). + * Return true if it is allowed. + */ +bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long end, + int behavior) +{ + struct vm_area_struct *vma; + + VMA_ITERATOR(vmi, mm, start); + + if (!is_madv_discard(behavior)) + return true; + + /* going through each vma to check.
*/ + for_each_vma_range(vmi, vma, end) + if (is_ro_anon(vma) && !can_modify_vma( + vma, MM_SEAL_DISCARD_RO_ANON)) + return false; + + /* Allow by default. */ + return true; +} + /* * Check if a seal type can be added to VMA. */ @@ -76,6 +126,12 @@ static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals (newSeals & ~(vma_seals(vma)))) return false; + /* + * For simplicity, we allow adding all sealing types during mmap or mseal. + * The actual sealing check happens later, at the time of the particular action. + * E.g. for MM_SEAL_DISCARD_RO_ANON, we always allow adding it; at the + * time of the madvise() call, we check whether the sealing conditions are met. + */ return true; } @@ -225,15 +281,22 @@ static int apply_mm_seal(unsigned long start, unsigned long end, * mprotect() and pkey_mprotect() will be denied if the memory is * sealed with MM_SEAL_PROT_PKEY. * - * The MM_SEAL_SEAL - * MM_SEAL_SEAL denies adding a new seal for an VMA. - * * The kernel will remember which seal types are applied, and the * application doesn’t need to repeat all existing seal types in * the next mseal(). Once a seal type is applied, it can’t be * unsealed. Call mseal() on an existing seal type is a no-action, * not a failure. * + * MM_SEAL_DISCARD_RO_ANON: block some destructive madvise() + * behavior, such as MADV_DONTNEED, which can effectively + * alter region contents by discarding pages; block such + * operations if users don't have write access to the memory and + * the memory is anonymous memory. + * Setting this implies MM_SEAL_BASE is also set. + * + * The MM_SEAL_SEAL + * MM_SEAL_SEAL denies adding a new seal for a VMA. + * * flags: reserved. * * return values: @@ -264,8 +327,8 @@ static int do_mseal(unsigned long start, size_t len_in, unsigned long types, struct mm_struct *mm = current->mm; size_t len; - /* MM_SEAL_BASE is set when other seal types are set.
*/ - if (types & MM_SEAL_PROT_PKEY) + /* MM_SEAL_BASE is set when other seal types are set */ + if (types & (MM_SEAL_PROT_PKEY | MM_SEAL_DISCARD_RO_ANON)) types |= MM_SEAL_BASE; if (!can_do_mseal(types, flags))

From patchwork Tue Dec 12 23:17:03 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 177666
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 09/11] mseal: add MAP_SEALABLE to mmap()
Date: Tue, 12 Dec 2023 23:17:03 +0000
Message-ID: <20231212231706.2680890-10-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

The MAP_SEALABLE flag is added to the flags field of mmap(). When present, it marks the map as sealable. A map created without MAP_SEALABLE will not support sealing; in other words, mseal() will fail for such a map. Applications that do not care about sealing will see their behavior unchanged.
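The opt-in rule proposed here can be sketched as a small standalone model. The MAP_SEALABLE, MM_SEAL_SEAL, and VM_SEALABLE values below mirror this RFC's proposal and are assumptions, not mainline kernel definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values mirror this RFC's proposal (hypothetical, not mainline). */
#define MAP_SEALABLE	0x8000000UL
#define MM_SEAL_SEAL	(1UL << 0)
#define VM_SEALABLE	(1UL << 31)

/* Sketch of the opt-in rule: a map becomes sealable if it either requested
 * a PROT_SEAL_* seal at mmap() time (vm_seals != 0) or passed MAP_SEALABLE
 * in the mmap() flags. */
static unsigned long sealable_bits(unsigned long vm_seals, unsigned long flags)
{
	if (vm_seals)
		return vm_seals | VM_SEALABLE;	/* sealing at mmap() implies sealable */
	return (flags & MAP_SEALABLE) ? VM_SEALABLE : 0;
}

/* mseal() is rejected on a map that never opted in. */
static bool can_seal(unsigned long vm_seals)
{
	return vm_seals & VM_SEALABLE;
}
```

This models why a map created without MAP_SEALABLE (and without any PROT_SEAL_* bit) can never be sealed later: the sealable bit is only set at creation time.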
For those that need sealing support, opt in by adding MAP_SEALABLE when creating the map. Signed-off-by: Jeff Xu --- include/linux/mm.h | 52 ++++++++++++++++++++++++-- include/linux/mm_types.h | 1 + include/uapi/asm-generic/mman-common.h | 1 + mm/mmap.c | 2 +- mm/mseal.c | 7 +++- 5 files changed, 57 insertions(+), 6 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 50dda474acc2..6f5dba9fbe21 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -267,6 +267,17 @@ extern unsigned int kobjsize(const void *objp); MM_SEAL_PROT_PKEY | \ MM_SEAL_DISCARD_RO_ANON) +/* define VM_SEALABLE in vm_seals of vm_area_struct. */ +#define VM_SEALABLE _BITUL(31) + +/* + * VM_SEALS_BITS_ALL marks the bits used for + * sealing in vm_seals of vm_area_struct. + */ +#define VM_SEALS_BITS_ALL ( \ + MM_SEAL_ALL | \ + VM_SEALABLE) + /* * PROT_SEAL_ALL is all supported flags in mmap(). * See include/uapi/asm-generic/mman-common.h. @@ -3330,9 +3341,17 @@ static inline void mm_populate(unsigned long addr, unsigned long len) {} #ifdef CONFIG_MSEAL /* - * return the valid sealing (after mask). + * return the valid sealing (after mask); this includes the sealable bit. */ static inline unsigned long vma_seals(struct vm_area_struct *vma) +{ + return (vma->vm_seals & VM_SEALS_BITS_ALL); +} + +/* + * return the enabled sealing type (after mask), without sealable bit.
+ */ +static inline unsigned long vma_enabled_seals(struct vm_area_struct *vma) { return (vma->vm_seals & MM_SEAL_ALL); } @@ -3342,9 +3361,14 @@ static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm vma->vm_seals |= vm_seals; } +static inline bool is_vma_sealable(struct vm_area_struct *vma) +{ + return vma->vm_seals & VM_SEALABLE; +} + static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2) { - if ((vm_seals1 & MM_SEAL_ALL) != (vm_seals2 & MM_SEAL_ALL)) + if ((vm_seals1 & VM_SEALS_BITS_ALL) != (vm_seals2 & VM_SEALS_BITS_ALL)) return false; return true; @@ -3384,9 +3408,15 @@ static inline unsigned long convert_mmap_seals(unsigned long prot) * check input sealing type from the "prot" field of mmap(). * for CONFIG_MSEAL case, this always return 0 (successful). */ -static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals) +static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals, + unsigned long flags) { *vm_seals = convert_mmap_seals(prot); + if (*vm_seals) + /* setting one of MM_SEAL_XX means the map is sealable. */ + *vm_seals |= VM_SEALABLE; + else + *vm_seals |= (flags & MAP_SEALABLE) ? VM_SEALABLE:0; return 0; } #else @@ -3395,6 +3425,16 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma) return 0; } +static inline unsigned long vma_enabled_seals(struct vm_area_struct *vma) +{ + return 0; +} + +static inline bool is_vma_sealable(struct vm_area_struct *vma) +{ + return false; +} + static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2) { return true; @@ -3426,11 +3466,15 @@ static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm * check input sealing type from the "prot" field of mmap(). * For not CONFIG_MSEAL, if SEAL flag is set, it will return failure. 
*/ -static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals) +static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals, + unsigned long flags) { if (prot & PROT_SEAL_ALL) return -EINVAL; + if (flags & MAP_SEALABLE) + return -EINVAL; + return 0; } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 052799173c86..c9b04c545f39 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -691,6 +691,7 @@ struct vm_area_struct { /* * bit masks for seal. * need this since vm_flags is full. + * We could merge this into vm_flags if vm_flags ever get expanded. */ unsigned long vm_seals; /* seal flags, see mm.h. */ #endif diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index bf503962409a..57ef4507c00b 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -47,6 +47,7 @@ #define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be * uninitialized */ +#define MAP_SEALABLE 0x8000000 /* map is sealable. */ /* * Flags for mlock diff --git a/mm/mmap.c b/mm/mmap.c index 6da8d83f2e66..6e35e2070060 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1235,7 +1235,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, if (flags & MAP_FIXED_NOREPLACE) flags |= MAP_FIXED; - if (check_mmap_seals(prot, &vm_seals) < 0) + if (check_mmap_seals(prot, &vm_seals, flags) < 0) return -EINVAL; if (!(flags & MAP_FIXED)) diff --git a/mm/mseal.c b/mm/mseal.c index 294f48d33db6..5d4cf71b497e 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -121,9 +121,13 @@ bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long */ static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals) { + /* if map is not sealable, reject. */ + if (!is_vma_sealable(vma)) + return false; + /* When SEAL_MSEAL is set, reject if a new type of seal is added. 
*/ if ((vma->vm_seals & MM_SEAL_SEAL) && - (newSeals & ~(vma_seals(vma))) + (newSeals & ~(vma_enabled_seals(vma)))) return false; /* @@ -185,6 +189,7 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, * 2> end is part of a valid vma. * 3> No gap (unallocated address) between start and end. * 4> requested seal type can be added in given address range. + * 5> map is sealable. */ static int check_mm_seal(unsigned long start, unsigned long end, unsigned long newtypes)

From patchwork Tue Dec 12 23:17:04 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 177665
GBLEh+LydO8yRUjNe8YQ6XAwd3KosQv+1hMeHeJ3hEx8JM9fvpgz6GpJuMQ/lyozZq/Y q99qMusgBOkStFIiDL4fklBDXSIZFcBJdXdklfcPcHcphhSrsbbQenmHao+St6tsDmEI /H545MXPgDvtDwVVrbFi3QtReVgs/d0XbrGs5VY1rOFa8t64Q7YXQvCvEwjRCJFDpkkT ez4OK4Jpr5mzTLzKu5RuS/lu1xTqT8vcUsL8jPr2E1s4ruz4MEPon71wLeRnPCY/+Rac 3sCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=I5GnwQeh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id i12-20020a170902c94c00b001cfb52101e3si8860597pla.413.2023.12.12.15.18.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Dec 2023 15:18:09 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=I5GnwQeh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 6326680B81D3; Tue, 12 Dec 2023 15:18:08 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1378111AbjLLXRs (ORCPT + 99 others); Tue, 12 Dec 2023 18:17:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1378017AbjLLXRV (ORCPT ); Tue, 12 Dec 2023 
18:17:21 -0500 Received: from mail-oo1-xc32.google.com (mail-oo1-xc32.google.com [IPv6:2607:f8b0:4864:20::c32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B7F610C for ; Tue, 12 Dec 2023 15:17:22 -0800 (PST) Received: by mail-oo1-xc32.google.com with SMTP id 006d021491bc7-5906eac104bso3636086eaf.2 for ; Tue, 12 Dec 2023 15:17:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1702423041; x=1703027841; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4ks0JloPGamaa61d4Hb0TBtMWFANdRTfuPlMlHWvZS0=; b=I5GnwQeh50tWvbB+BJ/bBm/w1ci8AMXDbCBFFTvoQNQJG42WZtTkSklbQG0eXIHW2h xn7DVR1UV3EXN+Xvhk3C4QNLqmA62/A6mBQ91I6C15/A83FkLXGcnhg0FnRrb3jH/TAI U2gpAeuWHKoWb53A4o7/m/erXG1peS3/LX74M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702423041; x=1703027841; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4ks0JloPGamaa61d4Hb0TBtMWFANdRTfuPlMlHWvZS0=; b=BGK9ACdVWE6MYr7sM1wHTI8ZUqcMuqC71zqokTSgTzv6+XLXeXhnynPkaViVY2kxld lvXmDm7T5gwyJvi2NqoUrFikcolsruCHTZ4U8TZZwkkTHn/pky/56IvflEWMOR0krChv sEL808vc5whrlbWrGk12iJv/+k/Fnv+WrZRNEgc12GFmCWgJY4j1BPA6pR/lJCYxJRKY adVNQq9sMbMtIVW2lzGiv45jJewMxdOHgL5yeoo00Z6L10s0PNKjdhZOTU1hwG3O/+00 NnebCxV30wX/vBJQjAaWx/8J0O5KyudYf+QJDk3R8ZxC/rh0VYsX0gczsN6jd68gHYrY fjbA== X-Gm-Message-State: AOJu0YyQJXTAt8L9j9kgPihLrjwbDkxBHI399IyGyVSHF3+6FDA+Kv/c 6wDk1XrH8KlgP1eKK01rESePOg== X-Received: by 2002:a05:6359:223:b0:170:ad0e:c224 with SMTP id ej35-20020a056359022300b00170ad0ec224mr8667929rwb.23.1702423041112; Tue, 12 Dec 2023 15:17:21 -0800 (PST) Received: from localhost (34.133.83.34.bc.googleusercontent.com. 
[34.83.133.34]) by smtp.gmail.com with UTF8SMTPSA id r3-20020aa79883000000b006cbb65edcbfsm8922291pfl.12.2023.12.12.15.17.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 12 Dec 2023 15:17:20 -0800 (PST) From: jeffxu@chromium.org To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu Subject: [RFC PATCH v3 10/11] selftest mm/mseal memory sealing Date: Tue, 12 Dec 2023 23:17:04 +0000 Message-ID: <20231212231706.2680890-11-jeffxu@chromium.org> X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org> References: <20231212231706.2680890-1-jeffxu@chromium.org> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Tue, 12 Dec 2023 15:18:08 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1785119993620724945 X-GMAIL-MSGID: 1785119993620724945 From: Jeff Xu selftest for memory sealing change in mmap() and mseal(). 
Signed-off-by: Jeff Xu --- tools/testing/selftests/mm/.gitignore | 1 + tools/testing/selftests/mm/Makefile | 1 + tools/testing/selftests/mm/config | 1 + tools/testing/selftests/mm/mseal_test.c | 2141 +++++++++++++++++++++++ 4 files changed, 2144 insertions(+) create mode 100644 tools/testing/selftests/mm/mseal_test.c diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore index cdc9ce4426b9..f0f22a649985 100644 --- a/tools/testing/selftests/mm/.gitignore +++ b/tools/testing/selftests/mm/.gitignore @@ -43,3 +43,4 @@ mdwe_test gup_longterm mkdirty va_high_addr_switch +mseal_test diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index 6a9fc5693145..0c086cecc093 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -59,6 +59,7 @@ TEST_GEN_FILES += mlock2-tests TEST_GEN_FILES += mrelease_test TEST_GEN_FILES += mremap_dontunmap TEST_GEN_FILES += mremap_test +TEST_GEN_FILES += mseal_test TEST_GEN_FILES += on-fault-limit TEST_GEN_FILES += thuge-gen TEST_GEN_FILES += transhuge-stress diff --git a/tools/testing/selftests/mm/config b/tools/testing/selftests/mm/config index be087c4bc396..cf2b8780e9b1 100644 --- a/tools/testing/selftests/mm/config +++ b/tools/testing/selftests/mm/config @@ -6,3 +6,4 @@ CONFIG_TEST_HMM=m CONFIG_GUP_TEST=y CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_MEM_SOFT_DIRTY=y +CONFIG_MSEAL=y diff --git a/tools/testing/selftests/mm/mseal_test.c b/tools/testing/selftests/mm/mseal_test.c new file mode 100644 index 000000000000..0692485d8b3c --- /dev/null +++ b/tools/testing/selftests/mm/mseal_test.c @@ -0,0 +1,2141 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include <sys/mman.h> +#include <sys/syscall.h> +#include <unistd.h> +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> +#include "../kselftest.h" +#include <string.h> +#include <stdint.h> +#include <assert.h> + +/* + * These definitions are needed for a manual build with gcc:
+ * gcc -I ../../../../usr/include -DDEBUG -O3 mseal_test.c -o mseal_test + */ +#ifndef MM_SEAL_SEAL +#define MM_SEAL_SEAL 0x1 +#endif + +#ifndef MM_SEAL_BASE +#define MM_SEAL_BASE 0x2 +#endif + +#ifndef MM_SEAL_PROT_PKEY +#define MM_SEAL_PROT_PKEY 0x4 +#endif + +#ifndef MM_SEAL_DISCARD_RO_ANON +#define MM_SEAL_DISCARD_RO_ANON 0x8 +#endif + +#ifndef MAP_SEALABLE +#define MAP_SEALABLE 0x8000000 +#endif + +#ifndef PROT_SEAL_SEAL +#define PROT_SEAL_SEAL 0x04000000 +#endif + +#ifndef PROT_SEAL_BASE +#define PROT_SEAL_BASE 0x08000000 +#endif + +#ifndef PROT_SEAL_PROT_PKEY +#define PROT_SEAL_PROT_PKEY 0x10000000 +#endif + +#ifndef PROT_SEAL_DISCARD_RO_ANON +#define PROT_SEAL_DISCARD_RO_ANON 0x20000000 +#endif + +#ifndef PKEY_DISABLE_ACCESS +# define PKEY_DISABLE_ACCESS 0x1 +#endif + +#ifndef PKEY_DISABLE_WRITE +# define PKEY_DISABLE_WRITE 0x2 +#endif + +#ifndef PKEY_BITS_PER_PKEY +#define PKEY_BITS_PER_PKEY 2 +#endif + +#ifndef PKEY_MASK +#define PKEY_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE) +#endif + +#ifndef DEBUG +#define LOG_TEST_ENTER() {} +#else +#define LOG_TEST_ENTER() {ksft_print_msg("%s\n", __func__); } +#endif + +#ifndef u64 +#define u64 unsigned long long +#endif + +/* + * define sys_xxx to call the syscalls directly.
+ */ +static int sys_mseal(void *start, size_t len, int types) +{ + int sret; + + errno = 0; + sret = syscall(__NR_mseal, start, len, types, 0); + return sret; +} + +int sys_mprotect(void *ptr, size_t size, unsigned long prot) +{ + int sret; + + errno = 0; + sret = syscall(SYS_mprotect, ptr, size, prot); + return sret; +} + +int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot, + unsigned long pkey) +{ + int sret; + + errno = 0; + sret = syscall(__NR_pkey_mprotect, ptr, size, orig_prot, pkey); + return sret; +} + +int sys_munmap(void *ptr, size_t size) +{ + int sret; + + errno = 0; + sret = syscall(SYS_munmap, ptr, size); + return sret; +} + +static int sys_madvise(void *start, size_t len, int types) +{ + int sret; + + errno = 0; + sret = syscall(__NR_madvise, start, len, types); + return sret; +} + +int sys_pkey_alloc(unsigned long flags, unsigned long init_val) +{ + int ret = syscall(SYS_pkey_alloc, flags, init_val); + return ret; +} + +static inline unsigned int __read_pkey_reg(void) +{ + unsigned int eax, edx; + unsigned int ecx = 0; + unsigned int pkey_reg; + + asm volatile(".byte 0x0f,0x01,0xee\n\t" + : "=a" (eax), "=d" (edx) + : "c" (ecx)); + pkey_reg = eax; + return pkey_reg; +} + +static inline void __write_pkey_reg(u64 pkey_reg) +{ + unsigned int eax = pkey_reg; + unsigned int ecx = 0; + unsigned int edx = 0; + + asm volatile(".byte 0x0f,0x01,0xef\n\t" + : : "a" (eax), "c" (ecx), "d" (edx)); + assert(pkey_reg == __read_pkey_reg()); +} + +static inline unsigned long pkey_bit_position(int pkey) +{ + return pkey * PKEY_BITS_PER_PKEY; +} + +static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags) +{ + unsigned long shift = pkey_bit_position(pkey); + /* mask out bits from pkey in old value */ + reg &= ~((u64)PKEY_MASK << shift); + /* OR in new bits for pkey */ + reg |= (flags & PKEY_MASK) << shift; + return reg; +} + +static inline void set_pkey(int pkey, unsigned long pkey_value) +{ + unsigned long mask = (PKEY_DISABLE_ACCESS | 
PKEY_DISABLE_WRITE); + u64 new_pkey_reg; + + assert(!(pkey_value & ~mask)); + new_pkey_reg = set_pkey_bits(__read_pkey_reg(), pkey, pkey_value); + __write_pkey_reg(new_pkey_reg); +} + +void setup_single_address(int size, void **ptrOut) +{ + void *ptr; + + ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE | MAP_SEALABLE, -1, 0); + assert(ptr != (void *)-1); + *ptrOut = ptr; +} + +void setup_single_address_sealable(int size, void **ptrOut, bool sealable) +{ + void *ptr; + unsigned long mapflags = MAP_ANONYMOUS | MAP_PRIVATE; + + if (sealable) + mapflags |= MAP_SEALABLE; + + ptr = mmap(NULL, size, PROT_READ, mapflags, -1, 0); + assert(ptr != (void *)-1); + *ptrOut = ptr; +} + +void setup_single_address_rw_sealable(int size, void **ptrOut, bool sealable) +{ + void *ptr; + unsigned long mapflags = MAP_ANONYMOUS | MAP_PRIVATE; + + if (sealable) + mapflags |= MAP_SEALABLE; + + ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, mapflags, -1, 0); + assert(ptr != (void *)-1); + *ptrOut = ptr; +} + +void clean_single_address(void *ptr, int size) +{ + int ret; + + ret = munmap(ptr, size); + assert(!ret); +} + +void seal_mprotect_single_address(void *ptr, int size) +{ + int ret; + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); +} + +void seal_discard_ro_anon_single_address(void *ptr, int size) +{ + int ret; + + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); +} + +static void test_seal_addseals(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + /* adding seal one by one */ + + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_addseals_combined(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned 
long size = 4 * page_size; + + setup_single_address(size, &ptr); + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + /* adding multiple seals */ + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE| + MM_SEAL_SEAL); + assert(!ret); + + /* not adding more seal type, so ok. */ + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + + /* not adding more seal type, so ok. */ + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_addseals_reject(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + ret = sys_mseal(ptr, size, MM_SEAL_BASE | MM_SEAL_SEAL); + assert(!ret); + + /* MM_SEAL_SEAL is set, so not allow new seal type . */ + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE | MM_SEAL_SEAL); + assert(ret < 0); +} + +static void test_seal_unmapped_start(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // munmap 2 pages from ptr. + ret = sys_munmap(ptr, 2 * page_size); + assert(!ret); + + // mprotect will fail because 2 pages from ptr are unmapped. + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); + + // mseal will fail because 2 pages from ptr are unmapped. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); + + ret = sys_mseal(ptr + 2 * page_size, 2 * page_size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_unmapped_middle(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // munmap 2 pages from ptr + page. + ret = sys_munmap(ptr + page_size, 2 * page_size); + assert(!ret); + + // mprotect will fail, since size is 4 pages. 
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); + + // mseal will fail as well. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); + + /* we still can add seal to the first page and last page*/ + ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); + + ret = sys_mseal(ptr + 3 * page_size, page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_unmapped_end(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // unmap last 2 pages. + ret = sys_munmap(ptr + 2 * page_size, 2 * page_size); + assert(!ret); + + //mprotect will fail since last 2 pages are unmapped. + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); + + //mseal will fail as well. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); + + /* The first 2 pages is not sealed, and can add seals */ + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_multiple_vmas(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // use mprotect to split the vma into 3. + ret = sys_mprotect(ptr + page_size, 2 * page_size, + PROT_READ | PROT_WRITE); + assert(!ret); + + // mprotect will get applied to all 4 pages - 3 VMAs. + ret = sys_mprotect(ptr, size, PROT_READ); + assert(!ret); + + // use mprotect to split the vma into 3. + ret = sys_mprotect(ptr + page_size, 2 * page_size, + PROT_READ | PROT_WRITE); + assert(!ret); + + // mseal get applied to all 4 pages - 3 VMAs. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); + + // verify additional seal type will fail after MM_SEAL_SEAL set. 
+ ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + ret = sys_mseal(ptr + page_size, 2 * page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + ret = sys_mseal(ptr + 3 * page_size, page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); +} + +static void test_seal_split_start(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + /* use mprotect to split at middle */ + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + /* seal the first page, this will split the VMA */ + ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL); + assert(!ret); + + /* can't add seal to the first page */ + ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + /* add seal to the remain 3 pages */ + ret = sys_mseal(ptr + page_size, 3 * page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_split_end(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + /* use mprotect to split at middle */ + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + /* seal the last page */ + ret = sys_mseal(ptr + 3 * page_size, page_size, MM_SEAL_SEAL); + assert(!ret); + + /* adding seal to the last page is rejected. 
*/ + ret = sys_mseal(ptr + 3 * page_size, page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + /* Adding seals to the first 3 pages */ + ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_invalid_input(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(8 * page_size, &ptr); + clean_single_address(ptr + 4 * page_size, 4 * page_size); + + /* invalid flag */ + ret = sys_mseal(ptr, size, 0x20); + assert(ret < 0); + + ret = sys_mseal(ptr, size, 0x31); + assert(ret < 0); + + ret = sys_mseal(ptr, size, 0x3F); + assert(ret < 0); + + /* unaligned address */ + ret = sys_mseal(ptr + 1, 2 * page_size, MM_SEAL_SEAL); + assert(ret < 0); + + /* length too big */ + ret = sys_mseal(ptr, 5 * page_size, MM_SEAL_SEAL); + assert(ret < 0); + + /* start is not in a valid VMA */ + ret = sys_mseal(ptr - page_size, 5 * page_size, MM_SEAL_SEAL); + assert(ret < 0); +} + +static void test_seal_zero_length(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + ret = sys_mprotect(ptr, 0, PROT_READ | PROT_WRITE); + assert(!ret); + + /* seal 0 length will be OK, same as mprotect */ + ret = sys_mseal(ptr, 0, MM_SEAL_PROT_PKEY); + assert(!ret); + + // verify the 4 pages are not sealed by previous call. + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_twice(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + // apply the same seal will be OK. idempotent. 
+ ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE | + MM_SEAL_SEAL); + assert(!ret); + + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE | + MM_SEAL_SEAL); + assert(!ret); + + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr, size); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_start_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr, page_size); + + // the first page is sealed. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // pages after the first page is not sealed. 
+ ret = sys_mprotect(ptr + page_size, page_size * 3, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_end_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr + page_size, 3 * page_size); + + /* first page is not sealed */ + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + /* last 3 page are sealed */ + ret = sys_mprotect(ptr + page_size, page_size * 3, + PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_unalign_len(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr, page_size * 2 - 1); + + // 2 pages are sealed. + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + page_size * 2, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_mprotect_unalign_len_variant_2(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + if (seal) + seal_mprotect_single_address(ptr, page_size * 2 + 1); + + // 3 pages are sealed. 
+ ret = sys_mprotect(ptr, page_size * 3, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + page_size * 3, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_mprotect_two_vma(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + /* use mprotect to split */ + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + assert(!ret); + + if (seal) + seal_mprotect_single_address(ptr, page_size * 4); + + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + page_size * 2, page_size * 2, + PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_two_vma_with_split(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // use mprotect to split as two vma. + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + assert(!ret); + + // mseal can apply across 2 vma, also split them. + if (seal) + seal_mprotect_single_address(ptr + page_size, page_size * 2); + + // the first page is not sealed. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // the second page is sealed. + ret = sys_mprotect(ptr + page_size, page_size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // the third page is sealed. + ret = sys_mprotect(ptr + 2 * page_size, page_size, + PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // the fouth page is not sealed. 
+ ret = sys_mprotect(ptr + 3 * page_size, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_mprotect_partial_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // seal one page. + if (seal) + seal_mprotect_single_address(ptr, page_size); + + // mprotect first 2 page will fail, since the first page are sealed. + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_two_vma_with_gap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // use mprotect to split. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // use mprotect to split. + ret = sys_mprotect(ptr + 3 * page_size, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); + + // use munmap to free two pages in the middle + ret = sys_munmap(ptr + page_size, 2 * page_size); + assert(!ret); + + // mprotect will fail, because there is a gap in the address. + // notes, internally mprotect still updated the first page. + ret = sys_mprotect(ptr, 4 * page_size, PROT_READ); + assert(ret < 0); + + // mseal will fail as well. + ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_PROT_PKEY); + assert(ret < 0); + + // unlike mprotect, the first page is not sealed. + ret = sys_mprotect(ptr, page_size, PROT_READ); + assert(ret == 0); + + // the last page is not sealed. + ret = sys_mprotect(ptr + 3 * page_size, page_size, PROT_READ); + assert(ret == 0); +} + +static void test_seal_mprotect_split(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + //use mprotect to split. 
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // seal all 4 pages. + if (seal) { + ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + } + + // madvise is OK. + ret = sys_madvise(ptr, page_size * 2, MADV_WILLNEED); + assert(!ret); + + // mprotect is sealed. + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_merge(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // use mprotect to split one page. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // seal first two pages. + if (seal) { + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + } + + ret = sys_madvise(ptr, page_size, MADV_WILLNEED); + assert(!ret); + + // 2 pages are sealed. + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // last 2 pages are not sealed. + ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ); + assert(ret == 0); +} + +static void test_seal_munmap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // 4 pages are sealed.
+ ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +/* + * allocate 4 pages, + * use mprotect to split it as two VMAs + * seal the whole range + * munmap will fail on both + */ +static void test_seal_munmap_two_vma(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + /* use mprotect to split */ + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + assert(!ret); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + ret = sys_munmap(ptr, page_size * 2); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_munmap(ptr + page_size, page_size * 2); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +/* + * allocate a VMA with 4 pages. + * munmap the middle 2 pages. + * seal the whole 4 pages, will fail. + * note: one of the pages are sealed + * munmap the first page will be OK. + * munmap the last page will be OK. + */ +static void test_seal_munmap_vma_with_gap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + ret = sys_munmap(ptr + page_size, page_size * 2); + assert(!ret); + + if (seal) { + // can't have gap in the middle. + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(ret < 0); + } + + ret = sys_munmap(ptr, page_size); + assert(!ret); + + ret = sys_munmap(ptr + page_size * 2, page_size); + assert(!ret); + + ret = sys_munmap(ptr, size); + assert(!ret); +} + +static void test_munmap_start_freed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // unmap the first page. + ret = sys_munmap(ptr, page_size); + assert(!ret); + + // seal the last 3 pages. 
+ if (seal) { + ret = sys_mseal(ptr + page_size, 3 * page_size, MM_SEAL_BASE); + assert(!ret); + } + + // unmap from the first page. + ret = sys_munmap(ptr, size); + if (seal) { + assert(ret < 0); + + // use mprotect to verify page is not unmapped. + ret = sys_mprotect(ptr + page_size, 3 * page_size, PROT_READ); + assert(!ret); + } else + // note: this will be OK, even the first page is + // already unmapped. + assert(!ret); +} + +static void test_munmap_end_freed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + // unmap last page. + ret = sys_munmap(ptr + page_size * 3, page_size); + assert(!ret); + + // seal the first 3 pages. + if (seal) { + ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_BASE); + assert(!ret); + } + + // unmap all pages. + ret = sys_munmap(ptr, size); + if (seal) { + assert(ret < 0); + + // use mprotect to verify page is not unmapped. + ret = sys_mprotect(ptr, 3 * page_size, PROT_READ); + assert(!ret); + } else + assert(!ret); +} + +static void test_munmap_middle_freed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + // unmap 2 pages in the middle. + ret = sys_munmap(ptr + page_size, page_size * 2); + assert(!ret); + + // seal the first page. + if (seal) { + ret = sys_mseal(ptr, page_size, MM_SEAL_BASE); + assert(!ret); + } + + // munmap all 4 pages. + ret = sys_munmap(ptr, size); + if (seal) { + assert(ret < 0); + + // use mprotect to verify page is not unmapped. 
+		ret = sys_mprotect(ptr, page_size, PROT_READ);
+		assert(!ret);
+	} else
+		assert(!ret);
+}
+
+void test_seal_mremap_shrink(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+	int ret;
+	void *ret2;
+
+	setup_single_address(size, &ptr);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_BASE);
+		assert(!ret);
+	}
+
+	// shrink from 4 pages to 2 pages.
+	ret2 = mremap(ptr, size, 2 * page_size, 0, 0);
+	if (seal) {
+		assert(ret2 == MAP_FAILED);
+		assert(errno == EACCES);
+	} else {
+		assert(ret2 != MAP_FAILED);
+	}
+}
+
+void test_seal_mremap_expand(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+	int ret;
+	void *ret2;
+
+	setup_single_address(size, &ptr);
+	// unmap last 2 pages.
+	ret = sys_munmap(ptr + 2 * page_size, 2 * page_size);
+	assert(!ret);
+
+	if (seal) {
+		ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_BASE);
+		assert(!ret);
+	}
+
+	// expand from 2 pages to 4 pages.
+	ret2 = mremap(ptr, 2 * page_size, 4 * page_size, 0, 0);
+	if (seal) {
+		assert(ret2 == MAP_FAILED);
+		assert(errno == EACCES);
+	} else {
+		assert(ret2 == ptr);
+	}
+}
+
+void test_seal_mremap_move(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr, *newPtr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = page_size;
+	int ret;
+	void *ret2;
+
+	setup_single_address(size, &ptr);
+	setup_single_address(size, &newPtr);
+	clean_single_address(newPtr, size);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_BASE);
+		assert(!ret);
+	}
+
+	// move from ptr to fixed address.
+	ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, newPtr);
+	if (seal) {
+		assert(ret2 == MAP_FAILED);
+		assert(errno == EACCES);
+	} else {
+		assert(ret2 != MAP_FAILED);
+	}
+}
+
+void test_seal_mmap_overwrite_prot(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = page_size;
+	int ret;
+	void *ret2;
+
+	setup_single_address(size, &ptr);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_BASE);
+		assert(!ret);
+	}
+
+	// use mmap to change protection.
+	ret2 = mmap(ptr, size, PROT_NONE,
+		    MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
+	if (seal) {
+		assert(ret2 == MAP_FAILED);
+		assert(errno == EACCES);
+	} else
+		assert(ret2 == ptr);
+}
+
+void test_seal_mmap_expand(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 12 * page_size;
+	int ret;
+	void *ret2;
+
+	setup_single_address(size, &ptr);
+	// unmap last 4 pages.
+	ret = sys_munmap(ptr + 8 * page_size, 4 * page_size);
+	assert(!ret);
+
+	if (seal) {
+		ret = sys_mseal(ptr, 8 * page_size, MM_SEAL_BASE);
+		assert(!ret);
+	}
+
+	// use mmap to expand.
+	ret2 = mmap(ptr, size, PROT_READ,
+		    MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
+	if (seal) {
+		assert(ret2 == MAP_FAILED);
+		assert(errno == EACCES);
+	} else
+		assert(ret2 == ptr);
+}
+
+void test_seal_mmap_shrink(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 12 * page_size;
+	int ret;
+	void *ret2;
+
+	setup_single_address(size, &ptr);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_BASE);
+		assert(!ret);
+	}
+
+	// use mmap to shrink.
+ ret2 = mmap(ptr, 8 * page_size, PROT_READ, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == ptr); +} + +void test_seal_mremap_shrink_fixed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + setup_single_address(size, &newAddr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move and shrink to fixed address + ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED, + newAddr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == newAddr); +} + +void test_seal_mremap_expand_fixed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(page_size, &ptr); + setup_single_address(size, &newAddr); + + if (seal) { + ret = sys_mseal(newAddr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move and expand to fixed address + ret2 = mremap(ptr, page_size, size, MREMAP_MAYMOVE | MREMAP_FIXED, + newAddr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == newAddr); +} + +void test_seal_mremap_move_fixed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + setup_single_address(size, &newAddr); + + if (seal) { + ret = sys_mseal(newAddr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move to fixed address + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, newAddr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + 
assert(ret2 == newAddr); +} + +void test_seal_mremap_move_fixed_zero(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + /* + * MREMAP_FIXED can move the mapping to zero address + */ + ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED, + 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 == 0); + + } +} + +void test_seal_mremap_move_dontunmap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move, and don't unmap src addr. + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 != MAP_FAILED); + + } +} + +void test_seal_mremap_move_dontunmap_anyaddr(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + /* + * The 0xdeaddead should not have effect on dest addr + * when MREMAP_DONTUNMAP is set. 
+ */ + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + 0xdeaddead); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 != MAP_FAILED); + assert((long)ret2 != 0xdeaddead); + + } +} + +unsigned long get_vma_size(void *addr) +{ + FILE *maps; + char line[256]; + int size = 0; + uintptr_t addr_start, addr_end; + + maps = fopen("/proc/self/maps", "r"); + if (!maps) + return 0; + + while (fgets(line, sizeof(line), maps)) { + if (sscanf(line, "%lx-%lx", &addr_start, &addr_end) == 2) { + if (addr_start == (uintptr_t) addr) { + size = addr_end - addr_start; + break; + } + } + } + fclose(maps); + return size; +} + +void test_seal_mmap_seal_base(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + ptr = mmap(NULL, size, PROT_READ | PROT_SEAL_BASE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_munmap(ptr, size); + assert(ret < 0); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(!ret); + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + ret = sys_mprotect(ptr, size, PROT_READ); + assert(ret < 0); +} + +void test_seal_mmap_seal_mprotect(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + ptr = mmap(NULL, size, PROT_READ | PROT_SEAL_PROT_PKEY, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_munmap(ptr, size); + assert(ret < 0); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); +} + +void test_seal_mmap_seal_mseal(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + ptr = mmap(NULL, size, PROT_READ | PROT_SEAL_SEAL, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = 
sys_mseal(ptr, size, MM_SEAL_BASE); + assert(ret < 0); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(!ret); + + ret = sys_munmap(ptr, size); + assert(!ret); +} + +void test_seal_merge_and_split(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size; + int ret; + void *ret2; + + // (24 RO) + setup_single_address(24 * page_size, &ptr); + + // use mprotect(NONE) to set out boundary + // (1 NONE) (22 RO) (1 NONE) + ret = sys_mprotect(ptr, page_size, PROT_NONE); + assert(!ret); + ret = sys_mprotect(ptr + 23 * page_size, page_size, PROT_NONE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 22 * page_size); + + // use mseal to split from beginning + // (1 NONE) (1 RO_SBASE) (21 RO) (1 NONE) + ret = sys_mseal(ptr + page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == page_size); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 21 * page_size); + + // use mseal to split from the end. + // (1 NONE) (1 RO_SBASE) (20 RO) (1 RO_SBASE) (1 NONE) + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + 22 * page_size); + assert(size == page_size); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 20 * page_size); + + // merge with prev. + // (1 NONE) (2 RO_SBASE) (19 RO) (1 RO_SBASE) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 2 * page_size); + + // merge with after. 
+ // (1 NONE) (2 RO_SBASE) (18 RO) (2 RO_SBASES) (1 NONE) + ret = sys_mseal(ptr + 21 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + 21 * page_size); + assert(size == 2 * page_size); + + // split from prev + // (1 NONE) (1 RO_SBASE) (2RO_SPROT) (17 RO) (2 RO_SBASES) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, 2 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 2 * page_size); + ret = sys_munmap(ptr + page_size, page_size); + assert(ret < 0); + ret = sys_mprotect(ptr + 2 * page_size, page_size, PROT_NONE); + assert(ret < 0); + + // split from next + // (1 NONE) (1 RO_SBASE) (2 RO_SPROT) (16 RO) (2 RO_SPROT) (1 RO_SBASES) (1 NONE) + ret = sys_mseal(ptr + 20 * page_size, 2 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + size = get_vma_size(ptr + 20 * page_size); + assert(size == 2 * page_size); + + // merge from middle of prev and middle of next. + // (1 NONE) (1 RW_SBASE) (20 RO_SPROT) (1 RW_SBASES) (1 NONE) + ret = sys_mseal(ptr + 3 * page_size, 18 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 20 * page_size); + + size = get_vma_size(ptr + 22 * page_size); + assert(size == page_size); + + size = get_vma_size(ptr + 23 * page_size); + assert(size == page_size); + + // Add split using SEAL_ALL + // (1 NONE) (1 RW_SBASE) (1 RO_SALL) (18 RO_SPROT) (1 RO_SALL) (1 RW_SBASES) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, page_size, + MM_SEAL_PROT_PKEY | MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 1 * page_size); + + ret = sys_mseal(ptr + 21 * page_size, page_size, + MM_SEAL_PROT_PKEY | MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + 21 * page_size); + assert(size == 1 * page_size); + + // add a new seal type, and merge with next + // (1 NONE) (2 RO_SALL) (18 RO_SPROT) (2 RO_SALL) (1 NONE) + ret = sys_mprotect(ptr + page_size, 
page_size, PROT_READ); + assert(!ret); + ret = sys_mseal(ptr + page_size, page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + ret = sys_mseal(ptr + page_size, page_size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 2 * page_size); + + ret = sys_mprotect(ptr + 22 * page_size, page_size, PROT_READ); + assert(!ret); + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 2 * page_size); +} + +void test_seal_mmap_merge(void) +{ + LOG_TEST_ENTER(); + + void *ptr, *ptr2; + unsigned long page_size = getpagesize(); + unsigned long size; + int ret; + void *ret2; + + // (24 RO) + setup_single_address(24 * page_size, &ptr); + + // use mprotect(NONE) to set out boundary + // (1 NONE) (22 RO) (1 NONE) + ret = sys_mprotect(ptr, page_size, PROT_NONE); + assert(!ret); + ret = sys_mprotect(ptr + 23 * page_size, page_size, PROT_NONE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 22 * page_size); + + // use munmap to free 2 segment of memory. + // (1 NONE) (1 free) (20 RO) (1 free) (1 NONE) + ret = sys_munmap(ptr + page_size, page_size); + assert(!ret); + + ret = sys_munmap(ptr + 22 * page_size, page_size); + assert(!ret); + + // apply seal to the middle + // (1 NONE) (1 free) (20 RO_SBASE) (1 free) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, 20 * page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 20 * page_size); + + // allocate a mapping at beginning, and make sure it merges. 
+	// (1 NONE) (21 RO_SBASE) (1 free) (1 NONE)
+	ptr2 = mmap(ptr + page_size, page_size, PROT_READ | PROT_SEAL_BASE,
+		    MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	assert(ptr2 != (void *)-1);
+	size = get_vma_size(ptr + page_size);
+	assert(size == 21 * page_size);
+
+	// allocate a mapping at end, and make sure it merges.
+	// (1 NONE) (22 RO_SBASE) (1 NONE)
+	ptr2 = mmap(ptr + 22 * page_size, page_size, PROT_READ | PROT_SEAL_BASE,
+		    MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	assert(ptr2 != (void *)-1);
+	size = get_vma_size(ptr + page_size);
+	assert(size == 22 * page_size);
+}
+
+static void test_not_sealable(void)
+{
+	int ret;
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+
+	ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+	assert(ptr != (void *)-1);
+
+	ret = sys_mseal(ptr, size, MM_SEAL_SEAL);
+	assert(ret < 0);
+}
+
+static void test_merge_sealable(void)
+{
+	int ret;
+	void *ptr, *ptr2;
+	unsigned long page_size = getpagesize();
+	unsigned long size;
+
+	// (24 RO)
+	setup_single_address(24 * page_size, &ptr);
+
+	// use mprotect(NONE) to set out the boundary.
+	// (1 NONE) (22 RO) (1 NONE)
+	ret = sys_mprotect(ptr, page_size, PROT_NONE);
+	assert(!ret);
+	ret = sys_mprotect(ptr + 23 * page_size, page_size, PROT_NONE);
+	assert(!ret);
+	size = get_vma_size(ptr + page_size);
+	assert(size == 22 * page_size);
+
+	// (1 NONE) (RO) (4 free) (17 RO) (1 NONE)
+	ret = sys_munmap(ptr + 2 * page_size, 4 * page_size);
+	assert(!ret);
+	size = get_vma_size(ptr + page_size);
+	assert(size == 1 * page_size);
+	size = get_vma_size(ptr + 6 * page_size);
+	assert(size == 17 * page_size);
+
+	// (1 NONE) (RO) (1 free) (2 RO) (1 free) (17 RO) (1 NONE)
+	ptr2 = mmap(ptr + 3 * page_size, 2 * page_size, PROT_READ,
+		    MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE | MAP_SEALABLE, -1, 0);
+	assert(ptr2 != (void *)-1);
+	size = get_vma_size(ptr + 3 * page_size);
+	assert(size == 2 * page_size);
+
+	// (1 NONE) (RO) (1 free) (20 RO) (1 NONE)
+	ptr2 = mmap(ptr + 5 * page_size, 1 * page_size, PROT_READ,
+		    MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE | MAP_SEALABLE, -1, 0);
+	assert(ptr2 != (void *)-1);
+	size = get_vma_size(ptr + 3 * page_size);
+	assert(size == 20 * page_size);
+
+	// (1 NONE) (RO) (1 free) (19 RO) (1 RO_SB) (1 NONE)
+	ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_BASE);
+	assert(!ret);
+
+	// (1 NONE) (RO) (not sealable) (19 RO) (1 RO_SB) (1 NONE)
+	ptr2 = mmap(ptr + 2 * page_size, page_size, PROT_READ,
+		    MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+	assert(ptr2 != (void *)-1);
+	size = get_vma_size(ptr + page_size);
+	assert(size == page_size);
+	size = get_vma_size(ptr + 2 * page_size);
+	assert(size == page_size);
+}
+
+static void test_seal_discard_ro_anon_on_rw(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+	int ret;
+
+	setup_single_address_rw_sealable(size, &ptr, seal);
+	assert(ptr != (void *)-1);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON);
+		assert(!ret);
+	}
+
+	// sealing doesn't take effect on RW memory.
+	ret = sys_madvise(ptr, size, MADV_DONTNEED);
+	assert(!ret);
+
+	// the base seal still applies.
+	ret = sys_munmap(ptr, size);
+	if (seal)
+		assert(ret < 0);
+	else
+		assert(!ret);
+}
+
+static void test_seal_discard_ro_anon_on_pkey(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+	int ret;
+	int pkey;
+
+	setup_single_address_rw_sealable(size, &ptr, seal);
+	assert(ptr != (void *)-1);
+
+	pkey = sys_pkey_alloc(0, 0);
+	assert(pkey > 0);
+
+	ret = sys_mprotect_pkey((void *)ptr, size, PROT_READ | PROT_WRITE, pkey);
+	assert(!ret);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON);
+		assert(!ret);
+	}
+
+	// sealing doesn't take effect if PKRU allows write.
+	set_pkey(pkey, 0);
+	ret = sys_madvise(ptr, size, MADV_DONTNEED);
+	assert(!ret);
+
+	// sealing will take effect if PKRU denies write.
+	set_pkey(pkey, PKEY_DISABLE_WRITE);
+	ret = sys_madvise(ptr, size, MADV_DONTNEED);
+	if (seal)
+		assert(ret < 0);
+	else
+		assert(!ret);
+
+	// the base seal still applies.
+	ret = sys_munmap(ptr, size);
+	if (seal)
+		assert(ret < 0);
+	else
+		assert(!ret);
+}
+
+static void test_seal_discard_ro_anon_on_filebacked(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+	int ret;
+	int fd;
+	unsigned long mapflags = MAP_PRIVATE;
+
+	if (seal)
+		mapflags |= MAP_SEALABLE;
+
+	fd = memfd_create("test", 0);
+	assert(fd > 0);
+
+	ret = fallocate(fd, 0, 0, size);
+	assert(!ret);
+
+	ptr = mmap(NULL, size, PROT_READ, mapflags, fd, 0);
+	assert(ptr != MAP_FAILED);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON);
+		assert(!ret);
+	}
+
+	// sealing doesn't apply to file-backed mappings.
+	ret = sys_madvise(ptr, size, MADV_DONTNEED);
+	assert(!ret);
+
+	ret = sys_munmap(ptr, size);
+	if (seal)
+		assert(ret < 0);
+	else
+		assert(!ret);
+	close(fd);
+}
+
+static void test_seal_discard_ro_anon_on_shared(bool seal)
+{
+	LOG_TEST_ENTER();
+	void *ptr;
+	unsigned long page_size = getpagesize();
+	unsigned long size = 4 * page_size;
+	int ret;
+	unsigned long mapflags = MAP_ANONYMOUS | MAP_SHARED;
+
+	if (seal)
+		mapflags |= MAP_SEALABLE;
+
+	ptr = mmap(NULL, size, PROT_READ, mapflags, -1, 0);
+	assert(ptr != (void *)-1);
+
+	if (seal) {
+		ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON);
+		assert(!ret);
+	}
+
+	// sealing doesn't apply to shared mappings.
+ ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_discard_ro_anon_invalid_shared(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + int fd; + + fd = open("/proc/self/maps", O_RDONLY); + ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, fd, 0); + assert(ptr != (void *)-1); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + } + + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + ret = sys_munmap(ptr, size); + assert(ret < 0); + close(fd); +} + +static void test_seal_discard_ro_anon(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_discard_ro_anon_single_address(ptr, size); + + ret = sys_madvise(ptr, size, MADV_DONTNEED); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_mmap_seal_discard_ro_anon(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + ptr = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_SEAL_DISCARD_RO_ANON, + MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_mprotect(ptr, size, PROT_READ); + assert(!ret); + + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(ret < 0); + + ret = sys_munmap(ptr, size); + assert(ret < 0); +} + +bool seal_support(void) +{ + void *ptr; + unsigned long page_size = getpagesize(); + + ptr = mmap(NULL, page_size, PROT_READ | PROT_SEAL_BASE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + if (ptr == (void *) -1) + return false; + return true; +} + +bool pkey_supported(void) +{ + int 
pkey = sys_pkey_alloc(0, 0); + + if (pkey > 0) + return true; + return false; +} + +int main(int argc, char **argv) +{ + bool test_seal = seal_support(); + + if (!test_seal) { + ksft_print_msg("%s CONFIG_MSEAL might be disabled, skip test\n", __func__); + return 0; + } + + test_seal_invalid_input(); + test_seal_addseals(); + test_seal_addseals_combined(); + test_seal_addseals_reject(); + test_seal_unmapped_start(); + test_seal_unmapped_middle(); + test_seal_unmapped_end(); + test_seal_multiple_vmas(); + test_seal_split_start(); + test_seal_split_end(); + + test_seal_zero_length(); + test_seal_twice(); + + test_seal_mprotect(false); + test_seal_mprotect(true); + + test_seal_start_mprotect(false); + test_seal_start_mprotect(true); + + test_seal_end_mprotect(false); + test_seal_end_mprotect(true); + + test_seal_mprotect_unalign_len(false); + test_seal_mprotect_unalign_len(true); + + test_seal_mprotect_unalign_len_variant_2(false); + test_seal_mprotect_unalign_len_variant_2(true); + + test_seal_mprotect_two_vma(false); + test_seal_mprotect_two_vma(true); + + test_seal_mprotect_two_vma_with_split(false); + test_seal_mprotect_two_vma_with_split(true); + + test_seal_mprotect_partial_mprotect(false); + test_seal_mprotect_partial_mprotect(true); + + test_seal_mprotect_two_vma_with_gap(false); + test_seal_mprotect_two_vma_with_gap(true); + + test_seal_mprotect_merge(false); + test_seal_mprotect_merge(true); + + test_seal_mprotect_split(false); + test_seal_mprotect_split(true); + + test_seal_munmap(false); + test_seal_munmap(true); + test_seal_munmap_two_vma(false); + test_seal_munmap_two_vma(true); + test_seal_munmap_vma_with_gap(false); + test_seal_munmap_vma_with_gap(true); + + test_munmap_start_freed(false); + test_munmap_start_freed(true); + test_munmap_middle_freed(false); + test_munmap_middle_freed(true); + test_munmap_end_freed(false); + test_munmap_end_freed(true); + + test_seal_mremap_shrink(false); + test_seal_mremap_shrink(true); + test_seal_mremap_expand(false); 
+	test_seal_mremap_expand(true);
+	test_seal_mremap_move(false);
+	test_seal_mremap_move(true);
+
+	test_seal_mremap_shrink_fixed(false);
+	test_seal_mremap_shrink_fixed(true);
+	test_seal_mremap_expand_fixed(false);
+	test_seal_mremap_expand_fixed(true);
+	test_seal_mremap_move_fixed(false);
+	test_seal_mremap_move_fixed(true);
+	test_seal_mremap_move_dontunmap(false);
+	test_seal_mremap_move_dontunmap(true);
+	test_seal_mremap_move_fixed_zero(false);
+	test_seal_mremap_move_fixed_zero(true);
+	test_seal_mremap_move_dontunmap_anyaddr(false);
+	test_seal_mremap_move_dontunmap_anyaddr(true);
+	test_seal_discard_ro_anon(false);
+	test_seal_discard_ro_anon(true);
+	test_seal_discard_ro_anon_on_rw(false);
+	test_seal_discard_ro_anon_on_rw(true);
+	test_seal_discard_ro_anon_on_shared(false);
+	test_seal_discard_ro_anon_on_shared(true);
+	test_seal_discard_ro_anon_on_filebacked(false);
+	test_seal_discard_ro_anon_on_filebacked(true);
+	test_seal_mmap_overwrite_prot(false);
+	test_seal_mmap_overwrite_prot(true);
+	test_seal_mmap_expand(false);
+	test_seal_mmap_expand(true);
+	test_seal_mmap_shrink(false);
+	test_seal_mmap_shrink(true);
+
+	test_seal_mmap_seal_base();
+	test_seal_mmap_seal_mprotect();
+	test_seal_mmap_seal_mseal();
+	test_mmap_seal_discard_ro_anon();
+	test_seal_merge_and_split();
+	test_seal_mmap_merge();
+
+	test_not_sealable();
+	test_merge_sealable();
+
+	if (pkey_supported()) {
+		test_seal_discard_ro_anon_on_pkey(false);
+		test_seal_discard_ro_anon_on_pkey(true);
+	}
+
+	ksft_print_msg("Done\n");
+	return 0;
+}

From patchwork Tue Dec 12 23:17:05 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 177662
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
	sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org,
	torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com,
	linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 11/11] mseal:add documentation
Date: Tue, 12 Dec 2023 23:17:05 +0000
Message-ID: <20231212231706.2680890-12-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Add documentation for mseal().
Signed-off-by: Jeff Xu
---
 Documentation/userspace-api/mseal.rst | 189 ++++++++++++++++++++++++++
 1 file changed, 189 insertions(+)
 create mode 100644 Documentation/userspace-api/mseal.rst

diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
new file mode 100644
index 000000000000..651c618d0664
--- /dev/null
+++ b/Documentation/userspace-api/mseal.rst
@@ -0,0 +1,189 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+Introduction of mseal
+=====================
+
+:Author: Jeff Xu
+
+Modern CPUs support memory permissions such as RW and NX bits. Memory
+permissions improve the security stance against memory corruption bugs:
+an attacker can't just write to arbitrary memory and point the code to
+it, because the memory has to be marked with the X bit, or else an
+exception will be raised.
+
+Memory sealing additionally protects the mapping itself against
+modifications. This is useful to mitigate memory corruption issues
+where a corrupted pointer is passed to a memory management system. For
+example, such an attacker primitive can break control-flow integrity
+guarantees, since read-only memory that is supposed to be trusted can
+become writable, or .text pages can get remapped. Memory sealing can be
+applied automatically by the runtime loader to seal .text and .rodata
+pages, and applications can additionally seal security-critical data at
+runtime.
+
+A similar feature already exists in the XNU kernel with the
+VM_FLAGS_PERMANENT flag [1] and on OpenBSD with the mimmutable
+syscall [2].
+
+User API
+========
+Two system calls are involved in virtual memory sealing: ``mseal()``
+and ``mmap()``.
+
+``mseal()``
+-----------
+
+``mseal()`` is an architecture-independent syscall with the following
+signature:
+
+``int mseal(void *addr, size_t len, unsigned long types, unsigned long flags)``
+
+**addr/len**: virtual memory address range.
+
+The address range set by ``addr``/``len`` must meet:
+   - start (addr) must be in a valid VMA.
+   - end (addr + len) must be in a valid VMA.
+   - no gap (unallocated memory) between start and end.
+   - start (addr) must be page aligned.
+
+``len`` will be page aligned implicitly by the kernel.
+
+**types**: bit mask to specify the sealing types, which are:
+
+- ``MM_SEAL_BASE``: Prevents the VMA from:
+
+  Being unmapped, moved, or shrunk via munmap() and mremap(); such
+  operations can leave an empty space that can then be replaced by a
+  VMA with a new set of attributes.
+
+  Having a different VMA moved or expanded into its location via
+  mremap().
+
+  Being modified via mmap(MAP_FIXED).
+
+  Being expanded via mremap(). Size expansion does not appear to pose
+  any specific risk to sealed VMAs, but it is included because the use
+  case is unclear. In any case, users can rely on VMA merging to expand
+  a sealed VMA.
+
+  MM_SEAL_BASE is the base feature on which the other sealing features
+  depend. For instance, it probably does not make sense to seal
+  PROT_PKEY without sealing the base, so the kernel implicitly adds
+  SEAL_BASE for SEAL_PROT_PKEY. (If applications want to relax this in
+  the future, the ``flags`` field of mseal() could be used to override
+  this behavior.)
+
+- ``MM_SEAL_PROT_PKEY``:
+
+  Seals the PROT and PKEY of the address range; in other words,
+  mprotect() and pkey_mprotect() will be denied if the memory is
+  sealed with MM_SEAL_PROT_PKEY.
+
+- ``MM_SEAL_DISCARD_RO_ANON``:
+
+  Certain types of madvise() operations are destructive [3], such as
+  MADV_DONTNEED, which can effectively alter region contents by
+  discarding pages, especially for anonymous memory. This seal blocks
+  such operations on anonymous memory that is not writable by the
+  user.
+
+- ``MM_SEAL_SEAL``:
+
+  Denies adding any new seal.
+
+**flags**: reserved for future use.
+
+**return values**:
+
+- ``0``:
+  - Success.
+
+- ``-EINVAL``:
+  - Invalid seal type.
+  - Invalid input flags.
+  - Start address is not page aligned.
+  - The address range (``addr`` + ``len``) overflows.
+
+- ``-ENOMEM``:
+  - ``addr`` is not a valid address (not allocated).
+  - The end address (``addr`` + ``len``) is not a valid address.
+  - There is a gap (unallocated memory) between start and end.
+
+- ``-EACCES``:
+  - ``MM_SEAL_SEAL`` is set, so adding a new seal is not allowed.
+  - The address range is not sealable, e.g. ``MAP_SEALABLE`` was not
+    set during ``mmap()``.
+
+**Note**:
+
+- Users can call mseal(2) multiple times to add new seal types.
+- Adding an already-added seal type is a no-op (no error).
+- unseal(), i.e. removing a seal type, is not supported.
+- If an error is returned, the memory range is left unchanged.
+
+``mmap()``
+----------
+``void *mmap(void *addr, size_t length, int prot, int flags, int fd,
+off_t offset);``
+
+Two changes (``prot`` and ``flags``) were made to ``mmap()`` for memory
+sealing.
+
+**prot**:
+
+- ``PROT_SEAL_SEAL``
+- ``PROT_SEAL_BASE``
+- ``PROT_SEAL_PROT_PKEY``
+- ``PROT_SEAL_DISCARD_RO_ANON``
+
+These allow ``mmap()`` to set the sealing type when creating a mapping.
+This is useful as an optimization because it avoids making two system
+calls: one for ``mmap()`` and one for ``mseal()``.
+
+Note that even though a sealing type is set via the ``prot`` field of
+``mmap()``, it does not need to be set again in the ``prot`` field of a
+later ``mprotect()`` call. This is unlike the ``PROT_READ``,
+``PROT_WRITE`` and ``PROT_EXEC`` bits, where e.g. ``PROT_WRITE`` not
+being set in ``mprotect()`` means the region becomes non-writable.
+
+**flags**:
+
+The ``MAP_SEALABLE`` flag is added to the ``flags`` field of ``mmap()``.
+When present, it marks the mapping as sealable. A mapping created
+without ``MAP_SEALABLE`` does not support sealing; in other words,
+``mseal()`` will fail for such a mapping.
+
+Applications that don't care about sealing will expect their
+behavior unchanged.
+For those that need sealing support, opt in by adding ``MAP_SEALABLE``
+when creating the map.
+
+Use Case:
+=========
+- glibc:
+  The dynamic linker can apply sealing to non-writable memory segments
+  while loading ELF executables.
+
+- Chrome browser: protect some security-sensitive data structures.
+
+Additional notes:
+=================
+As Jann Horn pointed out in [3], there are still a few ways to write
+to RO memory, which is, in a way, by design. Those are not covered by
+``mseal()``. If applications want to block such cases, a sandboxing
+mechanism (such as seccomp, LSM, etc.) might be considered.
+
+Those cases are:
+
+- Writing to read-only memory through the ``/proc/self/mem``
+  interface.
+
+- Writing to read-only memory through ``ptrace`` (such as
+  ``PTRACE_POKETEXT``).
+
+- ``userfaultfd()``.
+
+The idea that inspired this patch comes from Stephen Röttger's work on
+V8 CFI [4]. The Chrome browser on ChromeOS will be the first user of
+this API.
+
+Reference:
+==========
+[1] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274
+
+[2] https://man.openbsd.org/mimmutable.2
+
+[3] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com
+
+[4] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc