From patchwork Fri Dec 2 06:13:40 2022
X-Patchwork-Submitter: Chao Peng
X-Patchwork-Id: 28714
From: Chao Peng
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-api@vger.kernel.org, linux-doc@vger.kernel.org, qemu-devel@nongnu.org
Cc: Paolo Bonzini, Jonathan Corbet, Sean Christopherson, Vitaly Kuznetsov,
 Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Arnd Bergmann, Naoya Horiguchi, Miaohe Lin, x86@kernel.org,
 "H . Peter Anvin", Hugh Dickins, Jeff Layton, "J . Bruce Fields",
 Andrew Morton, Shuah Khan, Mike Rapoport, Steven Price,
 "Maciej S . Szmigiero", Vlastimil Babka, Vishal Annapurve, Yu Zhang,
 Chao Peng, "Kirill A . Shutemov", luto@kernel.org, jun.nakajima@intel.com,
 dave.hansen@intel.com, ak@linux.intel.com, david@redhat.com,
 aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com,
 Quentin Perret, tabba@google.com, Michael Roth, mhocko@suse.com,
 wei.w.wang@intel.com
Subject: [PATCH v10 2/9] KVM: Introduce per-page memory attributes
Date: Fri, 2 Dec 2022 14:13:40 +0800
Message-Id: <20221202061347.1070246-3-chao.p.peng@linux.intel.com>
In-Reply-To: <20221202061347.1070246-1-chao.p.peng@linux.intel.com>
References: <20221202061347.1070246-1-chao.p.peng@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

In confidential computing usages, whether a page is private or shared is
necessary information for KVM to perform operations like page fault
handling, page zapping etc. There are other potential use cases for
per-page memory attributes, e.g. to make memory read-only (or no-exec,
or exec-only, etc.) without having to modify memslots.

Introduce two ioctls (advertised by KVM_CAP_MEMORY_ATTRIBUTES) to allow
userspace to operate on the per-page memory attributes.
  - KVM_SET_MEMORY_ATTRIBUTES to set the per-page memory attributes to
    a guest memory range.
  - KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES to return the KVM supported
    memory attributes.

KVM internally uses xarray to store the per-page memory attributes.

Suggested-by: Sean Christopherson
Signed-off-by: Chao Peng
Link: https://lore.kernel.org/all/Y2WB48kD0J4VGynX@google.com/
Reviewed-by: Fuad Tabba
Tested-by: Fuad Tabba
---
 Documentation/virt/kvm/api.rst | 63 ++++++++++++++++++++++++++++
 arch/x86/kvm/Kconfig           |  1 +
 include/linux/kvm_host.h       |  3 ++
 include/uapi/linux/kvm.h       | 17 ++++++++
 virt/kvm/Kconfig               |  3 ++
 virt/kvm/kvm_main.c            | 76 ++++++++++++++++++++++++++++++++++
 6 files changed, 163 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 5617bc4f899f..bb2f709c0900 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5952,6 +5952,59 @@ delivery must be provided via the "reg_aen" struct.
 The "pad" and "reserved" fields may be used for future extensions and should be
 set to 0s by userspace.
 
+4.138 KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES
+-----------------------------------------
+
+:Capability: KVM_CAP_MEMORY_ATTRIBUTES
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: u64 memory attributes bitmask(out)
+:Returns: 0 on success, <0 on error
+
+Returns supported memory attributes bitmask. Supported memory attributes will
+have the corresponding bits set in u64 memory attributes bitmask.
+
+The following memory attributes are defined::
+
+  #define KVM_MEMORY_ATTRIBUTE_READ              (1ULL << 0)
+  #define KVM_MEMORY_ATTRIBUTE_WRITE             (1ULL << 1)
+  #define KVM_MEMORY_ATTRIBUTE_EXECUTE           (1ULL << 2)
+  #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
+
+4.139 KVM_SET_MEMORY_ATTRIBUTES
+-----------------------------------------
+
+:Capability: KVM_CAP_MEMORY_ATTRIBUTES
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_memory_attributes(in/out)
+:Returns: 0 on success, <0 on error
+
+Sets memory attributes for pages in a guest memory range. Parameters are
+specified via the following structure::
+
+  struct kvm_memory_attributes {
+	__u64 address;
+	__u64 size;
+	__u64 attributes;
+	__u64 flags;
+  };
+
+The user sets the per-page memory attributes to a guest memory range indicated
+by address/size, and in return KVM adjusts address and size to reflect the
+actual pages of the memory range have been successfully set to the attributes.
+If the call returns 0, "address" is updated to the last successful address + 1
+and "size" is updated to the remaining address size that has not been set
+successfully. The user should check the return value as well as the size to
+decide if the operation succeeded for the whole range or not. The user may want
+to retry the operation with the returned address/size if the previous range was
+partially successful.
+
+Both address and size should be page aligned and the supported attributes can be
+retrieved with KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES.
+
+The "flags" field may be used for future extensions and should be set to 0s.
+
 5. The kvm_run structure
 ========================
 
@@ -8270,6 +8323,16 @@ structure.
 When getting the Modified Change Topology Report value, the attr->addr
 must point to a byte where the value will be stored or retrieved from.
 
+8.40 KVM_CAP_MEMORY_ATTRIBUTES
+------------------------------
+
+:Capability: KVM_CAP_MEMORY_ATTRIBUTES
+:Architectures: x86
+:Type: vm
+
+This capability indicates KVM supports per-page memory attributes and ioctls
+KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES/KVM_SET_MEMORY_ATTRIBUTES are available.
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index fbeaa9ddef59..a8e379a3afee 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -49,6 +49,7 @@ config KVM
 	select SRCU
 	select INTERVAL_TREE
 	select HAVE_KVM_PM_NOTIFIER if PM
+	select HAVE_KVM_MEMORY_ATTRIBUTES
 	help
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions. You will need a fairly recent
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8f874a964313..a784e2b06625 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -800,6 +800,9 @@ struct kvm {
 
 #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER
 	struct notifier_block pm_notifier;
+#endif
+#ifdef CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES
+	struct xarray mem_attr_array;
 #endif
 	char stats_id[KVM_STATS_NAME_SIZE];
 };
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 64dfe9c07c87..5d0941acb5bb 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1182,6 +1182,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
+#define KVM_CAP_MEMORY_ATTRIBUTES 225
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -2238,4 +2239,20 @@ struct kvm_s390_zpci_op {
 /* flags for kvm_s390_zpci_op->u.reg_aen.flags */
 #define KVM_S390_ZPCIOP_REGAEN_HOST	(1 << 0)
 
+/* Available with KVM_CAP_MEMORY_ATTRIBUTES */
+#define KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES    _IOR(KVMIO, 0xd2, __u64)
+#define KVM_SET_MEMORY_ATTRIBUTES              _IOWR(KVMIO, 0xd3, struct kvm_memory_attributes)
+
+struct kvm_memory_attributes {
+	__u64 address;
+	__u64 size;
+	__u64 attributes;
+	__u64 flags;
+};
+
+#define KVM_MEMORY_ATTRIBUTE_READ              (1ULL << 0)
+#define KVM_MEMORY_ATTRIBUTE_WRITE             (1ULL << 1)
+#define KVM_MEMORY_ATTRIBUTE_EXECUTE           (1ULL << 2)
+#define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
+
 #endif /* __LINUX_KVM_H */
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 800f9470e36b..effdea5dd4f0 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -19,6 +19,9 @@ config HAVE_KVM_IRQ_ROUTING
 config HAVE_KVM_DIRTY_RING
        bool
 
+config HAVE_KVM_MEMORY_ATTRIBUTES
+       bool
+
 # Only strongly ordered architectures can select this, as it doesn't
 # put any explicit constraint on userspace ordering. They can also
 # select the _ACQ_REL version.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1782c4555d94..7f0f5e9f2406 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1150,6 +1150,9 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	spin_lock_init(&kvm->mn_invalidate_lock);
 	rcuwait_init(&kvm->mn_memslots_update_rcuwait);
 	xa_init(&kvm->vcpu_array);
+#ifdef CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES
+	xa_init(&kvm->mem_attr_array);
+#endif
 
 	INIT_LIST_HEAD(&kvm->gpc_list);
 	spin_lock_init(&kvm->gpc_lock);
@@ -1323,6 +1326,9 @@ static void kvm_destroy_vm(struct kvm *kvm)
 		kvm_free_memslots(kvm, &kvm->__memslots[i][0]);
 		kvm_free_memslots(kvm, &kvm->__memslots[i][1]);
 	}
+#ifdef CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES
+	xa_destroy(&kvm->mem_attr_array);
+#endif
 	cleanup_srcu_struct(&kvm->irq_srcu);
 	cleanup_srcu_struct(&kvm->srcu);
 	kvm_arch_free_vm(kvm);
@@ -2323,6 +2329,49 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
 }
 #endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
 
+#ifdef CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES
+static u64 kvm_supported_mem_attributes(struct kvm *kvm)
+{
+	return 0;
+}
+
+static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
+					   struct kvm_memory_attributes *attrs)
+{
+	gfn_t start, end;
+	unsigned long i;
+	void *entry;
+	u64 supported_attrs = kvm_supported_mem_attributes(kvm);
+
+	/* flags is currently not used. */
+	if (attrs->flags)
+		return -EINVAL;
+	if (attrs->attributes & ~supported_attrs)
+		return -EINVAL;
+	if (attrs->size == 0 || attrs->address + attrs->size < attrs->address)
+		return -EINVAL;
+	if (!PAGE_ALIGNED(attrs->address) || !PAGE_ALIGNED(attrs->size))
+		return -EINVAL;
+
+	start = attrs->address >> PAGE_SHIFT;
+	end = (attrs->address + attrs->size - 1 + PAGE_SIZE) >> PAGE_SHIFT;
+
+	entry = attrs->attributes ? xa_mk_value(attrs->attributes) : NULL;
+
+	mutex_lock(&kvm->lock);
+	for (i = start; i < end; i++)
+		if (xa_err(xa_store(&kvm->mem_attr_array, i, entry,
+				    GFP_KERNEL_ACCOUNT)))
+			break;
+	mutex_unlock(&kvm->lock);
+
+	attrs->address = i << PAGE_SHIFT;
+	attrs->size = (end - i) << PAGE_SHIFT;
+
+	return 0;
+}
+#endif /* CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES */
+
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 {
 	return __gfn_to_memslot(kvm_memslots(kvm), gfn);
@@ -4459,6 +4508,9 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_HAVE_KVM_MSI
 	case KVM_CAP_SIGNAL_MSI:
 #endif
+#ifdef CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES
+	case KVM_CAP_MEMORY_ATTRIBUTES:
+#endif
 #ifdef CONFIG_HAVE_KVM_IRQFD
 	case KVM_CAP_IRQFD:
 	case KVM_CAP_IRQFD_RESAMPLE:
@@ -4804,6 +4856,30 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
+#ifdef CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES
+	case KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES: {
+		u64 attrs = kvm_supported_mem_attributes(kvm);
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &attrs, sizeof(attrs)))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_MEMORY_ATTRIBUTES: {
+		struct kvm_memory_attributes attrs;
+
+		r = -EFAULT;
+		if (copy_from_user(&attrs, argp, sizeof(attrs)))
+			goto out;
+
+		r = kvm_vm_ioctl_set_mem_attributes(kvm, &attrs);
+
+		if (!r && copy_to_user(argp, &attrs, sizeof(attrs)))
+			r = -EFAULT;
+		break;
+	}
+#endif /* CONFIG_HAVE_KVM_MEMORY_ATTRIBUTES */
 	case KVM_CREATE_DEVICE: {
 		struct kvm_create_device cd;
 
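
Not part of the patch: the documented KVM_SET_MEMORY_ATTRIBUTES contract (the call may return 0 with "size" still non-zero, and the caller retries with the returned address/size) may be easier to follow from the userspace side. The sketch below is a rough illustration only, under several assumptions. The names mem_attrs_req, set_attrs_fn, set_attrs_range, fake_vm and fake_set_attrs are hypothetical and do not appear in the series. The struct mirrors struct kvm_memory_attributes but is renamed to avoid clashing with an installed <linux/kvm.h>. The callback stands in for ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, req). A 4 KiB page size is assumed. The fake callback imitates the partial-progress behaviour of kvm_vm_ioctl_set_mem_attributes() so the loop can be exercised without a kernel carrying this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096ULL /* assumption: 4 KiB pages */

/* Same layout as struct kvm_memory_attributes in this patch (renamed). */
struct mem_attrs_req {
	uint64_t address;
	uint64_t size;
	uint64_t attributes;
	uint64_t flags;
};

/*
 * Hypothetical hook. A real caller would wrap
 * ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, req) here.
 * Returns 0 on success, -1 with errno set on error.
 */
typedef int (*set_attrs_fn)(void *ctx, struct mem_attrs_req *req);

/*
 * Keep calling until the returned "size" reaches 0. A call that returns 0
 * may have made no progress at all, so the number of calls is bounded.
 */
static int set_attrs_range(set_attrs_fn call, void *ctx, uint64_t address,
			   uint64_t size, uint64_t attributes,
			   unsigned int max_calls)
{
	struct mem_attrs_req req = {
		.address = address, .size = size, .attributes = attributes,
	};

	assert(call != NULL);
	do {
		if (max_calls-- == 0) {
			errno = EAGAIN;
			return -1;
		}
		if (call(ctx, &req) < 0)
			return -1;
		/* req.address/req.size now describe the remaining range. */
	} while (req.size != 0);

	return 0;
}

/* Stand-in for the kernel side: sets at most pages_per_call pages per call. */
struct fake_vm {
	uint64_t pages_per_call;
	uint64_t pages_set;
	unsigned int calls;
};

static int fake_set_attrs(void *ctx, struct mem_attrs_req *req)
{
	struct fake_vm *vm = ctx;
	uint64_t pages, done;

	/* Same argument checks as the patch, minus the supported-bits test. */
	if (req->flags || req->size == 0 ||
	    req->address + req->size < req->address ||
	    ((req->address | req->size) & (PAGE_SZ - 1))) {
		errno = EINVAL;
		return -1;
	}

	pages = req->size / PAGE_SZ;
	done = pages < vm->pages_per_call ? pages : vm->pages_per_call;
	vm->pages_set += done;
	vm->calls++;

	/* Report the first page not yet set and the bytes still remaining. */
	req->address += done * PAGE_SZ;
	req->size -= done * PAGE_SZ;
	return 0;
}
```

Bounding the number of calls is a design choice of this sketch, not something the patch mandates: since a failed xa_store() on the very first page still yields a return value of 0 with "size" unchanged, an unbounded retry loop could spin. Note also that kvm_supported_mem_attributes() returns 0 at this point in the series, so only attributes == 0 would pass the kernel's supported-bits check.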