From patchwork Wed Jul 12 23:01:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119390 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1471120vqm; Wed, 12 Jul 2023 16:13:01 -0700 (PDT) X-Google-Smtp-Source: APBJJlFQviWphSwdvSot04CZr2KnXPOCiZLz8lNHGvc4a2McROB3X1JOf2RjTPoCuzVufTil3/zv X-Received: by 2002:a05:6a00:190c:b0:682:4c1c:a0f6 with SMTP id y12-20020a056a00190c00b006824c1ca0f6mr155444pfi.3.1689203580909; Wed, 12 Jul 2023 16:13:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203580; cv=none; d=google.com; s=arc-20160816; b=L0O8HmgUr79WfLTuoHPUAQi45A2hIg6j5faAJH/dCFL/8bTfnhB6JqL8q/JqbrRLNL +L8kXIo86UKxFrpBu2ks4iNUxym1KU3aNp4v4PfgVoSUQ3p2ePvSvovvxY63MexHBOCU yHn4MIXuWOBWCQ93/FRSsdjyEqQdnEHVEaljtuRkMbHE5Ojr4YU78VWBW1jXQnmHgaNW LzfQCeQOq4OjGEwT1DYPS05OJNY6pb3KMsqmFFbMjQBTUEHSeIG0IqN4vjkdkhAwOsP+ Tw0wRrWX0qWMcJePq+IsN2c4HfKNm0sDr8zhTUr22P4PliF4Cm7Drk1qFFdLSgIL0hUh maJA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=cPgxlQRuEF4kLV/m9O7Pin0lzZwEcWXKXwrBZBGWi7Q=; fh=qLO4/VO18j+9BHx+eDGN4B9wTedSSqlU+IMbNJ/OHOg=; b=LI6XJ0ZBlymo8JX6XWNwZqXBGtAAHFwAyTFTvYLm7Nz4+nyLqNHsaD/gYPt1qYprZo z6R/SpVGVHX9BHqrIXDHO+Q7iYCQ6Bypp3vqSh5E/RP/Z2ys+InqQvM9TaVhnkIvwzeX uSJJIE1WbC5XvFs98ZUxcSey1nYVtd27IzT0R2EOTFvaTLzjC3mo3Ld3LoD0cQ2g5UWz +OdH+8ybVbP+tqTmYzCWJ6f2wxPxUgVtaoI8soGIOO6A7tx97mIznT5KKS7vaWgyPhnq NvafUrHbIlPsVuXqaUFWUS4z9c/2ticF+cx8ZpJcfta3tohv45cOzKIQGukr9p+UlinQ qU5g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=l0jD8nJ+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q185-20020a632ac2000000b00553b5116cefsi4044382pgq.16.2023.07.12.16.12.48; Wed, 12 Jul 2023 16:13:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=l0jD8nJ+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232716AbjGLXCR (ORCPT + 99 others); Wed, 12 Jul 2023 19:02:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230381AbjGLXCG (ORCPT ); Wed, 12 Jul 2023 19:02:06 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97BAC11D; Wed, 12 Jul 2023 16:02:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689202925; x=1720738925; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UBFxVX9XZ+KYUAX4AWaBycO09tYP5L5/OF+5KXvou4U=; b=l0jD8nJ+nYIqUvWdHpx+hJwBYezU5MGnvahQtuAuQVdfy+GpJiahmlzO VcoRh7+excSNtJdlJoltn62B541aTbx0Ri/Pw9BZRceYaMoSaeXS57mBM kAUgU8qxLa1Fbxm1iSWw33KiKXJlZ/ksPVLjMShp40hbjXyKLa5rPRhDx Ztv/HrpFqbPzn9NMXENilxD2rdrh4mT0cvTNi8cLUHN7RbxTX3sVb1jVv 1sNLsk4M28YmdwKKs3pCL7H8q6a/hqmRCMffVZeI254vHWhIX+qo9wJrE EioMkLcQ4a7R/KH35TdzHS1jm1063Xl7Od0ryyE4y/ah8YumcZqowBNUY Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="428773836" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428773836" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338582" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338582" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:03 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 01/28] x86/sgx: Store struct sgx_encl when allocating new VA pages Date: Wed, 12 Jul 2023 16:01:35 -0700 Message-Id: <20230712230202.47929-2-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258333992732295 X-GMAIL-MSGID: 1771258333992732295 In a later patch, when a cgroup has exceeded the max capacity for EPC pages and there are no more Enclave EPC pages associated with the cgroup that can be reclaimed, the only pages still associated with an enclave will be the unreclaimable Version Array (VA) pages or SECS pages, and the entire enclave will need to be killed to free up those pages. Currently, given an enclave pointer it is easy to find the associated VA pages and free them, however, OOM killing an enclave based on cgroup limits will require examining a cgroup's unreclaimable page list, and finding an enclave given a SECS page or a VA page. This will require a backpointer from a page to an enclave, including for VA pages. When allocating new Version Array (VA) pages, pass the struct sgx_encl of the enclave that is allocating the page. sgx_alloc_epc_page() will store this value in the owner field of the struct sgx_epc_page. In a later patch, VA pages will be placed in an unreclaimable queue, and then when the cgroup max limit is reached and there are no more reclaimable pages and the enclave must be OOM killed, all the VA pages associated with that enclave can be uncharged and freed. To avoid casting needed to access the two types of owners: sgx_encl for VA pages, sgx_encl_page for other pages, replace 'owner' field in sgx_epc_page with a union of the two types. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - rename encl_owner to encl_page. - revise commit messages --- arch/x86/kernel/cpu/sgx/encl.c | 5 +++-- arch/x86/kernel/cpu/sgx/encl.h | 2 +- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- arch/x86/kernel/cpu/sgx/main.c | 20 ++++++++++---------- arch/x86/kernel/cpu/sgx/sgx.h | 5 ++++- 5 files changed, 19 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 2a0e90fe2abc..98e1086eab07 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -1210,6 +1210,7 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr) /** * sgx_alloc_va_page() - Allocate a Version Array (VA) page + * @encl: The enclave that this page is allocated to. * @reclaim: Reclaim EPC pages directly if none available. Enclave * mutex should not be held if this is set. * @@ -1219,12 +1220,12 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr) * a VA page, * -errno otherwise */ -struct sgx_epc_page *sgx_alloc_va_page(bool reclaim) +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) { struct sgx_epc_page *epc_page; int ret; - epc_page = sgx_alloc_epc_page(NULL, reclaim); + epc_page = sgx_alloc_epc_page(encl, reclaim); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index f94ff14c9486..831d63f80f5a 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -116,7 +116,7 @@ struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl, unsigned long offset, u64 secinfo_flags); void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr); -struct sgx_epc_page *sgx_alloc_va_page(bool reclaim); +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim); unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); bool sgx_va_page_full(struct sgx_va_page *va_page); diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 21ca0a831b70..fa8c3f32ccf6 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -30,7 +30,7 @@ struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim) if (!va_page) return ERR_PTR(-ENOMEM); - va_page->epc_page = sgx_alloc_va_page(reclaim); + va_page->epc_page = sgx_alloc_va_page(encl, reclaim); if (IS_ERR(va_page->epc_page)) { err = ERR_CAST(va_page->epc_page); kfree(va_page); diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 166692f2d501..39939b7496b0 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -108,7 +108,7 @@ static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list) static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) { - struct sgx_encl_page *page = epc_page->owner; + struct sgx_encl_page *page = epc_page->encl_page; struct sgx_encl *encl = page->encl; struct sgx_encl_mm *encl_mm; bool ret = true; @@ -140,7 +140,7 @@ static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) static void sgx_reclaimer_block(struct sgx_epc_page *epc_page) { - struct sgx_encl_page *page = epc_page->owner; + struct sgx_encl_page *page = epc_page->encl_page; unsigned long addr = page->desc & PAGE_MASK; struct sgx_encl *encl = page->encl; int ret; @@ -197,7 +197,7 @@ void sgx_ipi_cb(void *info) static void sgx_encl_ewb(struct sgx_epc_page *epc_page, struct sgx_backing *backing) { - struct sgx_encl_page *encl_page = epc_page->owner; + struct sgx_encl_page *encl_page = epc_page->encl_page; struct sgx_encl *encl = encl_page->encl; struct sgx_va_page *va_page; unsigned int va_offset; @@ -250,7 +250,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page, static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, struct sgx_backing *backing) { - struct sgx_encl_page *encl_page = epc_page->owner; + struct sgx_encl_page *encl_page = epc_page->encl_page; struct sgx_encl *encl = encl_page->encl; struct sgx_backing secs_backing; int ret; @@ -312,7 +312,7 @@ static void sgx_reclaim_pages(void) epc_page = list_first_entry(&sgx_active_page_list, struct sgx_epc_page, list); list_del_init(&epc_page->list); - encl_page = epc_page->owner; + encl_page = epc_page->encl_page; if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) chunk[cnt++] = epc_page; @@ -326,7 +326,7 @@ static void sgx_reclaim_pages(void) for (i = 0; i < cnt; i++) { epc_page = chunk[i]; - encl_page = epc_page->owner; + encl_page = epc_page->encl_page; if (!sgx_reclaimer_age(epc_page)) goto skip; @@ -365,7 +365,7 @@ static void sgx_reclaim_pages(void) if (!epc_page) continue; - encl_page = epc_page->owner; + encl_page = epc_page->encl_page; sgx_reclaimer_write(epc_page, &backing[i]); kref_put(&encl_page->encl->refcount, sgx_encl_release); @@ -563,7 +563,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) for ( ; ; ) { page = __sgx_alloc_epc_page(); if (!IS_ERR(page)) { - page->owner = owner; + page->encl_page = owner; break; } @@ -606,7 +606,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - page->owner = NULL; + page->encl_page = NULL; if (page->poison) list_add(&page->list, &node->sgx_poison_page_list); else @@ -641,7 +641,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; - section->pages[i].owner = NULL; + section->pages[i].encl_page = NULL; section->pages[i].poison = 0; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index d2dad21259a8..dc1cbcfcf2d4 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -33,7 +33,10 @@ struct sgx_epc_page { unsigned int section; u16 flags; u16 poison; - struct sgx_encl_page *owner; + union { + struct sgx_encl_page *encl_page; + struct sgx_encl *encl; + }; struct list_head list; };