From patchwork Wed Jul 12 23:01:35 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119390
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 01/28] x86/sgx: Store struct sgx_encl when allocating new VA pages Date: Wed, 12 Jul 2023 16:01:35 -0700 Message-Id: <20230712230202.47929-2-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258333992732295 X-GMAIL-MSGID: 1771258333992732295 In a later patch, when a cgroup has exceeded the max capacity for EPC pages and there are no more Enclave EPC pages associated with the cgroup that can be reclaimed, the only pages still associated with an enclave will be the unreclaimable Version Array (VA) pages or SECS pages, and the entire enclave will need to be killed to free up those pages. Currently, given an enclave pointer it is easy to find the associated VA pages and free them, however, OOM killing an enclave based on cgroup limits will require examining a cgroup's unreclaimable page list, and finding an enclave given a SECS page or a VA page. This will require a backpointer from a page to an enclave, including for VA pages. When allocating new Version Array (VA) pages, pass the struct sgx_encl of the enclave that is allocating the page. sgx_alloc_epc_page() will store this value in the owner field of the struct sgx_epc_page. In a later patch, VA pages will be placed in an unreclaimable queue, and then when the cgroup max limit is reached and there are no more reclaimable pages and the enclave must be OOM killed, all the VA pages associated with that enclave can be uncharged and freed. To avoid casting needed to access the two types of owners: sgx_encl for VA pages, sgx_encl_page for other pages, replace 'owner' field in sgx_epc_page with a union of the two types. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - rename encl_owner to encl_page. - revise commit messages --- arch/x86/kernel/cpu/sgx/encl.c | 5 +++-- arch/x86/kernel/cpu/sgx/encl.h | 2 +- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- arch/x86/kernel/cpu/sgx/main.c | 20 ++++++++++---------- arch/x86/kernel/cpu/sgx/sgx.h | 5 ++++- 5 files changed, 19 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 2a0e90fe2abc..98e1086eab07 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -1210,6 +1210,7 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr) /** * sgx_alloc_va_page() - Allocate a Version Array (VA) page + * @encl: The enclave that this page is allocated to. * @reclaim: Reclaim EPC pages directly if none available. Enclave * mutex should not be held if this is set. 
* @@ -1219,12 +1220,12 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr) * a VA page, * -errno otherwise */ -struct sgx_epc_page *sgx_alloc_va_page(bool reclaim) +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) { struct sgx_epc_page *epc_page; int ret; - epc_page = sgx_alloc_epc_page(NULL, reclaim); + epc_page = sgx_alloc_epc_page(encl, reclaim); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index f94ff14c9486..831d63f80f5a 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -116,7 +116,7 @@ struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl, unsigned long offset, u64 secinfo_flags); void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr); -struct sgx_epc_page *sgx_alloc_va_page(bool reclaim); +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim); unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); bool sgx_va_page_full(struct sgx_va_page *va_page); diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 21ca0a831b70..fa8c3f32ccf6 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -30,7 +30,7 @@ struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim) if (!va_page) return ERR_PTR(-ENOMEM); - va_page->epc_page = sgx_alloc_va_page(reclaim); + va_page->epc_page = sgx_alloc_va_page(encl, reclaim); if (IS_ERR(va_page->epc_page)) { err = ERR_CAST(va_page->epc_page); kfree(va_page); diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 166692f2d501..39939b7496b0 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -108,7 +108,7 @@ static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list) static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) { - struct sgx_encl_page *page = epc_page->owner; + struct sgx_encl_page *page = epc_page->encl_page; struct sgx_encl *encl = page->encl; struct sgx_encl_mm *encl_mm; bool ret = true; @@ -140,7 +140,7 @@ static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) static void sgx_reclaimer_block(struct sgx_epc_page *epc_page) { - struct sgx_encl_page *page = epc_page->owner; + struct sgx_encl_page *page = epc_page->encl_page; unsigned long addr = page->desc & PAGE_MASK; struct sgx_encl *encl = page->encl; int ret; @@ -197,7 +197,7 @@ void sgx_ipi_cb(void *info) static void sgx_encl_ewb(struct sgx_epc_page *epc_page, struct sgx_backing *backing) { - struct sgx_encl_page *encl_page = epc_page->owner; + struct sgx_encl_page *encl_page = epc_page->encl_page; struct sgx_encl *encl = encl_page->encl; struct sgx_va_page *va_page; unsigned int va_offset; @@ -250,7 +250,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page, static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, struct sgx_backing *backing) { - struct sgx_encl_page *encl_page = epc_page->owner; + struct sgx_encl_page *encl_page = epc_page->encl_page; struct sgx_encl *encl = encl_page->encl; struct sgx_backing secs_backing; int ret; @@ -312,7 +312,7 @@ static void sgx_reclaim_pages(void) epc_page = list_first_entry(&sgx_active_page_list, struct sgx_epc_page, list); list_del_init(&epc_page->list); - encl_page = epc_page->owner; + encl_page = epc_page->encl_page; if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) chunk[cnt++] = 
epc_page;
@@ -326,7 +326,7 @@ static void sgx_reclaim_pages(void)
 	for (i = 0; i < cnt; i++) {
 		epc_page = chunk[i];
-		encl_page = epc_page->owner;
+		encl_page = epc_page->encl_page;
 		if (!sgx_reclaimer_age(epc_page))
 			goto skip;
@@ -365,7 +365,7 @@ static void sgx_reclaim_pages(void)
 		if (!epc_page)
 			continue;
-		encl_page = epc_page->owner;
+		encl_page = epc_page->encl_page;
 		sgx_reclaimer_write(epc_page, &backing[i]);
 		kref_put(&encl_page->encl->refcount, sgx_encl_release);
@@ -563,7 +563,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 	for ( ; ; ) {
 		page = __sgx_alloc_epc_page();
 		if (!IS_ERR(page)) {
-			page->owner = owner;
+			page->encl_page = owner;
 			break;
 		}
@@ -606,7 +606,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
 	spin_lock(&node->lock);
-	page->owner = NULL;
+	page->encl_page = NULL;
 	if (page->poison)
 		list_add(&page->list, &node->sgx_poison_page_list);
 	else
@@ -641,7 +641,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
 	for (i = 0; i < nr_pages; i++) {
 		section->pages[i].section = index;
 		section->pages[i].flags = 0;
-		section->pages[i].owner = NULL;
+		section->pages[i].encl_page = NULL;
 		section->pages[i].poison = 0;
 		list_add_tail(&section->pages[i].list, &sgx_dirty_page_list);
 	}
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index d2dad21259a8..dc1cbcfcf2d4 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -33,7 +33,10 @@ struct sgx_epc_page {
 	unsigned int section;
 	u16 flags;
 	u16 poison;
-	struct sgx_encl_page *owner;
+	union {
+		struct sgx_encl_page *encl_page;
+		struct sgx_encl *encl;
+	};
 	struct list_head list;
 };
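As an aside for readers of the archive (not part of the patch itself), the sketch below illustrates what the union buys. The helper name is hypothetical, and it assumes the caller already knows which kind of owner the page has; a generic lookup needs the owner-type flags added in the next patch.

/*
 * Illustrative only: assumes the caller already knows whether the page
 * was allocated for a VA page (owner is a struct sgx_encl) or for a
 * regular enclave page (owner is a struct sgx_encl_page).
 */
static struct sgx_encl *epc_page_to_encl(struct sgx_epc_page *epc_page,
					 bool is_va_page)
{
	if (is_va_page)
		return epc_page->encl;		/* VA page: union holds the enclave */

	return epc_page->encl_page->encl;	/* other pages: go via sgx_encl_page */
}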
From patchwork Wed Jul 12 23:01:36 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119393
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 02/28] x86/sgx: Add EPC page flags to identify owner type Date: Wed, 12 Jul 2023 16:01:36 -0700 Message-Id: <20230712230202.47929-3-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258432406015578 X-GMAIL-MSGID: 1771258432406015578 From: Sean Christopherson Two types of owners, 'sgx_encl' for VA pages and 'sgx_encl_page' for other, can be stored in the union field in sgx_epc_page struct introduced in the previous patch. When cgroup OOM support is added in a later patch, the owning enclave of a page will need to be identified. Retrieving the sgx_encl struct from a sgx_epc_page will be different if the page is a VA page vs. other enclave pages. Add 2 flags which will identify the type of the owner and apply them accordingly to newly allocated pages. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - Renamed the flags to clarify they are used to identify the type of the owner. --- arch/x86/kernel/cpu/sgx/encl.c | 4 ++++ arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++++ arch/x86/kernel/cpu/sgx/sgx.h | 6 ++++++ 3 files changed, 14 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 98e1086eab07..3bc2f95b1da2 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -252,6 +252,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, epc_page = sgx_encl_eldu(&encl->secs, NULL); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); + epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; } epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); @@ -260,6 +261,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, encl->secs_child_cnt++; sgx_mark_page_reclaimable(entry->epc_page); + entry->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; return entry; } @@ -379,6 +381,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, encl->secs_child_cnt++; sgx_mark_page_reclaimable(encl_page->epc_page); + encl_page->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; phys_addr = sgx_get_epc_phys_addr(epc_page); /* @@ -1235,6 +1238,7 @@ struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) sgx_encl_free_epc_page(epc_page); return ERR_PTR(-EFAULT); } + epc_page->flags |= SGX_EPC_OWNER_ENCL; return epc_page; } diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index fa8c3f32ccf6..fe3e89cf013f 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -113,6 +113,8 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) encl->attributes = secs->attributes; encl->attributes_mask = SGX_ATTR_UNPRIV_MASK; + encl->secs.epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + /* Set only after 
completion, as encl->lock has not been taken. */
 	set_bit(SGX_ENCL_CREATED, &encl->flags);
@@ -323,6 +325,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 	}
 	sgx_mark_page_reclaimable(encl_page->epc_page);
+	encl_page->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE;
 	mutex_unlock(&encl->lock);
 	mmap_read_unlock(current->mm);
 	return ret;
@@ -977,6 +980,7 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl,
 		mutex_lock(&encl->lock);
 		sgx_mark_page_reclaimable(entry->epc_page);
+		entry->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE;
 	}
 	/* Change EPC type */
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index dc1cbcfcf2d4..f6e3c5810eef 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -29,6 +29,12 @@
 /* Pages on free list */
 #define SGX_EPC_PAGE_IS_FREE		BIT(1)
+/* flag for pages owned by a sgx_encl_page */
+#define SGX_EPC_OWNER_ENCL_PAGE		BIT(3)
+
+/* flag for pages owned by a sgx_encl struct */
+#define SGX_EPC_OWNER_ENCL		BIT(4)
+
 struct sgx_epc_page {
 	unsigned int section;
 	u16 flags;
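To illustrate how these flags are meant to be consumed, a later OOM path could recover the owning enclave from any EPC page roughly as in the sketch below. The helper is hypothetical and is not introduced by this patch.

/*
 * Illustrative sketch (not part of the patch): with the owner-type flags
 * applied at allocation time, later code can find the owning enclave of
 * an EPC page without knowing how the page was allocated.
 */
static struct sgx_encl *sgx_epc_page_owner_encl(struct sgx_epc_page *epc_page)
{
	if (epc_page->flags & SGX_EPC_OWNER_ENCL)
		return epc_page->encl;

	if (epc_page->flags & SGX_EPC_OWNER_ENCL_PAGE)
		return epc_page->encl_page->encl;

	return NULL;	/* e.g. a free page with no owner recorded */
}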
From patchwork Wed Jul 12 23:01:37 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119385
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com, seanjc@google.com Subject: [PATCH v3 03/28] x86/sgx: Add 'struct sgx_epc_lru_lists' to encapsulate lru list(s) Date: Wed, 12 Jul 2023 16:01:37 -0700 Message-Id: <20230712230202.47929-4-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258302584221217 X-GMAIL-MSGID: 1771258302584221217 From: Kristen Carlson Accardi Introduce a data structure to wrap the existing reclaimable list and its spinlock in a struct to minimize the code changes needed to handle multiple LRUs as well as reclaimable and non-reclaimable lists. The new structure will be used in a following set of patches to implement SGX EPC cgroups. The changes to the structure needed for unreclaimable lists will be added in later patches. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: Removed the helper functions and revised commit messages --- arch/x86/kernel/cpu/sgx/sgx.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index f6e3c5810eef..77fceba73a25 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -92,6 +92,23 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) return section->virt_addr + index * PAGE_SIZE; } +/* + * This data structure wraps a list of reclaimable EPC pages, and a list of + * non-reclaimable EPC pages and is used to implement a LRU policy during + * reclamation. 
+ */
+struct sgx_epc_lru_lists {
+	/* Must acquire this lock to access */
+	spinlock_t lock;
+	struct list_head reclaimable;
+};
+
+static inline void sgx_lru_init(struct sgx_epc_lru_lists *lrus)
+{
+	spin_lock_init(&lrus->lock);
+	INIT_LIST_HEAD(&lrus->reclaimable);
+}
+
 struct sgx_epc_page *__sgx_alloc_epc_page(void);
 void sgx_free_epc_page(struct sgx_epc_page *page);
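As a usage illustration only (the instance and function names below are hypothetical), the structure is meant to be initialized once and then manipulated strictly under its spinlock:

/* Illustrative sketch, not from the patch. */
static struct sgx_epc_lru_lists example_lru;

static void example_lru_usage(struct sgx_epc_page *epc_page)
{
	/* In real code this would run once at setup. */
	sgx_lru_init(&example_lru);

	/* All list manipulation happens with the LRU lock held. */
	spin_lock(&example_lru.lock);
	list_add_tail(&epc_page->list, &example_lru.reclaimable);
	spin_unlock(&example_lru.lock);

	spin_lock(&example_lru.lock);
	list_del_init(&epc_page->list);
	spin_unlock(&example_lru.lock);
}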
From patchwork Wed Jul 12 23:01:38 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119394
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com, seanjc@google.com Subject: [PATCH v3 04/28] x86/sgx: Use sgx_epc_lru_lists for existing active page list Date: Wed, 12 Jul 2023 16:01:38 -0700 Message-Id: <20230712230202.47929-5-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258435880467535 X-GMAIL-MSGID: 1771258435880467535 From: Kristen Carlson Accardi Replace the existing sgx_active_page_list and its spinlock with a global sgx_epc_lru_lists struct. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - Remove usage of list wrapper --- arch/x86/kernel/cpu/sgx/main.c | 39 +++++++++++++++++----------------- 1 file changed, 20 insertions(+), 19 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 39939b7496b0..71c3386ccf23 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -26,10 +26,9 @@ static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed - * with sgx_reclaimer_lock acquired. + * with sgx_global_lru.lock acquired. 
*/ -static LIST_HEAD(sgx_active_page_list); -static DEFINE_SPINLOCK(sgx_reclaimer_lock); +static struct sgx_epc_lru_lists sgx_global_lru; static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); @@ -304,13 +303,13 @@ static void sgx_reclaim_pages(void) int ret; int i; - spin_lock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); for (i = 0; i < SGX_NR_TO_SCAN; i++) { - if (list_empty(&sgx_active_page_list)) + epc_page = list_first_entry_or_null(&sgx_global_lru.reclaimable, + struct sgx_epc_page, list); + if (!epc_page) break; - epc_page = list_first_entry(&sgx_active_page_list, - struct sgx_epc_page, list); list_del_init(&epc_page->list); encl_page = epc_page->encl_page; @@ -322,7 +321,7 @@ static void sgx_reclaim_pages(void) */ epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; } - spin_unlock(&sgx_reclaimer_lock); + spin_unlock(&sgx_global_lru.lock); for (i = 0; i < cnt; i++) { epc_page = chunk[i]; @@ -345,9 +344,9 @@ static void sgx_reclaim_pages(void) continue; skip: - spin_lock(&sgx_reclaimer_lock); - list_add_tail(&epc_page->list, &sgx_active_page_list); - spin_unlock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); + list_add_tail(&epc_page->list, &sgx_global_lru.reclaimable); + spin_unlock(&sgx_global_lru.lock); kref_put(&encl_page->encl->refcount, sgx_encl_release); @@ -378,7 +377,7 @@ static void sgx_reclaim_pages(void) static bool sgx_should_reclaim(unsigned long watermark) { return atomic_long_read(&sgx_nr_free_pages) < watermark && - !list_empty(&sgx_active_page_list); + !list_empty(&sgx_global_lru.reclaimable); } /* @@ -430,6 +429,8 @@ static bool __init sgx_page_reclaimer_init(void) ksgxd_tsk = tsk; + sgx_lru_init(&sgx_global_lru); + return true; } @@ -505,10 +506,10 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) */ void sgx_mark_page_reclaimable(struct sgx_epc_page *page) { - spin_lock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED; - list_add_tail(&page->list, &sgx_active_page_list); - spin_unlock(&sgx_reclaimer_lock); + list_add_tail(&page->list, &sgx_global_lru.reclaimable); + spin_unlock(&sgx_global_lru.lock); } /** @@ -523,18 +524,18 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page) */ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) { - spin_lock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { /* The page is being reclaimed. 
*/
 		if (list_empty(&page->list)) {
-			spin_unlock(&sgx_reclaimer_lock);
+			spin_unlock(&sgx_global_lru.lock);
 			return -EBUSY;
 		}
 		list_del(&page->list);
 		page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
 	}
-	spin_unlock(&sgx_reclaimer_lock);
+	spin_unlock(&sgx_global_lru.lock);
 	return 0;
 }
@@ -567,7 +568,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 			break;
 		}
-		if (list_empty(&sgx_active_page_list))
+		if (list_empty(&sgx_global_lru.reclaimable))
 			return ERR_PTR(-ENOMEM);
 		if (!reclaim) {
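For illustration (not part of the patch), the scanning idiom this conversion enables looks roughly like the sketch below; it assumes the code lives in main.c, where the static sgx_global_lru is visible, and the helper name is hypothetical.

/*
 * Illustrative sketch: pop the next reclaim candidate off
 * sgx_global_lru.reclaimable under the LRU lock, instead of testing
 * the old sgx_active_page_list with list_empty().
 */
static struct sgx_epc_page *example_pop_reclaimable(void)
{
	struct sgx_epc_page *epc_page;

	spin_lock(&sgx_global_lru.lock);
	epc_page = list_first_entry_or_null(&sgx_global_lru.reclaimable,
					    struct sgx_epc_page, list);
	if (epc_page)
		list_del_init(&epc_page->list);
	spin_unlock(&sgx_global_lru.lock);

	return epc_page;
}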
From patchwork Wed Jul 12 23:01:39 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119396
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com, seanjc@google.com Subject: [PATCH v3 05/28] x86/sgx: Store reclaimable epc pages in sgx_epc_lru_lists Date: Wed, 12 Jul 2023 16:01:39 -0700 Message-Id: <20230712230202.47929-6-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258478075330615 X-GMAIL-MSGID: 1771258478075330615 From: Kristen Carlson Accardi When an OOM event occurs, it becomes necessary to free all pages associated with an enclave, including those not currently tracked by the reclaimer. As a result, each page must eventually be added to the cgroup's LRU list struct, regardless of whether it is tracked by the reclaimer or not. This patch prepares for the inclusion of currently untracked pages by replacing the functions sgx_mark_page_reclaimable() and sgx_unmark_page_reclaimable() with sgx_record_epc_page() and sgx_drop_epc_page(). The sgx_record_epc_page() function adds the epc_page to the "reclaimable" list in the sgx_epc_lru_lists struct, while sgx_drop_epc_page() removes the page from the LRU list. For now, this change serves as a straightforward replacement of the two functions for pages tracked by the reclaimer. A subsequent patch will introduce the capability to track unreclaimable pages using these same functions. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/encl.c | 10 +++++----- arch/x86/kernel/cpu/sgx/ioctl.c | 12 ++++++------ arch/x86/kernel/cpu/sgx/main.c | 22 ++++++++++++---------- arch/x86/kernel/cpu/sgx/sgx.h | 4 ++-- 4 files changed, 25 insertions(+), 23 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 3bc2f95b1da2..f68af9e37daa 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -260,8 +260,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, return ERR_CAST(epc_page); encl->secs_child_cnt++; - sgx_mark_page_reclaimable(entry->epc_page); - entry->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_RECLAIMER_TRACKED); return entry; } @@ -380,8 +380,8 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, encl_page->type = SGX_PAGE_TYPE_REG; encl->secs_child_cnt++; - sgx_mark_page_reclaimable(encl_page->epc_page); - encl_page->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_RECLAIMER_TRACKED); phys_addr = sgx_get_epc_phys_addr(epc_page); /* @@ -697,7 +697,7 @@ void sgx_encl_release(struct kref *ref) * The page and its radix tree entry cannot be freed * if the page is being held by the reclaimer. 
*/ - if (sgx_unmark_page_reclaimable(entry->epc_page)) + if (sgx_drop_epc_page(entry->epc_page)) continue; sgx_encl_free_epc_page(entry->epc_page); diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index fe3e89cf013f..dd7ab1c80db6 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -324,8 +324,8 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, goto err_out; } - sgx_mark_page_reclaimable(encl_page->epc_page); - encl_page->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_RECLAIMER_TRACKED); mutex_unlock(&encl->lock); mmap_read_unlock(current->mm); return ret; @@ -964,7 +964,7 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl, * Prevent page from being reclaimed while mutex * is released. */ - if (sgx_unmark_page_reclaimable(entry->epc_page)) { + if (sgx_drop_epc_page(entry->epc_page)) { ret = -EAGAIN; goto out_entry_changed; } @@ -979,8 +979,8 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl, mutex_lock(&encl->lock); - sgx_mark_page_reclaimable(entry->epc_page); - entry->epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + sgx_record_epc_page(entry->epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_RECLAIMER_TRACKED); } /* Change EPC type */ @@ -1137,7 +1137,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl, goto out_unlock; } - if (sgx_unmark_page_reclaimable(entry->epc_page)) { + if (sgx_drop_epc_page(entry->epc_page)) { ret = -EBUSY; goto out_unlock; } diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 71c3386ccf23..371135665ff7 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -268,7 +268,6 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, goto out; sgx_encl_ewb(encl->secs.epc_page, &secs_backing); - sgx_encl_free_epc_page(encl->secs.epc_page); encl->secs.epc_page = NULL; @@ -498,31 +497,34 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) } /** - * sgx_mark_page_reclaimable() - Mark a page as reclaimable + * sgx_record_epc_page() - Add a page to the appropriate LRU list * @page: EPC page + * @flags: The type of page that is being recorded * - * Mark a page as reclaimable and add it to the active page list. Pages - * are automatically removed from the active list when freed. + * Mark a page with the specified flags and add it to the appropriate + * list. */ -void sgx_mark_page_reclaimable(struct sgx_epc_page *page) +void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) { spin_lock(&sgx_global_lru.lock); - page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED; - list_add_tail(&page->list, &sgx_global_lru.reclaimable); + WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); + page->flags |= flags; + if (flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) + list_add_tail(&page->list, &sgx_global_lru.reclaimable); spin_unlock(&sgx_global_lru.lock); } /** - * sgx_unmark_page_reclaimable() - Remove a page from the reclaim list + * sgx_drop_epc_page() - Remove a page from a LRU list * @page: EPC page * - * Clear the reclaimable flag and remove the page from the active page list. + * Clear the reclaimable flag if set and remove the page from its LRU. 
 *
 * Return:
 * 0 on success,
 * -EBUSY if the page is in the process of being reclaimed
 */
-int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
+int sgx_drop_epc_page(struct sgx_epc_page *page)
 {
 	spin_lock(&sgx_global_lru.lock);
 	if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) {
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 77fceba73a25..c60bbd995942 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -113,8 +113,8 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void);
 void sgx_free_epc_page(struct sgx_epc_page *page);
 void sgx_reclaim_direct(void);
-void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
-int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
+void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags);
+int sgx_drop_epc_page(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
 void sgx_ipi_cb(void *info);
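The sketch below (a hypothetical caller, not taken from the patch) shows the intended pairing of the renamed helpers over a page's lifetime, using the owner-type and tracking flags from the earlier patches:

/* Illustrative sketch only, assuming an ordinary reclaimable enclave page. */
static int example_epc_page_lifecycle(struct sgx_encl_page *encl_page)
{
	struct sgx_epc_page *epc_page;

	epc_page = sgx_alloc_epc_page(encl_page, false);
	if (IS_ERR(epc_page))
		return PTR_ERR(epc_page);

	/* Record the page so the reclaimer may evict it later. */
	sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE |
			    SGX_EPC_PAGE_RECLAIMER_TRACKED);

	/* ... use the page ... */

	/* -EBUSY means the reclaimer currently holds the page. */
	if (sgx_drop_epc_page(epc_page))
		return -EBUSY;

	sgx_free_epc_page(epc_page);
	return 0;
}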
From patchwork Wed Jul 12 23:01:40 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119391
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin"
Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi, zhiquan1.li@intel.com, seanjc@google.com
Subject: [PATCH v3 06/28] x86/sgx: store unreclaimable EPC pages in sgx_epc_lru_lists
Date: Wed, 12 Jul 2023 16:01:40 -0700
Message-Id: <20230712230202.47929-7-haitao.huang@linux.intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com>
References: <20230712230202.47929-1-haitao.huang@linux.intel.com>

From: Kristen Carlson Accardi

When an OOM event occurs, all pages associated with an enclave will need to be freed, including pages that are not currently tracked by the reclaimer. A previous patch converted the SGX code to a pair of generic helpers, sgx_record_epc_page() and sgx_drop_epc_page(), for the EPC pages that are tracked by the reclaimer. Use those same helpers to add the remaining untracked pages to a new "unreclaimable" list in struct sgx_epc_lru_lists.

Signed-off-by: Kristen Carlson Accardi
Signed-off-by: Haitao Huang

V3:
- Removed tracking of virtual EPC pages in the unreclaimable list, as the host kernel does not reclaim them. The EPC cgroup implemented later only blocks allocation for a guest when the limit is reached, by returning -ENOMEM from sgx_alloc_epc_page() called by virt_epc, and does nothing else. Therefore, there is no need to track those pages in LRU lists.
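As an illustration of why the unreclaimable list is needed at all (the helper below is purely hypothetical; the real OOM handling arrives later in the series), a cgroup OOM path could walk that list and recover each owning enclave via the union and owner-type flags from patches 1 and 2:

/* Illustrative sketch only -- not part of this patch. */
static struct sgx_encl *example_pick_oom_victim(struct sgx_epc_lru_lists *lru)
{
	struct sgx_epc_page *epc_page;
	struct sgx_encl *encl = NULL;

	spin_lock(&lru->lock);
	epc_page = list_first_entry_or_null(&lru->unreclaimable,
					    struct sgx_epc_page, list);
	if (epc_page)
		encl = (epc_page->flags & SGX_EPC_OWNER_ENCL) ?
		       epc_page->encl : epc_page->encl_page->encl;
	spin_unlock(&lru->lock);

	return encl;	/* a real caller would take a reference before use */
}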
--- arch/x86/kernel/cpu/sgx/encl.c | 8 ++++++-- arch/x86/kernel/cpu/sgx/ioctl.c | 4 +++- arch/x86/kernel/cpu/sgx/main.c | 3 +++ arch/x86/kernel/cpu/sgx/sgx.h | 5 +++++ 4 files changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index f68af9e37daa..edb8d8c1c229 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -252,7 +252,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, epc_page = sgx_encl_eldu(&encl->secs, NULL); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); - epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_RECLAIMER_UNTRACKED); } epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); @@ -724,6 +725,7 @@ void sgx_encl_release(struct kref *ref) xa_destroy(&encl->page_array); if (!encl->secs_child_cnt && encl->secs.epc_page) { + sgx_drop_epc_page(encl->secs.epc_page); sgx_encl_free_epc_page(encl->secs.epc_page); encl->secs.epc_page = NULL; } @@ -732,6 +734,7 @@ void sgx_encl_release(struct kref *ref) va_page = list_first_entry(&encl->va_pages, struct sgx_va_page, list); list_del(&va_page->list); + sgx_drop_epc_page(va_page->epc_page); sgx_encl_free_epc_page(va_page->epc_page); kfree(va_page); } @@ -1238,7 +1241,8 @@ struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) sgx_encl_free_epc_page(epc_page); return ERR_PTR(-EFAULT); } - epc_page->flags |= SGX_EPC_OWNER_ENCL; + sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL | + SGX_EPC_PAGE_RECLAIMER_UNTRACKED); return epc_page; } diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index dd7ab1c80db6..4e6d0c9d043a 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -48,6 +48,7 @@ void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page) encl->page_cnt--; if (va_page) { + sgx_drop_epc_page(va_page->epc_page); sgx_encl_free_epc_page(va_page->epc_page); list_del(&va_page->list); kfree(va_page); @@ -113,7 +114,8 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) encl->attributes = secs->attributes; encl->attributes_mask = SGX_ATTR_UNPRIV_MASK; - encl->secs.epc_page->flags |= SGX_EPC_OWNER_ENCL_PAGE; + sgx_record_epc_page(encl->secs.epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_RECLAIMER_UNTRACKED); /* Set only after completion, as encl->lock has not been taken. 
*/ set_bit(SGX_ENCL_CREATED, &encl->flags); diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 371135665ff7..9252728865fa 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -268,6 +268,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, goto out; sgx_encl_ewb(encl->secs.epc_page, &secs_backing); + sgx_drop_epc_page(encl->secs.epc_page); sgx_encl_free_epc_page(encl->secs.epc_page); encl->secs.epc_page = NULL; @@ -511,6 +512,8 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) page->flags |= flags; if (flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) list_add_tail(&page->list, &sgx_global_lru.reclaimable); + else + list_add_tail(&page->list, &sgx_global_lru.unreclaimable); spin_unlock(&sgx_global_lru.lock); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index c60bbd995942..9f780b2c4cfe 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -23,6 +23,9 @@ #define SGX_NR_LOW_PAGES 32 #define SGX_NR_HIGH_PAGES 64 +/* Pages, which are not tracked by the page reclaimer. */ +#define SGX_EPC_PAGE_RECLAIMER_UNTRACKED 0 + /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) @@ -101,12 +104,14 @@ struct sgx_epc_lru_lists { /* Must acquire this lock to access */ spinlock_t lock; struct list_head reclaimable; + struct list_head unreclaimable; }; static inline void sgx_lru_init(struct sgx_epc_lru_lists *lrus) { spin_lock_init(&lrus->lock); INIT_LIST_HEAD(&lrus->reclaimable); + INIT_LIST_HEAD(&lrus->unreclaimable); } struct sgx_epc_page *__sgx_alloc_epc_page(void); From patchwork Wed Jul 12 23:01:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119387 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1470886vqm; Wed, 12 Jul 2023 16:12:37 -0700 (PDT) X-Google-Smtp-Source: APBJJlHdmdO18AmL+WM9zja5wA6gBdPEhN654XMpN+5a4BmDCCHaxQYItJn3omrijCQ+7CZ/SczD X-Received: by 2002:a17:903:2285:b0:1b8:5a49:a290 with SMTP id b5-20020a170903228500b001b85a49a290mr29775plh.43.1689203557644; Wed, 12 Jul 2023 16:12:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203557; cv=none; d=google.com; s=arc-20160816; b=WbwcsrU5MhF3gp0Sr14NR1ZjgblcrydmfDDwMirlcltXZbDm486XcKa3iHVvNfLNdY 7teWavPLSq97gfQldIcr3RMvEPUBZk9cSYNkzzrhX76vqqiF5+XPpu1oaysqyBhAyMcD h622uJ47eph0PCSYm3I2/MNPviRkzWUp97MRj+yjGoGphkrenORstgJs6h4aeMzL5kRk tc5Nt4q/eD3uzap/gHdfflB701hNtIzTt1wyfAXVv+LLDnSdhS+W29F1zwyV6Okapgsf w5yc0bZtaoRQSrj5zqvd6IrvfBmBBMyFbpI0ySaT8tFPrahSmaEn1f7I1kWsInhvklL/ jGTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=V7LQkaWqO6iwRRQSejRVhe/QB6JC1V26ZyPbYiWPjr0=; fh=qLO4/VO18j+9BHx+eDGN4B9wTedSSqlU+IMbNJ/OHOg=; b=FwtGX2HYLOgWANMOKp+z21SUsictJFnN39NvNzH55vk/Gaip2OofZouIVo77Q1R88i SrA7ddNh9vjYoOCDCDHM/N1Vw20qsyvos1F7ZOuLi7sUz0VAgu07elLAIAW2o0Q1jg5Y 6PuSLFQZdwIq+FVuLdVm8BlkBue95JzltVAFgaY56YZfSiStY7sKg5PnWV+wnMmPTonK qPY8V9taVXkefPj4pgSnwEJ2UbjtUBVghOM7P7tX3BZdoYQTJBvv/20adGSSZn+r0Dyl yhLcPIM5O7gDW0zBGyVSvUTpS6bKH5Vyo0RlFbjOEUGh1MaB4E4Ko9lPD4A0gSms2s17 34ag== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=LI+gcS5j; spf=pass (google.com: domain of 
From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 07/28] x86/sgx: Introduce EPC page states Date: Wed, 12 Jul 2023 16:01:41 -0700 Message-Id: <20230712230202.47929-8-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258309662960564 X-GMAIL-MSGID: 1771258309662960564 Use the lower 3 bits in the flags field of sgx_epc_page struct to track EPC states in its life cycle and define an enum for possible states. More state(s) will be added later. Signed-off-by: Haitao Huang V3: - This is new in V3 to replace the bit mask based approach (requested by Jarkko) --- arch/x86/kernel/cpu/sgx/encl.c | 10 +++---- arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++-- arch/x86/kernel/cpu/sgx/main.c | 19 +++++++------ arch/x86/kernel/cpu/sgx/sgx.h | 50 +++++++++++++++++++++++++++++---- 4 files changed, 63 insertions(+), 22 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index edb8d8c1c229..e7319209fc4a 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -253,7 +253,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, if (IS_ERR(epc_page)) return ERR_CAST(epc_page); sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | - SGX_EPC_PAGE_RECLAIMER_UNTRACKED); + SGX_EPC_PAGE_UNRECLAIMABLE); } epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); @@ -262,7 +262,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, encl->secs_child_cnt++; sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | - SGX_EPC_PAGE_RECLAIMER_TRACKED); + SGX_EPC_PAGE_RECLAIMABLE); return entry; } @@ -382,7 +382,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, encl->secs_child_cnt++; sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | - SGX_EPC_PAGE_RECLAIMER_TRACKED); + SGX_EPC_PAGE_RECLAIMABLE); phys_addr = sgx_get_epc_phys_addr(epc_page); /* @@ -1242,7 +1242,7 @@ struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) return ERR_PTR(-EFAULT); } sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL | - SGX_EPC_PAGE_RECLAIMER_UNTRACKED); + SGX_EPC_PAGE_UNRECLAIMABLE); return epc_page; } @@ -1302,7 +1302,7 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page) { int ret; - WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); + WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_STATE_MASK); ret = __eremove(sgx_get_epc_virt_addr(page)); if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret)) diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 4e6d0c9d043a..4f95096c9786 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -115,7 +115,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) encl->attributes_mask = SGX_ATTR_UNPRIV_MASK; sgx_record_epc_page(encl->secs.epc_page, SGX_EPC_OWNER_ENCL_PAGE | - 
SGX_EPC_PAGE_RECLAIMER_UNTRACKED); + SGX_EPC_PAGE_UNRECLAIMABLE); /* Set only after completion, as encl->lock has not been taken. */ set_bit(SGX_ENCL_CREATED, &encl->flags); @@ -327,7 +327,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, } sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | - SGX_EPC_PAGE_RECLAIMER_TRACKED); + SGX_EPC_PAGE_RECLAIMABLE); mutex_unlock(&encl->lock); mmap_read_unlock(current->mm); return ret; @@ -982,7 +982,7 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl, mutex_lock(&encl->lock); sgx_record_epc_page(entry->epc_page, SGX_EPC_OWNER_ENCL_PAGE | - SGX_EPC_PAGE_RECLAIMER_TRACKED); + SGX_EPC_PAGE_RECLAIMABLE); } /* Change EPC type */ diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 9252728865fa..02c358f10383 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -319,7 +319,7 @@ static void sgx_reclaim_pages(void) /* The owner is freeing the page. No need to add the * page back to the list of reclaimable pages. */ - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + sgx_epc_page_reset_state(epc_page); } spin_unlock(&sgx_global_lru.lock); @@ -345,6 +345,7 @@ static void sgx_reclaim_pages(void) skip: spin_lock(&sgx_global_lru.lock); + sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIMABLE); list_add_tail(&epc_page->list, &sgx_global_lru.reclaimable); spin_unlock(&sgx_global_lru.lock); @@ -368,7 +369,7 @@ static void sgx_reclaim_pages(void) sgx_reclaimer_write(epc_page, &backing[i]); kref_put(&encl_page->encl->refcount, sgx_encl_release); - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + sgx_epc_page_reset_state(epc_page); sgx_free_epc_page(epc_page); } @@ -508,9 +509,9 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) { spin_lock(&sgx_global_lru.lock); - WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); + WARN_ON_ONCE(sgx_epc_page_reclaimable(page->flags)); page->flags |= flags; - if (flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) + if (sgx_epc_page_reclaimable(flags)) list_add_tail(&page->list, &sgx_global_lru.reclaimable); else list_add_tail(&page->list, &sgx_global_lru.unreclaimable); @@ -530,7 +531,7 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) int sgx_drop_epc_page(struct sgx_epc_page *page) { spin_lock(&sgx_global_lru.lock); - if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { + if (sgx_epc_page_reclaimable(page->flags)) { /* The page is being reclaimed. 
*/ if (list_empty(&page->list)) { spin_unlock(&sgx_global_lru.lock); @@ -538,7 +539,7 @@ int sgx_drop_epc_page(struct sgx_epc_page *page) } list_del(&page->list); - page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + sgx_epc_page_reset_state(page); } spin_unlock(&sgx_global_lru.lock); @@ -610,6 +611,8 @@ void sgx_free_epc_page(struct sgx_epc_page *page) struct sgx_epc_section *section = &sgx_epc_sections[page->section]; struct sgx_numa_node *node = section->node; + WARN_ON_ONCE(page->flags & (SGX_EPC_PAGE_STATE_MASK)); + spin_lock(&node->lock); page->encl_page = NULL; @@ -617,7 +620,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) list_add(&page->list, &node->sgx_poison_page_list); else list_add_tail(&page->list, &node->free_page_list); - page->flags = SGX_EPC_PAGE_IS_FREE; + page->flags = SGX_EPC_PAGE_FREE; spin_unlock(&node->lock); atomic_long_inc(&sgx_nr_free_pages); @@ -718,7 +721,7 @@ int arch_memory_failure(unsigned long pfn, int flags) * If the page is on a free list, move it to the per-node * poison page list. */ - if (page->flags & SGX_EPC_PAGE_IS_FREE) { + if (page->flags == SGX_EPC_PAGE_FREE) { list_move(&page->list, &node->sgx_poison_page_list); goto out; } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 9f780b2c4cfe..057905eba466 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -23,14 +23,36 @@ #define SGX_NR_LOW_PAGES 32 #define SGX_NR_HIGH_PAGES 64 -/* Pages, which are not tracked by the page reclaimer. */ -#define SGX_EPC_PAGE_RECLAIMER_UNTRACKED 0 +enum sgx_epc_page_state { + /* Not tracked by the reclaimer: + * Pages allocated for virtual EPC which are never tracked by the host + * reclaimer; pages just allocated from free list but not yet put in + * use; pages just reclaimed, but not yet returned to the free list. + * Becomes FREE after sgx_free_epc() + * Becomes RECLAIMABLE or UNRECLAIMABLE after sgx_record_epc() + */ + SGX_EPC_PAGE_NOT_TRACKED = 0, + + /* Page is in the free list, ready for allocation + * Becomes NOT_TRACKED after sgx_alloc_epc_page() + */ + SGX_EPC_PAGE_FREE = 1, + + /* Page is in use and tracked in a reclaimable LRU list + * Becomes NOT_TRACKED after sgx_drop_epc() + */ + SGX_EPC_PAGE_RECLAIMABLE = 2, + + /* Page is in use but tracked in an unreclaimable LRU list. These are + * only reclaimable when the whole enclave is OOM killed or the enclave + * is released, e.g., VA, SECS pages + * Becomes NOT_TRACKED after sgx_drop_epc() + */ + SGX_EPC_PAGE_UNRECLAIMABLE = 3, -/* Pages, which are being tracked by the page reclaimer. */ -#define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) +}; -/* Pages on free list */ -#define SGX_EPC_PAGE_IS_FREE BIT(1) +#define SGX_EPC_PAGE_STATE_MASK GENMASK(2, 0) /* flag for pages owned by a sgx_encl_page */ #define SGX_EPC_OWNER_ENCL_PAGE BIT(3) @@ -49,6 +71,22 @@ struct sgx_epc_page { struct list_head list; }; +static inline void sgx_epc_page_reset_state(struct sgx_epc_page *page) +{ + page->flags &= ~SGX_EPC_PAGE_STATE_MASK; +} + +static inline void sgx_epc_page_set_state(struct sgx_epc_page *page, unsigned long flags) +{ + page->flags &= ~SGX_EPC_PAGE_STATE_MASK; + page->flags |= (flags & SGX_EPC_PAGE_STATE_MASK); +} + +static inline bool sgx_epc_page_reclaimable(unsigned long flags) +{ + return SGX_EPC_PAGE_RECLAIMABLE == (flags & SGX_EPC_PAGE_STATE_MASK); +} + /* * Contains the tracking data for NUMA nodes having EPC pages. Most importantly, * the free page list local to the node is stored here. 
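As a quick aside before the next message in the series: the state tracking introduced above can be exercised outside the kernel. The following compilable userspace sketch (not part of the series) copies the enum values and the set/check helper logic from the diff; the mock struct, the 0x7 constant standing in for GENMASK(2, 0), and main() are illustrative scaffolding only.

/* Userspace mock of the EPC page state handling from this patch. */
#include <stdio.h>
#include <stdbool.h>

enum sgx_epc_page_state {
	SGX_EPC_PAGE_NOT_TRACKED = 0,
	SGX_EPC_PAGE_FREE = 1,
	SGX_EPC_PAGE_RECLAIMABLE = 2,
	SGX_EPC_PAGE_UNRECLAIMABLE = 3,
};

#define SGX_EPC_PAGE_STATE_MASK 0x7UL	/* GENMASK(2, 0) in the kernel */

struct mock_epc_page {
	unsigned long flags;	/* low 3 bits hold the state, higher bits hold owner flags */
};

/* Mirrors sgx_epc_page_set_state(): replace only the state bits. */
static void set_state(struct mock_epc_page *page, unsigned long state)
{
	page->flags &= ~SGX_EPC_PAGE_STATE_MASK;
	page->flags |= (state & SGX_EPC_PAGE_STATE_MASK);
}

/* Mirrors sgx_epc_page_reclaimable(). */
static bool reclaimable(unsigned long flags)
{
	return SGX_EPC_PAGE_RECLAIMABLE == (flags & SGX_EPC_PAGE_STATE_MASK);
}

int main(void)
{
	/* bit 3 stands in for the SGX_EPC_OWNER_ENCL_PAGE owner flag */
	struct mock_epc_page page = { .flags = 1UL << 3 };

	set_state(&page, SGX_EPC_PAGE_RECLAIMABLE);	/* as in sgx_record_epc_page() */
	printf("reclaimable: %d\n", reclaimable(page.flags));	/* prints 1 */

	set_state(&page, SGX_EPC_PAGE_NOT_TRACKED);	/* as in sgx_drop_epc_page() */
	printf("reclaimable: %d, owner bit kept: %d\n",
	       reclaimable(page.flags), !!(page.flags & (1UL << 3)));	/* prints 0, 1 */
	return 0;
}

The point the sketch makes is that state changes touch only the low 3 bits, so owner flags such as SGX_EPC_OWNER_ENCL_PAGE survive every transition.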
From patchwork Wed Jul 12 23:01:42 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119401
From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 08/28] x86/sgx: Introduce RECLAIM_IN_PROGRESS state Date: Wed, 12 Jul 2023 16:01:42 -0700 Message-Id: <20230712230202.47929-9-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258508708457758 X-GMAIL-MSGID: 1771258508708457758 From: Sean Christopherson When a page is being reclaimed from the page pool (sgx_global_lru), there is an intermediate stage where a page may have been identified as a candidate for reclaiming, but has not yet been reclaimed. Currently such pages are list_del_init()'d from the global LRU, and stored in a an array on stack. To prevent another thread from dropping the same page in the middle of reclaiming, sgx_drop_epc_page() checks for list_empty(&page->list). In future patches these pages need be list_move()'d into a temporary list that is shared with multiple cgroup reclaimers. so list_empty() should no longer be used for this purpose. Add a RECLAIM_IN_PROGRESS state to explicitly designate such intermediate state of EPC in the reclaiming process. Do not drop any page in this state in sgx_drop_epc_page(). Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - Extend the sgx_epc_page_state enum introduced earlier to replace the flag based approach. --- arch/x86/kernel/cpu/sgx/main.c | 21 ++++++++++----------- arch/x86/kernel/cpu/sgx/sgx.h | 16 ++++++++++++++++ 2 files changed, 26 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 02c358f10383..9eea9038758f 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -313,13 +313,15 @@ static void sgx_reclaim_pages(void) list_del_init(&epc_page->list); encl_page = epc_page->encl_page; - if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) + if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) { + sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIM_IN_PROGRESS); chunk[cnt++] = epc_page; - else + } else { /* The owner is freeing the page. No need to add the * page back to the list of reclaimable pages. */ sgx_epc_page_reset_state(epc_page); + } } spin_unlock(&sgx_global_lru.lock); @@ -531,16 +533,13 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) int sgx_drop_epc_page(struct sgx_epc_page *page) { spin_lock(&sgx_global_lru.lock); - if (sgx_epc_page_reclaimable(page->flags)) { - /* The page is being reclaimed. 
*/ - if (list_empty(&page->list)) { - spin_unlock(&sgx_global_lru.lock); - return -EBUSY; - } - - list_del(&page->list); - sgx_epc_page_reset_state(page); + if (sgx_epc_page_reclaim_in_progress(page->flags)) { + spin_unlock(&sgx_global_lru.lock); + return -EBUSY; } + + list_del(&page->list); + sgx_epc_page_reset_state(page); spin_unlock(&sgx_global_lru.lock); return 0; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 057905eba466..f26ed4c0d12f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -40,6 +40,8 @@ enum sgx_epc_page_state { /* Page is in use and tracked in a reclaimable LRU list * Becomes NOT_TRACKED after sgx_drop_epc() + * Becomes RECLAIM_IN_PROGRESS in sgx_reclaim_pages() when identified + * for reclaiming */ SGX_EPC_PAGE_RECLAIMABLE = 2, @@ -50,6 +52,14 @@ enum sgx_epc_page_state { */ SGX_EPC_PAGE_UNRECLAIMABLE = 3, + /* Page is being prepared for reclaimation, tracked in a temporary + * isolated list by the reclaimer. + * Changes in sgx_reclaim_pages() back to RECLAIMABLE if preparation + * fails for any reason. + * Becomes NOT_TRACKED if reclaimed successfully in sgx_reclaim_pages() + * and immediately sgx_free_epc() is called to make it FREE. + */ + SGX_EPC_PAGE_RECLAIM_IN_PROGRESS = 4, }; #define SGX_EPC_PAGE_STATE_MASK GENMASK(2, 0) @@ -82,6 +92,12 @@ static inline void sgx_epc_page_set_state(struct sgx_epc_page *page, unsigned lo page->flags |= (flags & SGX_EPC_PAGE_STATE_MASK); } +static inline bool sgx_epc_page_reclaim_in_progress(unsigned long flags) +{ + return SGX_EPC_PAGE_RECLAIM_IN_PROGRESS == (flags & + SGX_EPC_PAGE_STATE_MASK); +} + static inline bool sgx_epc_page_reclaimable(unsigned long flags) { return SGX_EPC_PAGE_RECLAIMABLE == (flags & SGX_EPC_PAGE_STATE_MASK); From patchwork Wed Jul 12 23:01:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119397 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1472243vqm; Wed, 12 Jul 2023 16:15:19 -0700 (PDT) X-Google-Smtp-Source: APBJJlEpxNQUdx9VnXsXC1UV8drSNCjGGztYAVDDNWan4M+KZI39P/1gcvYfBluc93a0ExW7xOhS X-Received: by 2002:a05:6a21:329b:b0:131:494a:1275 with SMTP id yt27-20020a056a21329b00b00131494a1275mr14646047pzb.36.1689203719677; Wed, 12 Jul 2023 16:15:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203719; cv=none; d=google.com; s=arc-20160816; b=Hz8rajRcSuGCMBPi5M4VPqEgkXYEpfz4HUY0UMcR6nn1bxDDWAdBfcFXa7vQYYP1bO 9+r3D8k+rFzFHZb8AP367NIxzfl24n2clb6TADhdsGHZbP5HqpjZc29ofCNpZlX+nY2K XevlKa7AF4FAC4dWiTmF+gC2elsAnEvDKRhuk1+Gv2an9MeYEaB/co8RKMTFec02mp8b KuVpV53h7vVa6wtrsgUwZFC8HyXcTOFE3bVgpKhlx1613OQPGZiaqSzDcU6jlXpM/j0i 0glRCMF+3YPpZYJgtSiQgh9pWrn9SfMDrNPEja2xk8CzMM/mIVMSjp8+ac0BIDP7Z86s M75g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=qJhJ5ws8EVqtLyYrHKmb7V3seMc+3H5CAkxGK88qoXA=; fh=sp9Fy9uDTUYbTQiOiWOfY01qY8ktdO/xQ5V81goPrjk=; b=G9tR0vXL8XMmeFslOv2odRA4H0x4vl7+f2BjrAHtFAYru3+nku/Ep9cf8e7FBEg7AG kEyuv6SEHQ/JUB3YTRkVR2aCB9fwbAUkzCASL1os6jyj4aX2bblgHUXIIfu5pciX7rr2 nNm09I+1ADREU1kMbf2koBsFSc493D/LkDSlWZHWjGK/Et6X4cFlDyPudAWgjeLVI1Zk 0quT0fLupBE3vwebZ7IX2qpGhss72bL9MeRb69oxij9AOFI6sAKUeBVAqKaoRNTumfdT hzJNZ1ADNNEFa+VrlpjLDP7m8KrLvP2T8r55dtoFxnsepXQtZYEuncAgm1maxAxLzzvg UqVQ== ARC-Authentication-Results: i=1; 
From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com, seanjc@google.com Subject: [PATCH v3 09/28] x86/sgx: Use a list to track to-be-reclaimed pages Date: Wed, 12 Jul 2023 16:01:43 -0700 Message-Id: <20230712230202.47929-10-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258479360512677 X-GMAIL-MSGID: 1771258479360512677 From: Kristen Carlson Accardi Change sgx_reclaim_pages() to use a list rather than an array for storing the epc_pages which will be reclaimed. This change is needed to transition to the LRU implementation for EPC cgroup support, which uses lists to store reclaimable and unreclaimable pages. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - Removed list wrappers --- arch/x86/kernel/cpu/sgx/main.c | 40 +++++++++++++++------------------- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 9eea9038758f..f3a3ed894616 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -294,12 +294,11 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, */ static void sgx_reclaim_pages(void) { - struct sgx_epc_page *chunk[SGX_NR_TO_SCAN]; struct sgx_backing backing[SGX_NR_TO_SCAN]; + struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; - struct sgx_epc_page *epc_page; pgoff_t page_index; - int cnt = 0; + LIST_HEAD(iso); int ret; int i; @@ -315,18 +314,22 @@ static void sgx_reclaim_pages(void) if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) { sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIM_IN_PROGRESS); - chunk[cnt++] = epc_page; + list_move_tail(&epc_page->list, &iso); } else { - /* The owner is freeing the page. No need to add the - * page back to the list of reclaimable pages. 
+ /* The owner is freeing the page, remove it from the + * LRU list */ sgx_epc_page_reset_state(epc_page); + list_del_init(&epc_page->list); } } spin_unlock(&sgx_global_lru.lock); - for (i = 0; i < cnt; i++) { - epc_page = chunk[i]; + if (list_empty(&iso)) + return; + + i = 0; + list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_page; if (!sgx_reclaimer_age(epc_page)) @@ -341,6 +344,7 @@ static void sgx_reclaim_pages(void) goto skip; } + i++; encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED; mutex_unlock(&encl_page->encl->lock); continue; @@ -348,27 +352,19 @@ static void sgx_reclaim_pages(void) skip: spin_lock(&sgx_global_lru.lock); sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIMABLE); - list_add_tail(&epc_page->list, &sgx_global_lru.reclaimable); + list_move_tail(&epc_page->list, &sgx_global_lru.reclaimable); spin_unlock(&sgx_global_lru.lock); kref_put(&encl_page->encl->refcount, sgx_encl_release); - - chunk[i] = NULL; - } - - for (i = 0; i < cnt; i++) { - epc_page = chunk[i]; - if (epc_page) - sgx_reclaimer_block(epc_page); } - for (i = 0; i < cnt; i++) { - epc_page = chunk[i]; - if (!epc_page) - continue; + list_for_each_entry(epc_page, &iso, list) + sgx_reclaimer_block(epc_page); + i = 0; + list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_page; - sgx_reclaimer_write(epc_page, &backing[i]); + sgx_reclaimer_write(epc_page, &backing[i++]); kref_put(&encl_page->encl->refcount, sgx_encl_release); sgx_epc_page_reset_state(epc_page); From patchwork Wed Jul 12 23:01:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119383 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1470769vqm; Wed, 12 Jul 2023 16:12:23 -0700 (PDT) X-Google-Smtp-Source: APBJJlE6qRUj1TPE86O315YidC0krpwONBci4rMVHlk4XFw9XPUQcFY84Waty/cVCIePGybv//UE X-Received: by 2002:a05:6a00:2444:b0:67e:ca79:36f0 with SMTP id d4-20020a056a00244400b0067eca7936f0mr137059pfj.0.1689203542635; Wed, 12 Jul 2023 16:12:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203542; cv=none; d=google.com; s=arc-20160816; b=kDBwtcTFnUswf3OQ9N/F1ni50FT8HNeCNQOwVCWrts1cPwWuOHcbZ1gEUI+nAXqY7A cyoWjufS+vriawTfHcaS5ED4k8ZAHThOdChZdF3cvRoffHbSrwmzScjbUcUhoLAkjQD5 8lzjPYzJupQVxu4OnEiQWjjQUpWq5a3mR1mpnqGvxNKRYKjWA3Geg/dnyK8FqFgIrZ83 LDTlhshTCl37CgcL1lyjtJKRVACM+yoAWkc26vjwYQ+s0nL5x3M3VZnv54S/AlRTmZl3 HVpwaBWub2wauZGPuXcC4aVlSNuCRwrqckyzC/Yd+8hnzr41HHqCOQxj40RQdS8+7Mr5 UmoQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=3XU5/QTFG50lWWWrUFzJVR5yJJyA0nZHvMFm9A+Z1PE=; fh=yb4uG0y4asZWHqlPLNfXLnkT1yMR9ApbW5Mi8ElCAYw=; b=HNqXzTplRblIvbV+q5wWaW4//3uq06R3qe9SiJlgo0GvOoSgPpO0CO8BIndi8w2/CR MvMNsf/6ILQyE623Waw+/UxKqjyKydoXYfjCTk0FrwYAyfxKeihquKtL8QCNYrNntwG1 eXwOz/YcgXnLjX/CFo5KCZsbrAxA8j7LOZYCDRZ9DEjWXEAp5mZJ/KULuXcS0Y02Ks3I Agoyv6/9+W/0ZiCwOy78MbmhLD5N5QdFuDFNNsI0kRNyF1ssuTbp47U8BO/omPVggDSw dcJUNUwxWx2FTX0OWb16SXvn7ZluGI592GEB+SOeDZcgYyRyAUsNOTUlVtsjhuF3N/qL 7W+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ggPEXt2r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE 
From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 10/28] x86/sgx: Allow reclaiming up to 32 pages, but scan 16 by default Date: Wed, 12 Jul 2023 16:01:44 -0700 Message-Id: <20230712230202.47929-11-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258293914824175 X-GMAIL-MSGID: 1771258293914824175 From: Sean Christopherson Modify sgx_reclaim_pages() to take a parameter that specifies the number of pages to scan for reclaiming. Specify a max value of 32, but scan 16 in the usual case. This allows the number of pages sgx_reclaim_pages() scans to be specified by the caller, and adjusted in future patches. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index f3a3ed894616..cd5e5517866a 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -17,6 +17,10 @@ #include "driver.h" #include "encl.h" #include "encls.h" +/** + * Maximum number of pages to scan for reclaiming. + */ +#define SGX_NR_TO_SCAN_MAX 32 struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; @@ -279,7 +283,10 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, mutex_unlock(&encl->lock); } -/* +/** + * sgx_reclaim_pages() - Reclaim EPC pages from the consumers + * @nr_to_scan: Number of EPC pages to scan for reclaim + * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have * been accessed since the last scan. Move those pages to the tail of active @@ -292,9 +299,9 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
*/ -static void sgx_reclaim_pages(void) +static void sgx_reclaim_pages(int nr_to_scan) { - struct sgx_backing backing[SGX_NR_TO_SCAN]; + struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; pgoff_t page_index; @@ -332,7 +339,7 @@ static void sgx_reclaim_pages(void) list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_page; - if (!sgx_reclaimer_age(epc_page)) + if (i == SGX_NR_TO_SCAN_MAX || !sgx_reclaimer_age(epc_page)) goto skip; page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); @@ -387,7 +394,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - sgx_reclaim_pages(); + sgx_reclaim_pages(SGX_NR_TO_SCAN); } static int ksgxd(void *p) @@ -410,7 +417,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_pages(); + sgx_reclaim_pages(SGX_NR_TO_SCAN); cond_resched(); } @@ -582,7 +589,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_pages(); + sgx_reclaim_pages(SGX_NR_TO_SCAN); cond_resched(); } From patchwork Wed Jul 12 23:01:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119406 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1477645vqm; Wed, 12 Jul 2023 16:27:17 -0700 (PDT) X-Google-Smtp-Source: APBJJlEu8ynI/FDajo8rXyFEE1J7tnnQkGinw43Are8KQrqA3wqEMlYnMGuY3s2OZOx9SZ5m5d93 X-Received: by 2002:a17:907:3f0b:b0:98e:2423:708 with SMTP id hq11-20020a1709073f0b00b0098e24230708mr26423611ejc.62.1689204437642; Wed, 12 Jul 2023 16:27:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689204437; cv=none; d=google.com; s=arc-20160816; b=t6Bt7ss0plmqOpbbX+PK23Pj2cwu6gvCOf191PIBYSCrlJx+Np+62OYE9WvVMl6WpV pHKz2tKok3dFlu2WVtcJXOr3Zd11jWeEGXNihcMK4a/oeBzcKS8Jh/sx3X6CGGlSeN1V uTxf5aESYLaXE6vR3iYuKd3D8XI8J38g6P7lNbejaiOFvkGApGaBMsVGXOflfxG65Wo4 Z9C73i9Q0gsCJaoR/jrNLWSmkSQGVmEiFwGOyXh/b0kjwBDbrOGoxs5fHOSdebtYQp3J vhsLF9Yeefg/TQ4MMAaL0ACiodkVg3mxOpoR0sIR3Y3PetFfgNiyRR7tVfcAo08gyUXn cIhg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=jftUwg68c2zBXhtc+RhBzOUc4qvlZMeAdt3SnrL0ZEI=; fh=yb4uG0y4asZWHqlPLNfXLnkT1yMR9ApbW5Mi8ElCAYw=; b=hjUMgPZecj74HseGAPwmuui+aAIDI6QqpcGSnCobiizvtEFm/UE9u177kuGG0mmhNy pTa2lvE8oWn3VaDqd9BqdX0S6sMPwYmi3A2229RfAqH/sPIXMw7g4UrG0HD3gxjU+vxV cts/YMh5QeJwr92sHApj4LfYGwTXQP/7KvTzEkWXOdS83iFLzLlI9Nh0VjJDoeO3h/4/ Ny6lDbjer1dB8lcrz+iFyac2N0l2C8CvgI3RkgM9oHVI81s6eZqy3sG/8O6x0uTriT+3 juBaKFviQEW6gylrCGAqkE9wK8bUd6v33xqIC3A0B5p+fj+pl1u5I4u8r+86COc8/gMn ZAuw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=JB7IdPOr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 11/28] x85/sgx: Return the number of EPC pages that were successfully reclaimed Date: Wed, 12 Jul 2023 16:01:45 -0700 Message-Id: <20230712230202.47929-12-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771259232772695822 X-GMAIL-MSGID: 1771259232772695822 From: Sean Christopherson Return the number of reclaimed pages from sgx_reclaim_pages(), the EPC cgroup will use the result to track the success rate of its reclaim calls, e.g. to escalate to a more forceful reclaiming mode if necessary. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index cd5e5517866a..4fc931156972 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -299,15 +299,15 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
*/ -static void sgx_reclaim_pages(int nr_to_scan) +static size_t sgx_reclaim_pages(size_t nr_to_scan) { struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; pgoff_t page_index; LIST_HEAD(iso); - int ret; - int i; + size_t ret; + size_t i; spin_lock(&sgx_global_lru.lock); for (i = 0; i < SGX_NR_TO_SCAN; i++) { @@ -333,7 +333,7 @@ static void sgx_reclaim_pages(int nr_to_scan) spin_unlock(&sgx_global_lru.lock); if (list_empty(&iso)) - return; + return 0; i = 0; list_for_each_entry_safe(epc_page, tmp, &iso, list) { @@ -378,6 +378,7 @@ static void sgx_reclaim_pages(int nr_to_scan) sgx_free_epc_page(epc_page); } + return i; } static bool sgx_should_reclaim(unsigned long watermark) From patchwork Wed Jul 12 23:01:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119398 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1472311vqm; Wed, 12 Jul 2023 16:15:28 -0700 (PDT) X-Google-Smtp-Source: APBJJlGBAPrxA9h8Ez6P6x1F7ZCoNmFQDOHBw86uj2d+PJvG+VF/1V12Eh6XdFz0IEmteEvokCRz X-Received: by 2002:a05:6a00:21cc:b0:676:ad06:29d9 with SMTP id t12-20020a056a0021cc00b00676ad0629d9mr82171pfj.17.1689203728174; Wed, 12 Jul 2023 16:15:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203728; cv=none; d=google.com; s=arc-20160816; b=NLB5uwbvLEjbX9VrPcUgLZN+wxIlsrqMDlZpUzA3OZD08UXQwMfKVJq/Lytj8DPnAc AWHLHAh5MXWWGZPP2kyPvy2VZyWz0rVuBODlwbSjhKTvvS2qlav4DKaW+Lt0S3Qz94CU dIn9OhrGrkDlFmUDxWDU3UCJErHboAF5XHivUfvyPJqOfT+/JTSrrl+7ZJhvsnviMAJ8 k+q1O0PdMzzLmTbPjP8YiLDLMjE/IzwNdH2xTqphW8TvS0TO99TD1klFvqNWVFuWLt4l ZWgrXLJLRfTCTkKFbexawJ/3jPpQAXc+7tnNvgFNw29Uk7dweUmgJD6L8g/eBLUei4QL hGPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=hxGgBI17qKG0MQucQ49ZS+yu8PQDE8QfcT5wAYHTak8=; fh=yb4uG0y4asZWHqlPLNfXLnkT1yMR9ApbW5Mi8ElCAYw=; b=GjkerGNKKqzesSN+ld5bIEfoTq2wG3WkEfvBcBNfinHkMBtDLUXayfS+DM7xaTLxc5 Rvr12eKSfRyiQzd/tyqDsiMRbLzRvUrwuRJ+POzzeZnHziMT64qZNAk+I/TRQeEas0NP q63YxtR/dQ8nYwuvgGxYK3EF8yrNXEOjufqg7l4ZH0dbKx4FPchzWTv3jtyq6RO9ToSg PbfcfC3N/euplnkCaXHZIPMO5TDybYkVrR86MDO2HsWiKy6k03A0Wo8AQsER2b2XAti3 M82/HdetYj9t4433/lxfgoLkOt9PJlEvC8Esy1MI8R3wL4nlc8UG09Pwqpryb0nFQib3 BVEQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BCrZqv8D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 12/28] x86/sgx: Add option to ignore age of page during EPC reclaim Date: Wed, 12 Jul 2023 16:01:46 -0700 Message-Id: <20230712230202.47929-13-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258488520086069 X-GMAIL-MSGID: 1771258488520086069 From: Sean Christopherson Add a flag to sgx_reclaim_pages() to instruct it to ignore the age of page, i.e. reclaim the page even if it's young. The EPC cgroup will use the flag to enforce its limits by draining the reclaimable lists before resorting to other measures, e.g. forcefully reclaimable "unreclaimable" pages by killing enclaves. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 44 +++++++++++++++++++++------------- 1 file changed, 28 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 4fc931156972..ea0698db8698 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -34,6 +34,11 @@ static DEFINE_XARRAY(sgx_epc_address_space); */ static struct sgx_epc_lru_lists sgx_global_lru; +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page) +{ + return &sgx_global_lru; +} + static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); /* Nodes with one or more EPC sections. */ @@ -286,6 +291,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, /** * sgx_reclaim_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim + * @ignore_age: Reclaim a page even if it is young * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have @@ -299,11 +305,12 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
*/ -static size_t sgx_reclaim_pages(size_t nr_to_scan) +static size_t sgx_reclaim_pages(size_t nr_to_scan, bool ignore_age) { struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; + struct sgx_epc_lru_lists *lru; pgoff_t page_index; LIST_HEAD(iso); size_t ret; @@ -339,7 +346,8 @@ static size_t sgx_reclaim_pages(size_t nr_to_scan) list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_page; - if (i == SGX_NR_TO_SCAN_MAX || !sgx_reclaimer_age(epc_page)) + if (i == SGX_NR_TO_SCAN_MAX || + (!ignore_age && !sgx_reclaimer_age(epc_page))) goto skip; page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); @@ -357,10 +365,11 @@ static size_t sgx_reclaim_pages(size_t nr_to_scan) continue; skip: - spin_lock(&sgx_global_lru.lock); + lru = sgx_lru_lists(epc_page); + spin_lock(&lru->lock); sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIMABLE); - list_move_tail(&epc_page->list, &sgx_global_lru.reclaimable); - spin_unlock(&sgx_global_lru.lock); + list_move_tail(&epc_page->list, &lru->reclaimable); + spin_unlock(&lru->lock); kref_put(&encl_page->encl->refcount, sgx_encl_release); } @@ -395,7 +404,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - sgx_reclaim_pages(SGX_NR_TO_SCAN); + sgx_reclaim_pages(SGX_NR_TO_SCAN, false); } static int ksgxd(void *p) @@ -418,7 +427,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_pages(SGX_NR_TO_SCAN); + sgx_reclaim_pages(SGX_NR_TO_SCAN, false); cond_resched(); } @@ -514,14 +523,16 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) */ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) { - spin_lock(&sgx_global_lru.lock); + struct sgx_epc_lru_lists *lru = sgx_lru_lists(page); + + spin_lock(&lru->lock); WARN_ON_ONCE(sgx_epc_page_reclaimable(page->flags)); page->flags |= flags; if (sgx_epc_page_reclaimable(flags)) - list_add_tail(&page->list, &sgx_global_lru.reclaimable); + list_add_tail(&page->list, &lru->reclaimable); else - list_add_tail(&page->list, &sgx_global_lru.unreclaimable); - spin_unlock(&sgx_global_lru.lock); + list_add_tail(&page->list, &lru->unreclaimable); + spin_unlock(&lru->lock); } /** @@ -536,15 +547,16 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) */ int sgx_drop_epc_page(struct sgx_epc_page *page) { - spin_lock(&sgx_global_lru.lock); + struct sgx_epc_lru_lists *lru = sgx_lru_lists(page); + + spin_lock(&lru->lock); if (sgx_epc_page_reclaim_in_progress(page->flags)) { - spin_unlock(&sgx_global_lru.lock); + spin_unlock(&lru->lock); return -EBUSY; } - list_del(&page->list); sgx_epc_page_reset_state(page); - spin_unlock(&sgx_global_lru.lock); + spin_unlock(&lru->lock); return 0; } @@ -590,7 +602,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_pages(SGX_NR_TO_SCAN); + sgx_reclaim_pages(SGX_NR_TO_SCAN, false); cond_resched(); } From patchwork Wed Jul 12 23:01:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119378 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1467114vqm; Wed, 12 Jul 2023 16:05:11 -0700 (PDT) X-Google-Smtp-Source: APBJJlEs5jUOtkvAvClAl4130I6mIYXWdRgVOFtlHPiyjhp2zKnDpqeRz/ueG06YR8e2QqrJSOQa X-Received: by 
E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428774122" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:12 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338631" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338631" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:11 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 13/28] x86/sgx: Prepare for multiple LRUs Date: Wed, 12 Jul 2023 16:01:47 -0700 Message-Id: <20230712230202.47929-14-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771257842174012295 X-GMAIL-MSGID: 1771257842174012295 From: Sean Christopherson Add sgx_can_reclaim() wrapper so that in a subsequent patch, multiple LRUs can be used cleanly. 
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index ea0698db8698..a829555b9675 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -390,10 +390,15 @@ static size_t sgx_reclaim_pages(size_t nr_to_scan, bool ignore_age) return i; } +static bool sgx_can_reclaim(void) +{ + return !list_empty(&sgx_global_lru.reclaimable); +} + static bool sgx_should_reclaim(unsigned long watermark) { return atomic_long_read(&sgx_nr_free_pages) < watermark && - !list_empty(&sgx_global_lru.reclaimable); + sgx_can_reclaim(); } /* @@ -589,7 +594,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - if (list_empty(&sgx_global_lru.reclaimable)) + if (!sgx_can_reclaim()) return ERR_PTR(-ENOMEM); if (!reclaim) { From patchwork Wed Jul 12 23:01:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119403 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1476651vqm; Wed, 12 Jul 2023 16:25:00 -0700 (PDT) X-Google-Smtp-Source: APBJJlFscUo5TGPIKFa0rsJTllFQ31ujC4E+vZCDsq0a+3aVoNcf/8q9C9vB5cw8SsAPfDNHz1OE X-Received: by 2002:a05:6000:cf:b0:313:f38f:1f4e with SMTP id q15-20020a05600000cf00b00313f38f1f4emr16611002wrx.27.1689204300296; Wed, 12 Jul 2023 16:25:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689204300; cv=none; d=google.com; s=arc-20160816; b=tZfMuV0munSSoMmpY3dGl4thF+aFI6IuLscTN64a+Smf4ZWZcVTPaFSXS5wAJfy02y nj6O7nELkUFWOiMB9lzT8WWWf1uZ7twuec8vv/bfDNfGow6Cih1gBkhQjLoEO9qdmNW3 zSkLWItG/v8xIoHlvmV0A0Ghc8j5e/z4Bm0jOQ2vs2YSLdXDpA6xBCzRVrUM9nmDzjAP aoHF+THM4UX/l8z2kCGDaujlnHuwtIsAI5zZo99RHZA66ymTkuj9nuSbBkqvaIgxuGJ1 /M3suFPn8MFIvWOBw+XSPrPrkXuz+sDylwUic8ZY4qc98AfKAD8gz3HMlWrLpjkEZGUO BRiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=IkFpFcTS704jS7bHFi/VmwufPlO0b/+72xFchyAT2Sc=; fh=yb4uG0y4asZWHqlPLNfXLnkT1yMR9ApbW5Mi8ElCAYw=; b=vstIdTnEb8kcIMXPMuRC1HuiTswDAi5xEbsqh+6viY6/n1NHWmCK8L1kKJEfeM3qU6 FwE8kJh9se6GMPe26FoIKVG2C62drdv5Ly5e7SRtcyQXkiWSw9s/85utCgw6NhIY93au 7N+AgvcZXE5mSU5z6qklNpxpZCqbwqSaI7aoiro2Jvol+OYwH7aHZDVAeZRgfDJ/WCfy 13D6llF9xezlOODYo2sZuXPBGnWBlAXuWPMXI8TopF1mK47tK+hPFN4cmIO0mbi7+kWA 8MNy1EPSH0wPG9KdMfPCQ/DxNnF4CbDjdxF1EqJoxOkO74emWDpZ/wqEHRZnxVHvnmu9 VsYg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=bW2xzc32; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 14/28] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Date: Wed, 12 Jul 2023 16:01:48 -0700 Message-Id: <20230712230202.47929-15-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771259088351887810 X-GMAIL-MSGID: 1771259088351887810 From: Sean Christopherson Expose the top-level reclaim function as sgx_reclaim_epc_pages() for use by the upcoming EPC cgroup, which will initiate reclaim to enforce changes to high/max limits. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 10 +++++----- arch/x86/kernel/cpu/sgx/sgx.h | 1 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index a829555b9675..e9c9e0d97300 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -289,7 +289,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, } /** - * sgx_reclaim_pages() - Reclaim EPC pages from the consumers + * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim * @ignore_age: Reclaim a page even if it is young * @@ -305,7 +305,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
*/ -static size_t sgx_reclaim_pages(size_t nr_to_scan, bool ignore_age) +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) { struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; struct sgx_epc_page *epc_page, *tmp; @@ -409,7 +409,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - sgx_reclaim_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); } static int ksgxd(void *p) @@ -432,7 +432,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); cond_resched(); } @@ -607,7 +607,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); cond_resched(); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index f26ed4c0d12f..98d3b15341b1 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -175,6 +175,7 @@ void sgx_reclaim_direct(void); void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags); int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age); void sgx_ipi_cb(void *info); From patchwork Wed Jul 12 23:01:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119379 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1467220vqm; Wed, 12 Jul 2023 16:05:20 -0700 (PDT) X-Google-Smtp-Source: APBJJlF77G80ZRhhP8CyMHq7YwUqx2wbqEtslmifaqCT1lc9twoS0Q/UKRf8jMF24ag9QebNzjHN X-Received: by 2002:a17:906:c417:b0:991:fef4:bb7 with SMTP id u23-20020a170906c41700b00991fef40bb7mr16752167ejz.73.1689203120308; Wed, 12 Jul 2023 16:05:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203120; cv=none; d=google.com; s=arc-20160816; b=g/0hUPSyQrwd1OivS0Me6mmzmxn2KlUov2dYkG1Nn4CDI2fk5guQLW2ikUcQLafMI5 WsfjAo1s3/GER3RfGyVS8jY2KY4EblcddVqd5mUQgC2MlmMI4SBtyNNNR8mPrg1wtRTf kt5tif9qf3Rqo6wn6ha9H1c6olnZtEIpa2vBIrobFEZVvDDdfDqrzHobA7CkrGA6cVEX zw1tqsg6Rmfq8/4z965dFj5pbHMWrrq17R1IL0hamXytdnfFH24sD/v+5DrxG6e+hRII fKqIfBjS+mXcHRS51Z3LHw/EFYaek3WAbzU423dKCarDV5j/R3x1Pom6DlmazyrwUUAt wCfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=izIrxpEh7D09X6AMw7uVw/no2C2TcBoyc6tpPT12GkE=; fh=yb4uG0y4asZWHqlPLNfXLnkT1yMR9ApbW5Mi8ElCAYw=; b=cDHJqDqOHYYXGBR9i+RSI4nHuQL6+H55wM7fBbp0Gd7EHlW8nCgngsKEhwVQHfDnnM aByB+ET6DCysxKl2Pm1OF5mLu7ahh+VbSs0WOUFrBI49eEEHlRFCgK5stIlVoXDiKgke /cJqsQHficEZ+B9INPfKT66pdp8RD/idF9XSYSb/mU2j92tGmxObN6T1hJQ9H8PKzD2H G7EAreQgL0y97dmz0zwcIhP/15tEkMxQeHloVzIc+c3W90qFwS9iYB5hfH2W1BBILyl7 Smw3EdlhMG5QM5TuewmSEKt2ZxAW3TfUML12QCMp1aSubv3oW0Dbfzaas4LzmASN5K0G tylg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=agrHstpK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from 
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 15/28] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Date: Wed, 12 Jul 2023 16:01:49 -0700 Message-Id: <20230712230202.47929-16-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771257851260646219 X-GMAIL-MSGID: 1771257851260646219 From: Sean Christopherson Move the isolation loop into a helper, sgx_isolate_pages(), in preparation for existence of multiple LRUs. Expose the helper to other SGX code so that it can be called from the EPC cgroup code, e.g. to isolate pages from a single cgroup LRU. Exposing the isolation loop allows the cgroup iteration logic to be wholly encapsulated within the cgroup code. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 60 +++++++++++++++++++++------------- arch/x86/kernel/cpu/sgx/sgx.h | 2 ++ 2 files changed, 40 insertions(+), 22 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index e9c9e0d97300..883470062514 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -288,6 +288,43 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, mutex_unlock(&encl->lock); } +/** + * sgx_isolate_epc_pages() - Isolate pages from an LRU for reclaim + * @lru: LRU from which to reclaim + * @nr_to_scan: Number of pages to scan for reclaim + * @dst: Destination list to hold the isolated pages + */ +void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, size_t nr_to_scan, + struct list_head *dst) +{ + struct sgx_encl_page *encl_page; + struct sgx_epc_page *epc_page; + + spin_lock(&lru->lock); + for (; nr_to_scan > 0; --nr_to_scan) { + epc_page = list_first_entry_or_null(&lru->reclaimable, struct sgx_epc_page, list); + if (!epc_page) + break; + + encl_page = epc_page->encl_page; + + if (WARN_ON_ONCE(!(epc_page->flags & SGX_EPC_OWNER_ENCL_PAGE))) + continue; + + if (kref_get_unless_zero(&encl_page->encl->refcount)) { + sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIM_IN_PROGRESS); + list_move_tail(&epc_page->list, dst); + } else { + /* The owner is freeing the page, remove it from the + * LRU list + */ + sgx_epc_page_reset_state(epc_page); + list_del_init(&epc_page->list); + } + } + spin_unlock(&lru->lock); +} + /** * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim @@ -316,28 +353,7 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) size_t ret; size_t i; - spin_lock(&sgx_global_lru.lock); - for (i = 0; i < SGX_NR_TO_SCAN; i++) { - epc_page = list_first_entry_or_null(&sgx_global_lru.reclaimable, - struct sgx_epc_page, list); - if (!epc_page) - break; - - list_del_init(&epc_page->list); - encl_page = 
epc_page->encl_page; - - if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) { - sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIM_IN_PROGRESS); - list_move_tail(&epc_page->list, &iso); - } else { - /* The owner is freeing the page, remove it from the - * LRU list - */ - sgx_epc_page_reset_state(epc_page); - list_del_init(&epc_page->list); - } - } - spin_unlock(&sgx_global_lru.lock); + sgx_isolate_epc_pages(&sgx_global_lru, nr_to_scan, &iso); if (list_empty(&iso)) return 0; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 98d3b15341b1..25db815f5add 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -176,6 +176,8 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags); int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age); +void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, size_t nr_to_scan, + struct list_head *dst); void sgx_ipi_cb(void *info); From patchwork Wed Jul 12 23:01:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119382 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1468042vqm; Wed, 12 Jul 2023 16:06:54 -0700 (PDT) X-Google-Smtp-Source: APBJJlGvmfIDWHlkiozlnCj138NU1eRHF1YFMCHF0cuuLp4/dIV9ML6f3N5vaglbHQueXrysoZY7 X-Received: by 2002:adf:ec86:0:b0:314:8d:7eb5 with SMTP id z6-20020adfec86000000b00314008d7eb5mr20077679wrn.29.1689203214066; Wed, 12 Jul 2023 16:06:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203214; cv=none; d=google.com; s=arc-20160816; b=C/cuftpzZyYGER9e4yLB5maBicSRJuRMnyiEXyX0NLpxreVuTGv9qGcn9EMMHrDpC5 Vg8c5HyJxJpRQB/J2HCYnATpJVWSG8UwAqICf8UR62JCm1+Zx9h+Uj33FDIlexY/dfF5 kMKJt4N7wKUuFlBUYMEEjcbfNlQe6nJhaQz9Qws65mvvvLeXTgSMfOPu84bQgxub/ABx Bw1Qz0IWgKD9NUcpfVAFr/0SkQh5GdvN1/3zahWbSORw8VGADK7oWRHvH10klxohu/PJ CfhNNj31/hW6sXV2xWn+bOgi+DC4RgRxuTqrfbzs8PNEFJclfl5Er5SNwRiMjVvsaYcH oGig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=OSTNfxBaBNTuADxBvIp+1fL8LFcOZTkKoF1iEBwdhZA=; fh=yb4uG0y4asZWHqlPLNfXLnkT1yMR9ApbW5Mi8ElCAYw=; b=YmpjVwUW/FmVVS0KglUOiuVZVdyzEVa6ySb4f8mu/aQJIC2daH8h+zV5KfHgD1NcuG +izqy1CdDHaVnWHoNK13Xn1wSjAvao+ANbcxa8XILs2uxDheAQmutIX9glcEHWs660TS U9pJVa8gBTZsbH+B5KPn6a1H1ABgEAMkCjC5swj7vK7SWQPebYwjmHGl9k/qSjGAes7m TK1PuDWF0kFtpszIufhfDB8oOEa6+dLLWyRP3gVFZcskA6pCCveInbKT7ZArWo6mMxIx CRwvMQjmuCrohFJJPCbkEKoP/+j+VL8bxDFyZh/rL/ljcCXVpYYmavOnw0JUoHSFa29B xRfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=fRB9Wp3X; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H.
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Sean Christopherson , zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 16/28] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Date: Wed, 12 Jul 2023 16:01:50 -0700 Message-Id: <20230712230202.47929-17-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771257949445928029 X-GMAIL-MSGID: 1771257949445928029 From: Sean Christopherson Introduce the OOM path for killing an enclave with the reclaimer is no longer able to reclaim enough EPC pages. Find a victim enclave, which will be an enclave with EPC pages remaining that are not accessible to the reclaimer ("unreclaimable"). Once a victim is identified, mark the enclave as OOM and zap the enclaves entire page range, and drain all mm references in encl->mm_list. Block allocating any EPC pages in #PF handler, or reloading any pages in all paths, or creating any new mappings. The OOM killing path may race with the reclaimers: in some cases, the victim enclave is in the process of reclaiming the last EPC pages when OOM happens, that is, all pages other than SECS and VA pages are in RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to the enclave backing, VA pages as well as SECS. So the OOM killer does not directly release those enclave resources, instead, it lets all reclaiming in progress to finish, and relies (as currently done) on kref_put on encl->refcount to trigger sgx_encl_release() to do the final cleanup. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: - Rebased to use the new VMA_ITERATOR to zap VMAs. - Fixed the racing cases by blocking new page allocation/mapping and reloading when enclave is marked for OOM. And do not release any enclave resources other than draining mm_list entries, and let pages in RECLAIMING_IN_PROGRESS to be reaped by reclaimers. - Due to above changes, also removed the no-longer needed encl->lock in the OOM path which was causing deadlocks reported by the lock prover. --- arch/x86/kernel/cpu/sgx/driver.c | 27 +----- arch/x86/kernel/cpu/sgx/encl.c | 48 ++++++++++- arch/x86/kernel/cpu/sgx/encl.h | 2 + arch/x86/kernel/cpu/sgx/ioctl.c | 9 ++ arch/x86/kernel/cpu/sgx/main.c | 140 +++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/sgx.h | 1 + 6 files changed, 200 insertions(+), 27 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index 262f5fb18d74..ff42d649c7b6 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -44,7 +44,6 @@ static int sgx_open(struct inode *inode, struct file *file) static int sgx_release(struct inode *inode, struct file *file) { struct sgx_encl *encl = file->private_data; - struct sgx_encl_mm *encl_mm; /* * Drain the remaining mm_list entries. 
At this point the list contains @@ -52,31 +51,7 @@ static int sgx_release(struct inode *inode, struct file *file) * not exited yet. The processes, which have exited, are gone from the * list by sgx_mmu_notifier_release(). */ - for ( ; ; ) { - spin_lock(&encl->mm_lock); - - if (list_empty(&encl->mm_list)) { - encl_mm = NULL; - } else { - encl_mm = list_first_entry(&encl->mm_list, - struct sgx_encl_mm, list); - list_del_rcu(&encl_mm->list); - } - - spin_unlock(&encl->mm_lock); - - /* The enclave is no longer mapped by any mm. */ - if (!encl_mm) - break; - - synchronize_srcu(&encl->srcu); - mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm); - kfree(encl_mm); - - /* 'encl_mm' is gone, put encl_mm->encl reference: */ - kref_put(&encl->refcount, sgx_encl_release); - } - + sgx_encl_mm_drain(encl); kref_put(&encl->refcount, sgx_encl_release); return 0; } diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index e7319209fc4a..c321c848baa9 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -430,6 +430,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) if (unlikely(!encl)) return VM_FAULT_SIGBUS; + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + return VM_FAULT_SIGBUS; + /* * The page_array keeps track of all enclave pages, whether they * are swapped out or not. If there is no entry for this page and @@ -628,7 +631,8 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, if (!encl) return -EFAULT; - if (!test_bit(SGX_ENCL_DEBUG, &encl->flags)) + if (!test_bit(SGX_ENCL_DEBUG, &encl->flags) || + test_bit(SGX_ENCL_OOM, &encl->flags)) return -EFAULT; for (i = 0; i < len; i += cnt) { @@ -753,6 +757,45 @@ void sgx_encl_release(struct kref *ref) kfree(encl); } +/** + * sgx_encl_mm_drain - drain all mm_list entries + * @encl: address of the sgx_encl to drain + * + * Used during oom kill to empty the mm_list entries after they have been + * zapped. Or used by sgx_release to drain the remaining mm_list entries when + * the enclave fd is closing. After this call, sgx_encl_release will be called + * with kref_put. + */ +void sgx_encl_mm_drain(struct sgx_encl *encl) +{ + struct sgx_encl_mm *encl_mm; + + for ( ; ; ) { + spin_lock(&encl->mm_lock); + + if (list_empty(&encl->mm_list)) { + encl_mm = NULL; + } else { + encl_mm = list_first_entry(&encl->mm_list, + struct sgx_encl_mm, list); + list_del_rcu(&encl_mm->list); + } + + spin_unlock(&encl->mm_lock); + + /* The enclave is no longer mapped by any mm. */ + if (!encl_mm) + break; + + synchronize_srcu(&encl->srcu); + mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm); + kfree(encl_mm); + + /* 'encl_mm' is gone, put encl_mm->encl reference: */ + kref_put(&encl->refcount, sgx_encl_release); + } +} + /* * 'mm' is exiting and no longer needs mmu notifications. */ @@ -822,6 +865,9 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm) struct sgx_encl_mm *encl_mm; int ret; + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + return -ENOMEM; + /* * Even though a single enclave may be mapped into an mm more than once, * each 'mm' only appears once on encl->mm_list. 
This is guaranteed by diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index 831d63f80f5a..47792fb00cee 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -39,6 +39,7 @@ enum sgx_encl_flags { SGX_ENCL_DEBUG = BIT(1), SGX_ENCL_CREATED = BIT(2), SGX_ENCL_INITIALIZED = BIT(3), + SGX_ENCL_OOM = BIT(4), }; struct sgx_encl_mm { @@ -125,5 +126,6 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, unsigned long addr); struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim); void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page); +void sgx_encl_mm_drain(struct sgx_encl *encl); #endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 4f95096c9786..2c159168f346 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -420,6 +420,9 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg) test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) return -EINVAL; + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + return -ENOMEM; + if (copy_from_user(&add_arg, arg, sizeof(add_arg))) return -EFAULT; @@ -605,6 +608,9 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg) test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) return -EINVAL; + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + return -ENOMEM; + if (copy_from_user(&init_arg, arg, sizeof(init_arg))) return -EFAULT; @@ -681,6 +687,9 @@ static int sgx_ioc_sgx2_ready(struct sgx_encl *encl) if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) return -EINVAL; + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + return -ENOMEM; + return 0; } diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 883470062514..9ea487469e4c 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -662,6 +662,146 @@ void sgx_free_epc_page(struct sgx_epc_page *page) atomic_long_inc(&sgx_nr_free_pages); } +static bool sgx_oom_get_ref(struct sgx_epc_page *epc_page) +{ + struct sgx_encl *encl; + + if (epc_page->flags & SGX_EPC_OWNER_ENCL_PAGE) + encl = epc_page->encl_page->encl; + else if (epc_page->flags & SGX_EPC_OWNER_ENCL) + encl = epc_page->encl; + else + return false; + + return kref_get_unless_zero(&encl->refcount); +} + +static struct sgx_epc_page *sgx_oom_get_victim(struct sgx_epc_lru_lists *lru) +{ + struct sgx_epc_page *epc_page, *tmp; + + if (list_empty(&lru->unreclaimable)) + return NULL; + + list_for_each_entry_safe(epc_page, tmp, &lru->unreclaimable, list) { + list_del_init(&epc_page->list); + + if (sgx_oom_get_ref(epc_page)) + return epc_page; + } + return NULL; +} + +static void sgx_epc_oom_zap(void *owner, struct mm_struct *mm, unsigned long start, + unsigned long end, const struct vm_operations_struct *ops) +{ + VMA_ITERATOR(vmi, mm, start); + struct vm_area_struct *vma; + + /** + * Use end because start can be zero and not mapped into + * enclave even if encl->base = 0. + */ + for_each_vma_range(vmi, vma, end) { + if (vma->vm_ops == ops && vma->vm_private_data == owner && + vma->vm_start < end) { + zap_vma_pages(vma); + } + } +} + +static bool sgx_oom_encl(struct sgx_encl *encl) +{ + unsigned long mm_list_version; + struct sgx_encl_mm *encl_mm; + bool ret = false; + int idx; + + if (!test_bit(SGX_ENCL_CREATED, &encl->flags)) + goto out_put; + + /* Done OOM on this enclave previously, do not redo it. 
+ * This may happen when the SECS page is still UNCLRAIMABLE because + * another page is in RECLAIM_IN_PROGRESS. Still return true so OOM + * killer can wait until the reclaimer done with the hold-up page and + * SECS before it move on to find another victim. + */ + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + goto out; + + set_bit(SGX_ENCL_OOM, &encl->flags); + + do { + mm_list_version = encl->mm_list_version; + + /* Pairs with smp_rmb() in sgx_encl_mm_add(). */ + smp_rmb(); + + idx = srcu_read_lock(&encl->srcu); + + list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) { + if (!mmget_not_zero(encl_mm->mm)) + continue; + + mmap_read_lock(encl_mm->mm); + + sgx_epc_oom_zap(encl, encl_mm->mm, encl->base, + encl->base + encl->size, &sgx_vm_ops); + + mmap_read_unlock(encl_mm->mm); + + mmput_async(encl_mm->mm); + } + + srcu_read_unlock(&encl->srcu, idx); + } while (WARN_ON_ONCE(encl->mm_list_version != mm_list_version)); + + sgx_encl_mm_drain(encl); +out: + ret = true; + +out_put: + /* + * This puts the refcount we took when we identified this enclave as + * an OOM victim. + */ + kref_put(&encl->refcount, sgx_encl_release); + return ret; +} + +static inline bool sgx_oom_encl_page(struct sgx_encl_page *encl_page) +{ + return sgx_oom_encl(encl_page->encl); +} + +/** + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU + * @lru: LRU that is low + * + * Return: %true if a victim was found and kicked. + */ +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) +{ + struct sgx_epc_page *victim; + + spin_lock(&lru->lock); + victim = sgx_oom_get_victim(lru); + spin_unlock(&lru->lock); + + if (!victim) + return false; + + if (victim->flags & SGX_EPC_OWNER_ENCL_PAGE) + return sgx_oom_encl_page(victim->encl_page); + + if (victim->flags & SGX_EPC_OWNER_ENCL) + return sgx_oom_encl(victim->encl); + + /*Will never happen unless we add more owner types in future */ + WARN_ON_ONCE(1); + return false; +} + static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, unsigned long index, struct sgx_epc_section *section) diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 25db815f5add..c6b3c90db0fa 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -178,6 +178,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age); void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, size_t nr_to_scan, struct list_head *dst); +bool sgx_epc_oom(struct sgx_epc_lru_lists *lrus); void sgx_ipi_cb(void *info); From patchwork Wed Jul 12 23:01:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119380 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1467468vqm; Wed, 12 Jul 2023 16:05:49 -0700 (PDT) X-Google-Smtp-Source: APBJJlGUoPd30WaazooIkha7isk8BCTqUsmkVrKU3d1RGWrVZgw3DeSbsNtHHS9ISx9sHT7HdrpH X-Received: by 2002:a17:906:9b8f:b0:98d:f6eb:3b03 with SMTP id dd15-20020a1709069b8f00b0098df6eb3b03mr22631541ejc.56.1689203149242; Wed, 12 Jul 2023 16:05:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203149; cv=none; d=google.com; s=arc-20160816; b=q0Dj15BOZfYu0NSsd8fQXfJk0Br36VBwuNUjz9A7ib7cu1HEJ22Huq7oDsgcgc2wXr dlzyW/r1TQmvaRqDFNB2+PAsz7Kc7/aWl9G//kLylLAfClAfDexcnWYdZySBfmZRXqXZ 983yicTWqDNOVa/trpfSW3p/D5Le9qWU1WcU2iuWo0KElJ7WiLgccvnOG/ou+pQT5lQx 4j4aAkWlgf76JeADfyJU4DXvJ54z3gPuZcjNey9Oz/XwsI802oaHXcVtxkHPNYYcd453 
From: Haitao Huang
To:
jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com Subject: [PATCH v3 17/28] x86/sgx: fix a NULL pointer Date: Wed, 12 Jul 2023 16:01:51 -0700 Message-Id: <20230712230202.47929-18-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771257881039776032 X-GMAIL-MSGID: 1771257881039776032 Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup worker) may reclaim SECS EPC page for an enclave and set encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG in #PF handler and is used without checking for NULL and reloading. Fix this by checking if SECS is loaded before EAUG and load it if it was reclaimed. Signed-off-by: Haitao Huang --- arch/x86/kernel/cpu/sgx/encl.c | 30 +++++++++++++++++++++++------- arch/x86/kernel/cpu/sgx/main.c | 4 ++++ 2 files changed, 27 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index c321c848baa9..028d1b9d6572 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -235,6 +235,19 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, return epc_page; } +static struct sgx_epc_page *sgx_encl_load_secs(struct sgx_encl *encl) +{ + struct sgx_epc_page *epc_page = encl->secs.epc_page; + + if (!epc_page) { + epc_page = sgx_encl_eldu(&encl->secs, NULL); + if (!IS_ERR(epc_page)) + sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | + SGX_EPC_PAGE_UNRECLAIMABLE); + } + return epc_page; +} + static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, struct sgx_encl_page *entry) { @@ -248,13 +261,9 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, return entry; } - if (!(encl->secs.epc_page)) { - epc_page = sgx_encl_eldu(&encl->secs, NULL); - if (IS_ERR(epc_page)) - return ERR_CAST(epc_page); - sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE | - SGX_EPC_PAGE_UNRECLAIMABLE); - } + epc_page = sgx_encl_load_secs(encl); + if (IS_ERR(epc_page)) + return ERR_CAST(epc_page); epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); if (IS_ERR(epc_page)) @@ -342,6 +351,13 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, mutex_lock(&encl->lock); + epc_page = sgx_encl_load_secs(encl); + if (IS_ERR(epc_page)) { + if (PTR_ERR(epc_page) == -EBUSY) + vmret = VM_FAULT_NOPAGE; + goto err_out_unlock; + } + epc_page = sgx_alloc_epc_page(encl_page, false); if (IS_ERR(epc_page)) { if (PTR_ERR(epc_page) == -EBUSY) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 9ea487469e4c..68c89d575abc 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -265,6 
+265,10 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, mutex_lock(&encl->lock); + /* Should not be possible */ + if (WARN_ON(!(encl->secs.epc_page))) + goto out; + sgx_encl_ewb(epc_page, backing); encl_page->epc_page = NULL; encl->secs_child_cnt--; From patchwork Wed Jul 12 23:01:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119424 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1482598vqm; Wed, 12 Jul 2023 16:39:03 -0700 (PDT) X-Google-Smtp-Source: APBJJlFklh2hiMN2K7PBeOSsJPC06CfBAPBf26fA0TeNdhlV0KltFJTOuhSkL3unf3FHFa2ENl/8 X-Received: by 2002:aa7:8888:0:b0:682:e24c:f4e0 with SMTP id z8-20020aa78888000000b00682e24cf4e0mr178115pfe.11.1689205143187; Wed, 12 Jul 2023 16:39:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689205143; cv=none; d=google.com; s=arc-20160816; b=QM9hlS6Ie3UpiyWxlaxQwKsjY7mGm4XLovpiCLop6/zpZa8G+zDk0vE4sRT2BOVQKH GCBwomprXfK1LD6UJG1gNNyP+jWBmI22M77aRpZaTzhWUHYb7FObxd/w2dF6ZdZmhbjl 0cpMAmJmWWx2InFeGqfYAszZvZeVx6dxg+3ruQXSZpQuy+1qZo9Th+utY4dXVqR/hjAp pkMLRZwCLM79HCS1PQ9/yS4GmqNTxLk3c0xwhIIqm9NquMDec1JD/vM+iWYW5AOibkEp KfMCfHyPjd1n8k+YfEdTllpgXHWt+Q+nNl6K2kbgs+RPzvN9GrhRjmuP+Dd1I8Ttg9if 9MDg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=v4ggJB3AVNTsmO+J3q+FuFJ2ApDbBY0FRfH4hKv83PA=; fh=OZr4LIl5No2JYFQArlJ8xitR5GErnb3n5yX+P8GvfUY=; b=ZZyQAJvOc/MYGiF0SZ8Qe611nWtCyGourh6qstp5ecTRLM0feFWk4jsnT1gU/1uxVT KaM0aG7k0RVXtpV8f9iBL0yVYmCf33OBbpGECTToHj8cQH57hFiJKb8Y38AAorf3EsZ8 PzfamkQqNUCEp0F/iVqxfFFTyMhPTDJGqqGvaG7NBAuBu6UduYMAgUcCvS17cdWSjFV8 96YJwOjGN9mAQIN1gwn/kW8xRwtkTo27v2AwJUmSsoHgv86HE2Oj8ZT6TDzcDKGwpcpj ZYtDfFO7/oxna0C1oCc9gloUkoCJKNWeH/LRGUpf5ZzCb1W2gABLzPcwGOtwdh6wIA+e cX8w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="LU/fwLB0"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id g9-20020a056a001a0900b0068035bb7a40si4154238pfv.366.2023.07.12.16.38.50; Wed, 12 Jul 2023 16:39:03 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="LU/fwLB0"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232721AbjGLXDV (ORCPT + 99 others); Wed, 12 Jul 2023 19:03:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50566 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232732AbjGLXCS (ORCPT ); Wed, 12 Jul 2023 19:02:18 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2FB2418E; Wed, 12 Jul 2023 16:02:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689202937; x=1720738937; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gtPme0IuXCAV4jbp0xb73aBMQFtTiscdTlcYYwDyGg8=; b=LU/fwLB0h47yZrs3fFKLbojxHgRpYrQ6iIvzDpxHqSnhwm5C1Bb8lLs3 ZZ6eclaF+n5kv3WCCBcwpdZ6Bbk/iqdD9EfLa77GTJO+z9mNYI+HXbDjw Y85VOTZTDQegwUXEfgzDNOdIbhgoO/gOSGk9uKinxNZv1aKMEMqUMzaWh hGKMsgb9vB9RX1YA2Nsepz7A34ZIoWfQXbWMU+jLkSuXjDid1mudOwH+y JVcIWZvhc9ffHBl1yLv1/kt7JjkPJ6DpegqdUBLCCM/KmrxWFJNudHqC9 o+PNaBrus11gnGpxuGjD4b4J7cOq/j/q9dX6kGwTfYHJNt7hAV1lC+qxO A==; X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="428774166" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428774166" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338649" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338649" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:15 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Zefan Li , Johannes Weiner Cc: kai.huang@intel.com, reinette.chatre@intel.com, zhiquan1.li@intel.com, kristen@linux.intel.com Subject: [PATCH v3 18/28] cgroup/misc: Fix an overflow Date: Wed, 12 Jul 2023 16:01:52 -0700 Message-Id: <20230712230202.47929-19-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771259972295516320 X-GMAIL-MSGID: 1771259972295516320 Overflow may happen in misc_cg_try_charge 
if new_usage becomes above INT_MAX, for example, on platforms with large SGX EPC sizes. Change type of new_usage to long from int and check overflow. Signed-off-by: Haitao Huang --- kernel/cgroup/misc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index fe3e8a0eb7ed..ff9f900981a3 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -143,7 +143,7 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, struct misc_cg *i, *j; int ret; struct misc_res *res; - int new_usage; + long new_usage; if (!(valid_type(type) && cg && READ_ONCE(misc_res_capacity[type]))) return -EINVAL; @@ -153,10 +153,10 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, for (i = cg; i; i = parent_misc(i)) { res = &i->res[type]; - new_usage = atomic_long_add_return(amount, &res->usage); if (new_usage > READ_ONCE(res->max) || - new_usage > READ_ONCE(misc_res_capacity[type])) { + new_usage > READ_ONCE(misc_res_capacity[type]) || + new_usage < 0) { ret = -EBUSY; goto err_charge; } From patchwork Wed Jul 12 23:01:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119388 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1470894vqm; Wed, 12 Jul 2023 16:12:38 -0700 (PDT) X-Google-Smtp-Source: APBJJlEqYzg//ISQGmX2yNeZ/U2YwvQr0eG5MJanP5pVgNCppAa44cajd4OOJwHN/+t0eXMuL80R X-Received: by 2002:a17:902:a713:b0:1b8:475d:ebf6 with SMTP id w19-20020a170902a71300b001b8475debf6mr104170plq.0.1689203558222; Wed, 12 Jul 2023 16:12:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203558; cv=none; d=google.com; s=arc-20160816; b=IGRoVRKaImAFfm08NB6dhK6bCsNUWR9WcKubVwqodD3JOzAQbuJbz8WvSZ7WbolPKu LaUuTCWRdaGLposFmbm0DhY/hhtJikoLbWTMq6+gga7V2DqsbQM+D+WcYkTtTs8PmnQY ybHw3FRQCK8IO1Gl6hcHbKabFlWoBZ707qLIyRwV1pXUqWIN1QqZba5ooDiReKLaXIEO CDwzpTAG3RCHnzrVEWJn5f4XX+4rAUf9+qi9JOKmwa+xSYmTrjJszN1cIlYN9ze3Riv/ ia2/ZZTrdRaFkfqMRONAQQV5r81nVZGty0xbNrnTpBPUJONSRsH2KcQsGiPbXX+22Tee o5Iw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=rJ0Ip6KmFP6Q0sxP++sLdTAgYwgTg1ulaqhBH3mYJgM=; fh=yNohEIybLp+a0hzgsRq+H5j6MTgiO5TOkS8u6RPK3SM=; b=Ch0rZ7uPeeKJ7/3Z0ZxXe/ZGn//EdEkHZa4owfBzi+7lm5VwWzpVo57hMT6nWptPsx 80XFoGM0gZfSbIbYP+iqOqsNHRrwcPtmWhLhrYWt4wm8wHB/qC/amD2Apr7e3wKr+aRf WKMIOknz0B2o4sAYQIJDRg4xG9gTQG8da7WSI86eQq+KSiKcz/UMXhldI9MDwdffVmle lzioo6Vk4Qjo/8JxvtG7Xhym+IQR6WfFap5Fx6frCLNqr0eVRDH3boQHZg+pDqHv7lIc 4RI5uiSPXudJnSGxN6wjM1BLBfHzIP7cB6ei8qxwzJ9iAaPRMzRAVYOMvyHuY/Y2Rmy0 d0eA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=I7QJTOhH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id u11-20020a170902e5cb00b001b8c689061asi4217547plf.421.2023.07.12.16.12.25; Wed, 12 Jul 2023 16:12:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=I7QJTOhH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233132AbjGLXD2 (ORCPT + 99 others); Wed, 12 Jul 2023 19:03:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232746AbjGLXCT (ORCPT ); Wed, 12 Jul 2023 19:02:19 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8D57310E2; Wed, 12 Jul 2023 16:02:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689202937; x=1720738937; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TP2QR91RjVHQ9dyPHa8/7eEH5VE2VT3Zn/qVSkO1wJ0=; b=I7QJTOhHjresphU3ZeiAGnP4l/U7xq5Yz3pjhPKfyoSPQvdiC8rpXWLo DS1Tz3RDsfIoutLYVyBxfZ//pjNMhwvoHBoR3rWUjShFCx+C7FzIZFexI 0l2pho/1n67eWs2zgNh8IQzHwLenpS8O/6BAUR4waZiN5QHYtsFZhClzj ydkRj3iuHgJhxDnTFHtO7i7Ih0Yi/ruVTDRtaSx7mvj6s/W1Tt34b/8tn NGWaxcr0DTXbTnxjA341pR4a/4ke4R3TLkNMPDYf1zDAjjoRrsgZ0JQCa Six9nfVzd9+4gFtf3h7IFJSyFmbu5VViURPqnQI7TImDMcYFJWcYlz5/J Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="428774175" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428774175" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338653" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338653" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:15 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Zefan Li , Johannes Weiner Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com Subject: [PATCH v3 19/28] cgroup/misc: Add per resource callbacks for CSS events Date: Wed, 12 Jul 2023 16:01:53 -0700 Message-Id: <20230712230202.47929-20-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258310195989214 X-GMAIL-MSGID: 1771258310195989214 From: Kristen 
Carlson Accardi Consumers of the misc cgroup controller might need to perform separate actions for Cgroups Subsystem State(CSS) events: cgroup alloc and free. In addition, writes to the max value may also need separate action. Add the ability to allow downstream users to setup callbacks for these operations, and call the corresponding per-resource-type callback when appropriate. This code will be utilized by the SGX driver in a future patch. Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Changes from V2: - Removed the released() callback --- include/linux/misc_cgroup.h | 5 +++++ kernel/cgroup/misc.c | 32 +++++++++++++++++++++++++++++--- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h index c238207d1615..9962b870d382 100644 --- a/include/linux/misc_cgroup.h +++ b/include/linux/misc_cgroup.h @@ -37,6 +37,11 @@ struct misc_res { unsigned long max; atomic_long_t usage; atomic_long_t events; + + /* per resource callback ops */ + int (*misc_cg_alloc)(struct misc_cg *cg); + void (*misc_cg_free)(struct misc_cg *cg); + void (*misc_cg_max_write)(struct misc_cg *cg); }; /** diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index ff9f900981a3..4736db3cd418 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -278,10 +278,13 @@ static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf, cg = css_misc(of_css(of)); - if (READ_ONCE(misc_res_capacity[type])) + if (READ_ONCE(misc_res_capacity[type])) { WRITE_ONCE(cg->res[type].max, max); - else + if (cg->res[type].misc_cg_max_write) + cg->res[type].misc_cg_max_write(cg); + } else { ret = -EINVAL; + } return ret ? ret : nbytes; } @@ -385,23 +388,39 @@ static struct cftype misc_cg_files[] = { static struct cgroup_subsys_state * misc_cg_alloc(struct cgroup_subsys_state *parent_css) { + struct misc_cg *parent_cg; enum misc_res_type i; struct misc_cg *cg; + int ret; if (!parent_css) { cg = &root_cg; + parent_cg = &root_cg; } else { cg = kzalloc(sizeof(*cg), GFP_KERNEL); if (!cg) return ERR_PTR(-ENOMEM); + parent_cg = css_misc(parent_css); } for (i = 0; i < MISC_CG_RES_TYPES; i++) { WRITE_ONCE(cg->res[i].max, MAX_NUM); atomic_long_set(&cg->res[i].usage, 0); + if (parent_cg->res[i].misc_cg_alloc) { + ret = parent_cg->res[i].misc_cg_alloc(cg); + if (ret) + goto alloc_err; + } } return &cg->css; + +alloc_err: + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (parent_cg->res[i].misc_cg_free) + cg->res[i].misc_cg_free(cg); + kfree(cg); + return ERR_PTR(ret); } /** @@ -412,7 +431,14 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css) */ static void misc_cg_free(struct cgroup_subsys_state *css) { - kfree(css_misc(css)); + struct misc_cg *cg = css_misc(css); + enum misc_res_type i; + + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (cg->res[i].misc_cg_free) + cg->res[i].misc_cg_free(cg); + + kfree(cg); } /* Cgroup controller callbacks */ From patchwork Wed Jul 12 23:01:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119381 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1467971vqm; Wed, 12 Jul 2023 16:06:46 -0700 (PDT) X-Google-Smtp-Source: APBJJlH6bTnES5WQvcsID8JRc7svqVaek70tolwO706Xv3HokU8lE6y90ploiXi/uUy9+4u9tU+c X-Received: by 2002:a2e:a383:0:b0:2b6:e3d5:76a7 with SMTP id r3-20020a2ea383000000b002b6e3d576a7mr16544974lje.24.1689203205938; Wed, 12 Jul 2023 16:06:45 -0700 (PDT) 
ARC-Seal: i=1; a=rsa-sha256; t=1689203205; cv=none; d=google.com; s=arc-20160816; b=qE8d0AcBJDUXweTD4uRbGVzSSvcs6UfuWBrm75yG74i6FKKaflEzII2SkBLH8IlvVj LZ14Pfn3SO5Tk+RC9WXrQdybXApbQYsRV4qMmH8zMpRNPRAu+TlSEWBSv+XXMhvPfwEU wBcbVFRlGCjoLZSKNnQMODxkqg8/wF9QMp/VYkaT3lW7u0nzFmERIiTLMTc4qmHvRgae cwYljKEmWv4MrtYqwchPlQcImUc/9cEUB70kkf7+pHw3AFnqsE8oh0Idzb662BRLYOID abhWI4033OoQEeaVxDtETtsBAYXZmtl71hBrRRrmrO1NeaFNOSh5qVe2FqRk19fGxiwW IWiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=s/jZl9Fpp/gcJla65JZ8zg7s8InpYYHfQvxBywI/p68=; fh=yNohEIybLp+a0hzgsRq+H5j6MTgiO5TOkS8u6RPK3SM=; b=hVxbLtkETU+K1XhQsZ/E3xXwxC7L59O3HoZSS0IA115+gYP1LTi4ll6mkc8yfdA+/S O4vtBJS8tgQ8aoPwfreYbUUbTp55YSMrtK0GbKXbRWOBxZoiq0DQcUDfS4d9ICNoqNbN zuQqDJ6QpKuLRvXtcfstsjweXFnPZriY+PP5Rfjz535pBfk/g00MawbeOx69BF7pQHau QIFvzBis2KxD0S2Zu7Nk5VJyl8uX7Nm/j2P/Au1OIvOQUmlPQspx9GQvb4ZA6RP8Vf43 kYQ1UfiY1qOU8IJ8G3QsQaACUY5D7YMjACKID7jk95py5/jjxykIF1M4JKCKko9Xn8M5 6qOA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mj4Oh7sR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id br12-20020a170906d14c00b00991ece4c966si6103421ejb.101.2023.07.12.16.06.21; Wed, 12 Jul 2023 16:06:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mj4Oh7sR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233042AbjGLXDZ (ORCPT + 99 others); Wed, 12 Jul 2023 19:03:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230381AbjGLXCT (ORCPT ); Wed, 12 Jul 2023 19:02:19 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D956719B; Wed, 12 Jul 2023 16:02:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689202937; x=1720738937; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zH+XX+VHFK8n349nOoRMB7hDW6982oRC8Tt5cNcQSvM=; b=mj4Oh7sRh93BpuCLSxSnKB11aJS3Nm9hJPt0N3mr9ETzlWpcYNw/LsrR ld0MPuvyCmlABUrEFMjnTLGsERthz0XlDXDkqrQMKAf2iVqYh5Vz+Mri+ yXkN0STUMIbecOeb5T0XDRaWb+YI7sf/uZBHYjXrYYrfR558w5Hqewtu7 2j66Q3wp5XUX1NaiPnp1px4LKNYhEWdBtZqOzbLogadmtgTwIF3hFQGYR VeKeMRy7laqson8As6VhRANiBVgWZ8GVvAilhoAk12BmCI8N2fmdiBDY/ u2xJ/7Hu1eDzCicI0359pJUt5Y9tV3twRRl6TETuP3bUShUUeK3RKGuJp g==; X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="428774182" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428774182" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338659" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338659" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:16 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Zefan Li , Johannes Weiner Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com Subject: [PATCH v3 20/28] cgroup/misc: Add SGX EPC resource type and export APIs for SGX driver Date: Wed, 12 Jul 2023 16:01:54 -0700 Message-Id: <20230712230202.47929-21-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771257940886575990 X-GMAIL-MSGID: 1771257940886575990 From: Kristen Carlson Accardi The SGX driver will need to get access to the root misc_cg object to do iterative walks and also determine if a charge will be towards the root cgroup or not. To manage the SGX EPC memory via the misc controller, the SGX driver will also need to be able to iterate over the misc cgroup hierarchy. Move parent_misc() into misc_cgroup.h and make inline to make this function available to SGX, rename it to misc_cg_parent(), and update misc.c to use the new name. Add per resource type private data so that SGX can store additional per cgroup data with the misc_cg struct. Allow SGX EPC memory to be a valid resource type for the misc controller. Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang --- include/linux/misc_cgroup.h | 29 +++++++++++++++++++++++++++++ kernel/cgroup/misc.c | 25 ++++++++++++------------- 2 files changed, 41 insertions(+), 13 deletions(-) diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h index 9962b870d382..8bef9d92e36a 100644 --- a/include/linux/misc_cgroup.h +++ b/include/linux/misc_cgroup.h @@ -17,6 +17,10 @@ enum misc_res_type { MISC_CG_RES_SEV, /* AMD SEV-ES ASIDs resource */ MISC_CG_RES_SEV_ES, +#endif +#ifdef CONFIG_CGROUP_SGX_EPC + /* SGX EPC memory resource */ + MISC_CG_RES_SGX_EPC, #endif MISC_CG_RES_TYPES }; @@ -37,6 +41,7 @@ struct misc_res { unsigned long max; atomic_long_t usage; atomic_long_t events; + void *priv; /* per resource callback ops */ int (*misc_cg_alloc)(struct misc_cg *cg); @@ -58,6 +63,7 @@ struct misc_cg { struct misc_res res[MISC_CG_RES_TYPES]; }; +struct misc_cg *misc_cg_root(void); unsigned long misc_cg_res_total_usage(enum misc_res_type type); int misc_cg_set_capacity(enum misc_res_type type, unsigned long capacity); int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, @@ -79,6 +85,20 @@ static inline struct misc_cg *css_misc(struct cgroup_subsys_state *css) return css ? 
container_of(css, struct misc_cg, css) : NULL; } +/** + * misc_cg_parent() - Get the parent of the passed misc cgroup. + * @cgroup: cgroup whose parent needs to be fetched. + * + * Context: Any context. + * Return: + * * struct misc_cg* - Parent of the @cgroup. + * * %NULL - If @cgroup is null or the passed cgroup does not have a parent. + */ +static inline struct misc_cg *misc_cg_parent(struct misc_cg *cgroup) +{ + return cgroup ? css_misc(cgroup->css.parent) : NULL; +} + /* * get_current_misc_cg() - Find and get the misc cgroup of the current task. * @@ -103,6 +123,15 @@ static inline void put_misc_cg(struct misc_cg *cg) } #else /* !CONFIG_CGROUP_MISC */ +static inline struct misc_cg *misc_cg_root(void) +{ + return NULL; +} + +static inline struct misc_cg *misc_cg_parent(struct misc_cg *cg) +{ + return NULL; +} static inline unsigned long misc_cg_res_total_usage(enum misc_res_type type) { diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index 4736db3cd418..ea18eae862a4 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -24,6 +24,10 @@ static const char *const misc_res_name[] = { /* AMD SEV-ES ASIDs resource */ "sev_es", #endif +#ifdef CONFIG_CGROUP_SGX_EPC + /* Intel SGX EPC memory bytes */ + "sgx_epc", +#endif }; /* Root misc cgroup */ @@ -40,18 +44,13 @@ static struct misc_cg root_cg; static unsigned long misc_res_capacity[MISC_CG_RES_TYPES]; /** - * parent_misc() - Get the parent of the passed misc cgroup. - * @cgroup: cgroup whose parent needs to be fetched. - * - * Context: Any context. - * Return: - * * struct misc_cg* - Parent of the @cgroup. - * * %NULL - If @cgroup is null or the passed cgroup does not have a parent. + * misc_cg_root() - Return the root misc cgroup. */ -static struct misc_cg *parent_misc(struct misc_cg *cgroup) +struct misc_cg *misc_cg_root(void) { - return cgroup ? css_misc(cgroup->css.parent) : NULL; + return &root_cg; } +EXPORT_SYMBOL_GPL(misc_cg_root); /** * valid_type() - Check if @type is valid or not. 
@@ -151,7 +150,7 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, if (!amount) return 0; - for (i = cg; i; i = parent_misc(i)) { + for (i = cg; i; i = misc_cg_parent(i)) { res = &i->res[type]; new_usage = atomic_long_add_return(amount, &res->usage); if (new_usage > READ_ONCE(res->max) || @@ -164,12 +163,12 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, return 0; err_charge: - for (j = i; j; j = parent_misc(j)) { + for (j = i; j; j = misc_cg_parent(j)) { atomic_long_inc(&j->res[type].events); cgroup_file_notify(&j->events_file); } - for (j = cg; j != i; j = parent_misc(j)) + for (j = cg; j != i; j = misc_cg_parent(j)) misc_cg_cancel_charge(type, j, amount); misc_cg_cancel_charge(type, i, amount); return ret; @@ -192,7 +191,7 @@ void misc_cg_uncharge(enum misc_res_type type, struct misc_cg *cg, if (!(amount && valid_type(type) && cg)) return; - for (i = cg; i; i = parent_misc(i)) + for (i = cg; i; i = misc_cg_parent(i)) misc_cg_cancel_charge(type, i, amount); } EXPORT_SYMBOL_GPL(misc_cg_uncharge); From patchwork Wed Jul 12 23:01:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119400 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1472427vqm; Wed, 12 Jul 2023 16:15:43 -0700 (PDT) X-Google-Smtp-Source: APBJJlHZOl6t2+gwBpDW0jUaw1iaq4V5o85DILMrZs1lOTXXK9DjBvXPKROruzd5auhEBQPkfurL X-Received: by 2002:a05:6a00:24ca:b0:66a:386c:e6a6 with SMTP id d10-20020a056a0024ca00b0066a386ce6a6mr113763pfv.6.1689203743164; Wed, 12 Jul 2023 16:15:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203743; cv=none; d=google.com; s=arc-20160816; b=JLqYsjX3DDrvPEzeocuzwC9SiNgFDxaegmFbF8aE3vdcFF7os3l8YpODsfMMP5oJBe xcOlqQqqbhA5Mo9zLQHgtG+bzY1AFzaueCkc2JIeiJJKFKsQJeBlsEdqu5iPvPVMdHM3 I7m7dR0BmRzeI9fstcPXtRTXRy9EatuRIL/F8CkGj/MMiiHR0ayAsJpTdT40bfksgVyd srkmpY/DBZLurVA0iKVOwad7ONKB/J+aW2EZUCv/tONRiLiZGSWSHHpQtnvWR9cXNBoF UWX5EwD0tCeqerqd9AucWu+Ft/7Og3pEvyhrhMaIMWuIyyB/aUMf86f5dbOXwAX7WxX8 jmQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=txBe5TKE5QfuaDbZyQEGK63PVD56VHe0CsgGPj6mazs=; fh=sp9Fy9uDTUYbTQiOiWOfY01qY8ktdO/xQ5V81goPrjk=; b=tBOxw9bP6RleuFOO1XMgdGwsMYbPczLZKD7AuKgFIRjeRgAuI84CfZ0Qm9ZoFgLRRR btO0NEzhAQ9PmG/VSt043EPwjI1VqI3OBHCxW+8fS766/94DrYt1J4JgZi5/CzBrpGxd IipZ16csOPdthAkI5j6w1FO0HY/layLnu9WREiAQRLHXxG2UTv0C8WUbr0AHn53PZ//o n7L2OlkXwDBsYAD8SChe/sQKYvdFq8J9Tx23JVrTBPxqTEvHu6RHYeIsGjKx1V7bkY7S l1qe/bnIQQnXHm6QZHEv7GxIX72ED0uKrnUceXd2yAI00cASS4IyooNBH/MAQC2aGgqt ynkQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="fq/xb+z+"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id cw26-20020a056a00451a00b006829969e393si3958347pfb.189.2023.07.12.16.15.30; Wed, 12 Jul 2023 16:15:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="fq/xb+z+"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233187AbjGLXDe (ORCPT + 99 others); Wed, 12 Jul 2023 19:03:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50644 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232813AbjGLXCW (ORCPT ); Wed, 12 Jul 2023 19:02:22 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2C4D61992; Wed, 12 Jul 2023 16:02:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689202938; x=1720738938; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9pkdFeFKYJoU0jF7XMY7Rkw+53JGPnL8Udni3aFbNe4=; b=fq/xb+z+8aRVa5LJVisaTSOPTz8SIC4gZnN5XtvvVDW5RMWjdEB/046R ekJyisXdLs0FaVFS0t+i3eU6c7EpkxHi0j/djujb/iWaiVT02b3WHPThk y3mzaC7+iGndU2njL47FGO2SBnOPoUYc04SgfYqro8XBmc/QmicpjHeKs Ym11FGxtTWvKOKUVYNPlHvU+I/fwhgUhStzGKLPyej0I/2iZq7XkwWYhb FlcbvDPj70mn3rrteVvS/iCrPTT8emrAB5YJNDdJfOse25ZpCHlnCEMgs O14EGOMTvU075jCQKU7d5ODlPIHVd19N0s9l35uIDYjQtgeTOg31pJ95y A==; X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="428774187" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428774187" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338667" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338667" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:16 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. 
Peter Anvin" Cc: kai.huang@intel.com, reinette.chatre@intel.com, Kristen Carlson Accardi , zhiquan1.li@intel.com, seanjc@google.com Subject: [PATCH v3 21/28] x86/sgx: Limit process EPC usage with misc cgroup controller Date: Wed, 12 Jul 2023 16:01:55 -0700 Message-Id: <20230712230202.47929-22-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com> References: <20230712230202.47929-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771258504194223661 X-GMAIL-MSGID: 1771258504194223661 From: Kristen Carlson Accardi Implement support for cgroup control of SGX Enclave Page Cache (EPC) memory using the misc cgroup controller. EPC memory is independent from normal system memory, e.g. must be reserved at boot from RAM and cannot be converted between EPC and normal memory while the system is running. EPC is managed by the SGX subsystem and is not accounted by the memory controller. Much like normal system memory, EPC memory can be overcommitted via virtual memory techniques and pages can be swapped out of the EPC to their backing store (normal system memory, e.g. shmem). The SGX EPC subsystem is analogous to the memory subsytem and the SGX EPC controller is in turn analogous to the memory controller; it implements limit and protection models for EPC memory. The misc controller provides a mechanism to set a hard limit of EPC usage via the "sgx_epc" resource in "misc.max". The total EPC memory available on the system is reported via the "sgx_epc" resource in "misc.capacity". This patch was modified from its original version to use the misc cgroup controller instead of a custom controller. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Signed-off-by: Haitao Huang Cc: Sean Christopherson V3: 1) Use the same maximum number of reclaiming candidate pages to be processed, SGX_NR_TO_SCAN_MAX, for each reclaiming iteration in both cgroup worker function and ksgxd. This fixes an overflow in the backing store buffer with the same fixed size allocated on stack in sgx_reclaim_epc_pages(). 2) Initialize max for root EPC cgroup. Otherwise, all misc_cg_try_charge() calls would fail as it checks for all limits of ancestors all the way to the root node. 3) Start reclaiming whenever misc_cg_try_charge fails. Removed all re-checks for limits and current usage. For all purposes and intent, when misc_try_charge() fails, reclaiming is needed. This also corrects an error of not reclaiming when the child limit is larger than one of its ancestors. 4) Handle failure on charging to the root EPC cgroup. Failure on charging to root means we are at or above capacity, so start reclaiming or return OOM error. 5) Removed the custom cgroup tree walking iterator with epoch tracking logic. Replaced it with just the plain css_for_each_descendant_pre iterator. 
The custom iterator implemented a rather complex epoch scheme I believe was intended to prevent extra reclaiming from multiple worker threads doing the same walk but it turned out not matter much as each thread would only reclaim when usage is above limit. Using the plain css_for_each_descendant_pre iterator simplified code a bit. 6) Do not reclaim synchrously in misc_max_write callback which would block the user. Instead queue an async work item to run the reclaiming loop. 7) Other minor refactorings: - Remove unused params in epc_cgroup APIs - centralize uncharge into sgx_free_epc_page() --- arch/x86/Kconfig | 13 + arch/x86/kernel/cpu/sgx/Makefile | 1 + arch/x86/kernel/cpu/sgx/epc_cgroup.c | 406 +++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/epc_cgroup.h | 60 ++++ arch/x86/kernel/cpu/sgx/main.c | 79 ++++-- arch/x86/kernel/cpu/sgx/sgx.h | 14 +- 6 files changed, 552 insertions(+), 21 deletions(-) create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 53bab123a8ee..8a7378159e9e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1952,6 +1952,19 @@ config X86_SGX If unsure, say N. +config CGROUP_SGX_EPC + bool "Miscellaneous Cgroup Controller for Enclave Page Cache (EPC) for Intel SGX" + depends on X86_SGX && CGROUP_MISC + help + Provides control over the EPC footprint of tasks in a cgroup via + the Miscellaneous cgroup controller. + + EPC is a subset of regular memory that is usable only by SGX + enclaves and is very limited in quantity, e.g. less than 1% + of total DRAM. + + Say N if unsure. + config EFI bool "EFI runtime service support" depends on ACPI diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile index 9c1656779b2a..12901a488da7 100644 --- a/arch/x86/kernel/cpu/sgx/Makefile +++ b/arch/x86/kernel/cpu/sgx/Makefile @@ -4,3 +4,4 @@ obj-y += \ ioctl.o \ main.o obj-$(CONFIG_X86_SGX_KVM) += virt.o +obj-$(CONFIG_CGROUP_SGX_EPC) += epc_cgroup.o diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c new file mode 100644 index 000000000000..de0833e5606b --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c @@ -0,0 +1,406 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright(c) 2022 Intel Corporation. 
+ +#include +#include +#include +#include +#include +#include + +#include "epc_cgroup.h" + +#define SGX_EPC_RECLAIM_MIN_PAGES 16UL +#define SGX_EPC_RECLAIM_IGNORE_AGE_THRESHOLD 5 +#define SGX_EPC_RECLAIM_OOM_THRESHOLD 5 + +static struct workqueue_struct *sgx_epc_cg_wq; +static bool sgx_epc_cgroup_oom(struct sgx_epc_cgroup *root); + +struct sgx_epc_reclaim_control { + struct sgx_epc_cgroup *epc_cg; + int nr_fails; + bool ignore_age; +}; + +static inline unsigned long sgx_epc_cgroup_page_counter_read(struct sgx_epc_cgroup *epc_cg) +{ + return atomic_long_read(&epc_cg->cg->res[MISC_CG_RES_SGX_EPC].usage) / PAGE_SIZE; +} + +static inline unsigned long sgx_epc_cgroup_max_pages(struct sgx_epc_cgroup *epc_cg) +{ + return READ_ONCE(epc_cg->cg->res[MISC_CG_RES_SGX_EPC].max) / PAGE_SIZE; +} + +static inline unsigned long sgx_epc_cgroup_max_pages_to_root(struct sgx_epc_cgroup *epc_cg) +{ + struct misc_cg *i = epc_cg->cg; + unsigned long m = ULONG_MAX; + + while (i) { + m = min(m, READ_ONCE(i->res[MISC_CG_RES_SGX_EPC].max)); + i = misc_cg_parent(i); + } + return m / PAGE_SIZE; +} + +static inline struct sgx_epc_cgroup *sgx_epc_cgroup_from_misc_cg(struct misc_cg *cg) +{ + if (cg) + return (struct sgx_epc_cgroup *)(cg->res[MISC_CG_RES_SGX_EPC].priv); + + return NULL; +} + +static inline bool sgx_epc_cgroup_disabled(void) +{ + return !cgroup_subsys_enabled(misc_cgrp_subsys); +} + +/** + * sgx_epc_cgroup_lru_empty - check if a cgroup tree has no pages on its lrus + * @root: root of the tree to check + * + * Return: %true if all cgroups under the specified root have empty LRU lists. + * Used to avoid livelocks due to a cgroup having a non-zero charge count but + * no pages on its LRUs, e.g. due to a dead enclave waiting to be released or + * because all pages in the cgroup are unreclaimable. + */ +bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root) +{ + struct cgroup_subsys_state *css_root = NULL; + struct cgroup_subsys_state *pos = NULL; + struct sgx_epc_cgroup *epc_cg = NULL; + bool ret = true; + + /* + * Caller ensure css_root ref acquired + */ + css_root = root ? &root->cg->css : &(misc_cg_root()->css); + + rcu_read_lock(); + css_for_each_descendant_pre(pos, css_root) { + if (!css_tryget(pos)) + break; + + rcu_read_unlock(); + + epc_cg = sgx_epc_cgroup_from_misc_cg(css_misc(pos)); + + spin_lock(&epc_cg->lru.lock); + ret = list_empty(&epc_cg->lru.reclaimable); + spin_unlock(&epc_cg->lru.lock); + + rcu_read_lock(); + css_put(pos); + if (!ret) + break; + } + rcu_read_unlock(); + return ret; +} + +/** + * sgx_epc_cgroup_isolate_pages - walk a cgroup tree and separate pages + * @root: root of the tree to start walking + * @nr_to_scan: The number of pages that need to be isolated + * @dst: Destination list to hold the isolated pages + * + * Walk the cgroup tree and isolate the pages in the hierarchy + * for reclaiming. + */ +void sgx_epc_cgroup_isolate_pages(struct sgx_epc_cgroup *root, + size_t *nr_to_scan, struct list_head *dst) +{ + struct cgroup_subsys_state *css_root = NULL; + struct cgroup_subsys_state *pos = NULL; + struct sgx_epc_cgroup *epc_cg = NULL; + + if (!*nr_to_scan) + return; + + /* Caller ensure css_root ref acquired */ + css_root = root ? 
&root->cg->css : &(misc_cg_root()->css); + + rcu_read_lock(); + css_for_each_descendant_pre(pos, css_root) { + if (!css_tryget(pos)) + break; + rcu_read_unlock(); + + epc_cg = sgx_epc_cgroup_from_misc_cg(css_misc(pos)); + sgx_isolate_epc_pages(&epc_cg->lru, nr_to_scan, dst); + + rcu_read_lock(); + css_put(pos); + if (!*nr_to_scan) + break; + } + rcu_read_unlock(); +} + +static int sgx_epc_cgroup_reclaim_pages(unsigned long nr_pages, + struct sgx_epc_reclaim_control *rc) +{ + /* + * Ensure sgx_reclaim_pages is called with a minimum and maximum + * number of pages. Attempting to reclaim only a few pages will + * often fail and is inefficient, while reclaiming a huge number + * of pages can result in soft lockups due to holding various + * locks for an extended duration. This also bounds nr_pages so + * the fixed-size backing store array allocated on the stack in + * sgx_reclaim_epc_pages() cannot be overrun. + */ + nr_pages = max(nr_pages, SGX_EPC_RECLAIM_MIN_PAGES); + nr_pages = min(nr_pages, SGX_NR_TO_SCAN_MAX); + + return sgx_reclaim_epc_pages(nr_pages, rc->ignore_age, rc->epc_cg); +} + +static int sgx_epc_cgroup_reclaim_failed(struct sgx_epc_reclaim_control *rc) +{ + if (sgx_epc_cgroup_lru_empty(rc->epc_cg)) + return -ENOMEM; + + ++rc->nr_fails; + if (rc->nr_fails > SGX_EPC_RECLAIM_IGNORE_AGE_THRESHOLD) + rc->ignore_age = true; + + return 0; +} + +static inline +void sgx_epc_reclaim_control_init(struct sgx_epc_reclaim_control *rc, + struct sgx_epc_cgroup *epc_cg) +{ + rc->epc_cg = epc_cg; + rc->nr_fails = 0; + rc->ignore_age = false; +} + +/* + * Scheduled by sgx_epc_cgroup_try_charge() to reclaim pages from the + * cgroup when the cgroup is at/near its maximum capacity. + */ +static void sgx_epc_cgroup_reclaim_work_func(struct work_struct *work) +{ + struct sgx_epc_reclaim_control rc; + struct sgx_epc_cgroup *epc_cg; + unsigned long cur, max; + + epc_cg = container_of(work, struct sgx_epc_cgroup, reclaim_work); + + sgx_epc_reclaim_control_init(&rc, epc_cg); + + for (;;) { + max = sgx_epc_cgroup_max_pages_to_root(epc_cg); + + /* + * Adjust the limit down by one page; the goal is to free up + * pages for fault allocations, not to simply obey the limit. + * Conditionally decrementing max also means the cur vs. max + * check will correctly handle the case where both are zero. + */ + if (max) + max--; + + /* + * Unless the limit is extremely low, in which case forcing + * reclaim will likely cause thrashing, force the cgroup to + * reclaim at least once if it's operating *near* its maximum + * limit by adjusting @max down by half the min reclaim size. + * This work func is scheduled by sgx_epc_cgroup_try_charge + * when it cannot directly reclaim due to being in an atomic + * context, e.g. EPC allocation in a fault handler. Waiting + * to reclaim until the cgroup is actually at its limit is less + * performant as it means the faulting task is effectively + * blocked until a worker makes its way through the global work + * queue.
+ */ + if (max > SGX_NR_TO_SCAN_MAX) + max -= (SGX_EPC_RECLAIM_MIN_PAGES / 2); + + max = min(max, sgx_epc_total_pages); + cur = sgx_epc_cgroup_page_counter_read(epc_cg); + if (cur <= max) + break; + /* Nothing reclaimable */ + if (sgx_epc_cgroup_lru_empty(epc_cg)) { + if (!sgx_epc_cgroup_oom(epc_cg)) + break; + + continue; + } + + if (!sgx_epc_cgroup_reclaim_pages(cur - max, &rc)) { + if (sgx_epc_cgroup_reclaim_failed(&rc)) + break; + } + } +} + +static int __sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg, + bool reclaim) +{ + struct sgx_epc_reclaim_control rc; + unsigned int nr_empty = 0; + + sgx_epc_reclaim_control_init(&rc, epc_cg); + + for (;;) { + if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, + PAGE_SIZE)) + break; + + if (sgx_epc_cgroup_lru_empty(epc_cg)) + return -ENOMEM; + + if (signal_pending(current)) + return -ERESTARTSYS; + + if (!reclaim) { + queue_work(sgx_epc_cg_wq, &rc.epc_cg->reclaim_work); + return -EBUSY; + } + + if (!sgx_epc_cgroup_reclaim_pages(1, &rc)) { + if (sgx_epc_cgroup_reclaim_failed(&rc)) { + if (++nr_empty > SGX_EPC_RECLAIM_OOM_THRESHOLD) + return -ENOMEM; + schedule(); + } + } + } + if (epc_cg->cg != misc_cg_root()) + css_get(&epc_cg->cg->css); + + return 0; +} + +/** + * sgx_epc_cgroup_try_charge - hierarchically try to charge a single EPC page + * @mm: the mm_struct of the process to charge + * @reclaim: whether or not synchronous reclaim is allowed + * + * Returns EPC cgroup or NULL on success, -errno on failure. + */ +struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(bool reclaim) +{ + struct sgx_epc_cgroup *epc_cg; + int ret; + + if (sgx_epc_cgroup_disabled()) + return NULL; + + epc_cg = sgx_epc_cgroup_from_misc_cg(get_current_misc_cg()); + ret = __sgx_epc_cgroup_try_charge(epc_cg, reclaim); + put_misc_cg(epc_cg->cg); + + if (ret) + return ERR_PTR(ret); + + return epc_cg; +} + +/** + * sgx_epc_cgroup_uncharge - hierarchically uncharge EPC pages + * @epc_cg: the charged epc cgroup + */ +void sgx_epc_cgroup_uncharge(struct sgx_epc_cgroup *epc_cg) +{ + if (sgx_epc_cgroup_disabled()) + return; + + misc_cg_uncharge(MISC_CG_RES_SGX_EPC, epc_cg->cg, PAGE_SIZE); + + if (epc_cg->cg != misc_cg_root()) + put_misc_cg(epc_cg->cg); +} + +static bool sgx_epc_cgroup_oom(struct sgx_epc_cgroup *root) +{ + struct cgroup_subsys_state *css_root = NULL; + struct cgroup_subsys_state *pos = NULL; + struct sgx_epc_cgroup *epc_cg = NULL; + bool oom = false; + + /* Caller ensure css_root ref acquired */ + css_root = root ? 
&root->cg->css : &(misc_cg_root()->css); + + rcu_read_lock(); + css_for_each_descendant_pre(pos, css_root) { + /* skip dead ones */ + if (!css_tryget(pos)) + continue; + + rcu_read_unlock(); + + epc_cg = sgx_epc_cgroup_from_misc_cg(css_misc(pos)); + oom = sgx_epc_oom(&epc_cg->lru); + + rcu_read_lock(); + css_put(pos); + if (oom) + break; + } + rcu_read_unlock(); + return oom; +} + +static void sgx_epc_cgroup_free(struct misc_cg *cg) +{ + struct sgx_epc_cgroup *epc_cg; + + epc_cg = sgx_epc_cgroup_from_misc_cg(cg); + cancel_work_sync(&epc_cg->reclaim_work); + kfree(epc_cg); +} + +static void sgx_epc_cgroup_max_write(struct misc_cg *cg) +{ + struct sgx_epc_reclaim_control rc; + struct sgx_epc_cgroup *epc_cg; + + epc_cg = sgx_epc_cgroup_from_misc_cg(cg); + + sgx_epc_reclaim_control_init(&rc, epc_cg); + /* Let the reclaimer to do the work so user is not blocked */ + queue_work(sgx_epc_cg_wq, &rc.epc_cg->reclaim_work); +} + +static int sgx_epc_cgroup_alloc(struct misc_cg *cg) +{ + struct sgx_epc_cgroup *epc_cg; + + epc_cg = kzalloc(sizeof(*epc_cg), GFP_KERNEL); + if (!epc_cg) + return -ENOMEM; + + sgx_lru_init(&epc_cg->lru); + INIT_WORK(&epc_cg->reclaim_work, sgx_epc_cgroup_reclaim_work_func); + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_alloc = sgx_epc_cgroup_alloc; + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_free = sgx_epc_cgroup_free; + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_max_write = sgx_epc_cgroup_max_write; + cg->res[MISC_CG_RES_SGX_EPC].priv = epc_cg; + epc_cg->cg = cg; + return 0; +} + +static int __init sgx_epc_cgroup_init(void) +{ + struct misc_cg *cg; + + if (!boot_cpu_has(X86_FEATURE_SGX)) + return 0; + + sgx_epc_cg_wq = alloc_workqueue("sgx_epc_cg_wq", + WQ_UNBOUND | WQ_FREEZABLE, + WQ_UNBOUND_MAX_ACTIVE); + BUG_ON(!sgx_epc_cg_wq); + + cg = misc_cg_root(); + BUG_ON(!cg); + WRITE_ONCE(cg->res[MISC_CG_RES_SGX_EPC].max, ULONG_MAX); + atomic_long_set(&cg->res[MISC_CG_RES_SGX_EPC].usage, 0UL); + return sgx_epc_cgroup_alloc(cg); +} +subsys_initcall(sgx_epc_cgroup_init); diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.h b/arch/x86/kernel/cpu/sgx/epc_cgroup.h new file mode 100644 index 000000000000..03ac4dcea82b --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h @@ -0,0 +1,60 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright(c) 2022 Intel Corporation. 
*/ +#ifndef _INTEL_SGX_EPC_CGROUP_H_ +#define _INTEL_SGX_EPC_CGROUP_H_ + +#include +#include +#include +#include +#include +#include + +#include "sgx.h" + +#ifndef CONFIG_CGROUP_SGX_EPC +#define MISC_CG_RES_SGX_EPC MISC_CG_RES_TYPES +struct sgx_epc_cgroup; + +static inline struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(bool reclaim) +{ + return NULL; +} + +static inline void sgx_epc_cgroup_uncharge(struct sgx_epc_cgroup *epc_cg) { } + +static inline void sgx_epc_cgroup_isolate_pages(struct sgx_epc_cgroup *root, + size_t *nr_to_scan, + struct list_head *dst) { } + +static inline struct sgx_epc_lru_lists *epc_cg_lru(struct sgx_epc_cgroup *epc_cg) +{ + return NULL; +} + +static bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root) +{ + return true; +} +#else +struct sgx_epc_cgroup { + struct misc_cg *cg; + struct sgx_epc_lru_lists lru; + struct work_struct reclaim_work; + atomic_long_t epoch; +}; + +struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(bool reclaim); +void sgx_epc_cgroup_uncharge(struct sgx_epc_cgroup *epc_cg); +bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root); +void sgx_epc_cgroup_isolate_pages(struct sgx_epc_cgroup *root, + size_t *nr_to_scan, struct list_head *dst); +static inline struct sgx_epc_lru_lists *epc_cg_lru(struct sgx_epc_cgroup *epc_cg) +{ + if (epc_cg) + return &epc_cg->lru; + return NULL; +} +#endif + +#endif /* _INTEL_SGX_EPC_CGROUP_H_ */ diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 68c89d575abc..1e5984b881a2 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -17,11 +18,9 @@ #include "driver.h" #include "encl.h" #include "encls.h" -/** - * Maximum number of pages to scan for reclaiming. - */ -#define SGX_NR_TO_SCAN_MAX 32 +#include "epc_cgroup.h" +unsigned long sgx_epc_total_pages; struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; @@ -36,9 +35,20 @@ static struct sgx_epc_lru_lists sgx_global_lru; static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page) { + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) + return epc_cg_lru(epc_page->epc_cg); + return &sgx_global_lru; } +static inline bool sgx_can_reclaim(void) +{ + if (!IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) + return !list_empty(&sgx_global_lru.reclaimable); + + return !sgx_epc_cgroup_lru_empty(NULL); +} + static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); /* Nodes with one or more EPC sections. 
*/ @@ -298,14 +308,14 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * @nr_to_scan: Number of pages to scan for reclaim * @dst: Destination list to hold the isolated pages */ -void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, size_t nr_to_scan, +void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, size_t *nr_to_scan, struct list_head *dst) { struct sgx_encl_page *encl_page; struct sgx_epc_page *epc_page; spin_lock(&lru->lock); - for (; nr_to_scan > 0; --nr_to_scan) { + for (; *nr_to_scan > 0; --(*nr_to_scan)) { epc_page = list_first_entry_or_null(&lru->reclaimable, struct sgx_epc_page, list); if (!epc_page) break; @@ -330,9 +340,10 @@ void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, size_t nr_to_scan, } /** - * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers + * __sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim * @ignore_age: Reclaim a page even if it is young + * @epc_cg: EPC cgroup from which to reclaim * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have @@ -346,7 +357,8 @@ void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, size_t nr_to_scan, * problematic as it would increase the lock contention too much, which would * halt forward progress. */ -size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age, + struct sgx_epc_cgroup *epc_cg) { struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; struct sgx_epc_page *epc_page, *tmp; @@ -357,7 +369,15 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) size_t ret; size_t i; - sgx_isolate_epc_pages(&sgx_global_lru, nr_to_scan, &iso); + /* + * If a specific cgroup is not being targeted, take from the global + * list first, even when cgroups are enabled. If there are + * pages on the global LRU then they should get reclaimed asap. 
+ */ + if (!IS_ENABLED(CONFIG_CGROUP_SGX_EPC) || !epc_cg) + sgx_isolate_epc_pages(&sgx_global_lru, &nr_to_scan, &iso); + + sgx_epc_cgroup_isolate_pages(epc_cg, &nr_to_scan, &iso); if (list_empty(&iso)) return 0; @@ -410,11 +430,6 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) return i; } -static bool sgx_can_reclaim(void) -{ - return !list_empty(&sgx_global_lru.reclaimable); -} - static bool sgx_should_reclaim(unsigned long watermark) { return atomic_long_read(&sgx_nr_free_pages) < watermark && @@ -429,7 +444,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); } static int ksgxd(void *p) @@ -452,7 +467,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); cond_resched(); } @@ -606,6 +621,11 @@ int sgx_drop_epc_page(struct sgx_epc_page *page) struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) { struct sgx_epc_page *page; + struct sgx_epc_cgroup *epc_cg; + + epc_cg = sgx_epc_cgroup_try_charge(reclaim); + if (IS_ERR(epc_cg)) + return ERR_CAST(epc_cg); for ( ; ; ) { page = __sgx_alloc_epc_page(); @@ -614,8 +634,10 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - if (!sgx_can_reclaim()) - return ERR_PTR(-ENOMEM); + if (!sgx_can_reclaim()) { + page = ERR_PTR(-ENOMEM); + break; + } if (!reclaim) { page = ERR_PTR(-EBUSY); @@ -627,10 +649,17 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); cond_resched(); } + if (!IS_ERR(page)) { + WARN_ON_ONCE(page->epc_cg); + page->epc_cg = epc_cg; + } else { + sgx_epc_cgroup_uncharge(epc_cg); + } + if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) wake_up(&ksgxd_waitq); @@ -653,6 +682,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) WARN_ON_ONCE(page->flags & (SGX_EPC_PAGE_STATE_MASK)); + if (page->epc_cg) { + sgx_epc_cgroup_uncharge(page->epc_cg); + page->epc_cg = NULL; + } + spin_lock(&node->lock); page->encl_page = NULL; @@ -663,6 +697,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) page->flags = SGX_EPC_PAGE_FREE; spin_unlock(&node->lock); + atomic_long_inc(&sgx_nr_free_pages); } @@ -832,6 +867,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].flags = 0; section->pages[i].encl_page = NULL; section->pages[i].poison = 0; + section->pages[i].epc_cg = NULL; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -976,6 +1012,7 @@ static void __init arch_update_sysfs_visibility(int nid) {} static bool __init sgx_page_cache_init(void) { u32 eax, ebx, ecx, edx, type; + u64 capacity = 0; u64 pa, size; int nid; int i; @@ -1026,6 +1063,7 @@ static bool __init sgx_page_cache_init(void) sgx_epc_sections[i].node = &sgx_numa_nodes[nid]; sgx_numa_nodes[nid].size += size; + capacity += size; sgx_nr_epc_sections++; } @@ -1035,6 +1073,9 @@ static bool __init sgx_page_cache_init(void) return false; } + misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, capacity); + sgx_epc_total_pages = capacity >> PAGE_SHIFT; + return true; } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index c6b3c90db0fa..36217032433b 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ 
b/arch/x86/kernel/cpu/sgx/sgx.h @@ -19,6 +19,11 @@ #define SGX_MAX_EPC_SECTIONS 8 #define SGX_EEXTEND_BLOCK_SIZE 256 + +/* + * Maximum number of pages to scan for reclaiming. + */ +#define SGX_NR_TO_SCAN_MAX 32UL #define SGX_NR_TO_SCAN 16 #define SGX_NR_LOW_PAGES 32 #define SGX_NR_HIGH_PAGES 64 @@ -70,6 +75,8 @@ enum sgx_epc_page_state { /* flag for pages owned by a sgx_encl struct */ #define SGX_EPC_OWNER_ENCL BIT(4) +struct sgx_epc_cgroup; + struct sgx_epc_page { unsigned int section; u16 flags; @@ -79,6 +86,7 @@ struct sgx_epc_page { struct sgx_encl *encl; }; struct list_head list; + struct sgx_epc_cgroup *epc_cg; }; static inline void sgx_epc_page_reset_state(struct sgx_epc_page *page) @@ -127,6 +135,7 @@ struct sgx_epc_section { struct sgx_numa_node *node; }; +extern unsigned long sgx_epc_total_pages; extern struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static inline unsigned long sgx_get_epc_phys_addr(struct sgx_epc_page *page) @@ -175,8 +184,9 @@ void sgx_reclaim_direct(void); void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags); int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); -size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age); -void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, size_t nr_to_scan, +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age, + struct sgx_epc_cgroup *epc_cg); +void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, size_t *nr_to_scan, struct list_head *dst); bool sgx_epc_oom(struct sgx_epc_lru_lists *lrus); From patchwork Wed Jul 12 23:01:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haitao Huang X-Patchwork-Id: 119395 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1472184vqm; Wed, 12 Jul 2023 16:15:13 -0700 (PDT) X-Google-Smtp-Source: APBJJlEzYTTzi9atd2T9rrTY9eStXPMcU1GfDNUV1YTscJ03fMhE8CuZMAxkhSWt4PaY0gFSgAPz X-Received: by 2002:a54:479a:0:b0:398:f48:eaf with SMTP id o26-20020a54479a000000b003980f480eafmr19099226oic.26.1689203712897; Wed, 12 Jul 2023 16:15:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689203712; cv=none; d=google.com; s=arc-20160816; b=zmOu1da3SWdH/S0G7Gh8j2aQS4SFSDmly8bHIeCJ8B/+OXq7Q8RpisPE3LZ8Ak833N wu0q73Afs/To8xFXxrnmDmbUrx4MQSpbb7UbXygCCGUkmGgU6Ubmn0pP+ggC2uHkfeBp bx+JOcpC36a2n7KvbT3OYSY8GPZzV+2knbNz6sUoumdBRNtsX47qAM6Ox0X19u/QobF0 Fgw+aUnbX+T49eIFkt7fBm8S6H97tAYR7I9ycOco7IcpsBn4upW6ZboSTZnmCbdWKmkF sLmCC7Ul6x57WdXVWWo0X8y8VBH6pRMRhKfK01Vj/lUWQN1M7ldq1NCeS/WNvPExI3tW 0QtA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Oz4PHPugjbEz7SRKypC7oP9vZzwwiD5rxThtF33hb1I=; fh=MuSpYD2jypKSps6dsHiiqRkFbia+Bc/N/IxBXiK1Cw8=; b=02x/mT0hhBOE/gWRST8iXhf3lcHbYxziKszLpBQCJbTueAmamrc9hLPnR8mTnEdbGu oQtX2xH2RB2EWDQLUVI9/Kp/AcbjKvPmyw1GP+9xlRmS1bnoenhymAwAF4kPRnbJ0C/z f6aKlkIu4j7IU7Jf4EocKB0YMPNXzkL4g6QeNVOsbcFlvZyO+QTGL9mJZDkv4JPEZxNC 3o1Sc+a1szX4FMyj4C2ShELBJSYOVsP5AFtjhguk19imJh078am8L7v6Ek4DIQwW90Cu 7PDct5cfwwEnXkzQiA4NMxKg/5zFV8P8uF80uATzLS6WyvU4J4KziK04Hz7sGK4yIsLQ tMaQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="oANffFx/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as 
permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d5-20020a056a00244500b006825f34417fsi4032381pfj.238.2023.07.12.16.15.00; Wed, 12 Jul 2023 16:15:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="oANffFx/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233205AbjGLXDh (ORCPT + 99 others); Wed, 12 Jul 2023 19:03:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50698 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231461AbjGLXCY (ORCPT ); Wed, 12 Jul 2023 19:02:24 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 35769173C; Wed, 12 Jul 2023 16:02:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1689202939; x=1720738939; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3e6QAWNAF76VcJMJbXCen6twY/3D+MIdev0F0r8XF58=; b=oANffFx/YGS2HrYl6AeDIxCZkLz6ksoJBoPchd3wx2flotjBff7sN+6d LESeEwRrbdzfSWFlBb8k7N5cFOORNp/CdluwmuBiWUa5b5d5PCWzChBdx jdyJWb9SGrLTG8l7DK1cYBu2hxVj5L4MO4PNDwA98HmpaPGn3AFg/0za3 Nr6yJX1OUphVOSOiUqXungmEmTFXMqkFRCgRe7b/eMsCz7HXcrAQai2vQ r36H2emfNvTGoeg3UrZjsSvQVQ12sJYDix9OCsLqiU2uvBggiGrigUxMV mZi36gWRI0yAff4757UqMzoKNbBlfqPqHdOAKY4LZSzKhzq9JmeBtn5Ru A==; X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="428774196" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="428774196" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jul 2023 16:02:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10769"; a="835338672" X-IronPort-AV: E=Sophos;i="6.01,200,1684825200"; d="scan'208";a="835338672" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by fmsmga002.fm.intel.com with ESMTP; 12 Jul 2023 16:02:17 -0700 From: Haitao Huang To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. 
From patchwork Wed Jul 12 23:01:56 2023
X-Patchwork-Submitter: Haitao Huang
X-Patchwork-Id: 119395
From: Haitao Huang
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org,
    linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
    cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar ,
    Borislav Petkov , x86@kernel.org, "H. Peter Anvin" , Jonathan Corbet
Cc: kai.huang@intel.com, reinette.chatre@intel.com,
    Kristen Carlson Accardi , zhiquan1.li@intel.com, seanjc@google.com,
    bagasdotme@gmail.com, linux-doc@vger.kernel.org, zhanb@microsoft.com,
    anakrish@microsoft.com, mikko.ylinen@linux.intel.com
Subject: [PATCH v3 22/28] Docs/x86/sgx: Add description for cgroup support
Date: Wed, 12 Jul 2023 16:01:56 -0700
Message-Id: <20230712230202.47929-23-haitao.huang@linux.intel.com>
In-Reply-To: <20230712230202.47929-1-haitao.huang@linux.intel.com>
References: <20230712230202.47929-1-haitao.huang@linux.intel.com>

From: Kristen Carlson Accardi

Add initial documentation of how to regulate the distribution of SGX
Enclave Page Cache (EPC) memory via the Miscellaneous cgroup controller.

Signed-off-by: Sean Christopherson
Signed-off-by: Kristen Carlson Accardi
Cc: Sean Christopherson
Reviewed-by: Bagas Sanjaya
---
 Documentation/arch/x86/sgx.rst | 77 ++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/Documentation/arch/x86/sgx.rst b/Documentation/arch/x86/sgx.rst
index 2bcbffacbed5..f6ca5594dcf2 100644
--- a/Documentation/arch/x86/sgx.rst
+++ b/Documentation/arch/x86/sgx.rst
@@ -300,3 +300,80 @@ to expected failures and handle them as follows:
 first call. It indicates a bug in the kernel or the userspace client if
 any of the second round of ``SGX_IOC_VEPC_REMOVE_ALL`` calls has a
 return code other than 0.
+
+
+Cgroup Support
+==============
+
+The "sgx_epc" resource within the Miscellaneous cgroup controller regulates
+distribution of SGX EPC memory, which is a subset of system RAM that
+is used to provide SGX-enabled applications with protected memory,
+and is otherwise inaccessible, i.e. shows up as reserved in
+/proc/iomem and cannot be read/written outside of an SGX enclave.
+
+Although current systems implement EPC by stealing memory from RAM,
+for all intents and purposes the EPC is independent from normal system
+memory, e.g. must be reserved at boot from RAM and cannot be converted
+between EPC and normal memory while the system is running. The EPC is
+managed by the SGX subsystem and is not accounted by the memory
+controller. Note that this is true only for EPC memory itself, i.e.
+normal memory allocations related to SGX and EPC memory, e.g. the
+backing memory for evicted EPC pages, are accounted, limited and
+protected by the memory controller.
+
+Much like normal system memory, EPC memory can be overcommitted via
+virtual memory techniques and pages can be swapped out of the EPC
+to their backing store (normal system memory allocated via shmem).
+The SGX EPC subsystem is analogous to the memory subsystem, and
+it implements limit and protection models for EPC memory.
+
+SGX EPC Interface Files
+-----------------------
+
+For a generic description of the Miscellaneous controller interface
+files, please see Documentation/admin-guide/cgroup-v2.rst.
+
+All SGX EPC memory amounts are in bytes unless explicitly stated
+otherwise.
+If a value which is not PAGE_SIZE aligned is written, the actual
+value used by the controller will be rounded down to the closest
+PAGE_SIZE multiple.
+
+  misc.capacity
+        A read-only flat-keyed file shown only in the root cgroup.
+        The sgx_epc resource will show the total amount of EPC
+        memory available on the platform.
+
+  misc.current
+        A read-only flat-keyed file shown in the non-root cgroups.
+        The sgx_epc resource will show the current active EPC memory
+        usage of the cgroup and its descendants. EPC pages that are
+        swapped out to backing RAM are not included in the current count.
+
+  misc.max
+        A read-write single value file which exists on non-root
+        cgroups. The sgx_epc resource will show the EPC usage
+        hard limit. The default is "max".
+
+        If a cgroup's EPC usage reaches this limit, EPC allocations,
+        e.g. for page fault handling, will be blocked until EPC can
+        be reclaimed from the cgroup. If EPC cannot be reclaimed in
+        a timely manner, reclaim will be forced, e.g. by ignoring LRU.
+
+  misc.events
+        A read-write flat-keyed file which exists on non-root cgroups.
+        Writes to the file reset the event counters to zero. A value
+        change in this file generates a file modified event.
+
+          max
+                The number of times the cgroup has triggered a reclaim
+                due to its EPC usage approaching (or exceeding) its max
+                EPC boundary.
+
+Migration
+---------
+
+Once an EPC page is charged to a cgroup (during allocation), it
+remains charged to the original cgroup until the page is released
+or reclaimed. Migrating a process to a different cgroup doesn't
+move the EPC charges that it incurred while in the previous cgroup
+to its new cgroup.
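As a usage illustration of the interface files documented above (not part of the patch), the sketch below writes an "sgx_epc <bytes>" hard limit to misc.max and reads back misc.current, following the misc controller's "<resource> <value>" conventions from Documentation/admin-guide/cgroup-v2.rst. The cgroup path /sys/fs/cgroup/sgx_app and the 64 MiB limit are assumptions chosen for the example.

/*
 * Illustrative only: cap a hypothetical "sgx_app" cgroup's EPC usage at
 * 64 MiB via misc.max, then read back misc.current.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	const char *cg = "/sys/fs/cgroup/sgx_app";	/* assumed cgroup path */
	char path[256], buf[256];
	ssize_t n;
	int fd;

	/*
	 * misc.max takes "<resource> <bytes>"; per the documentation above,
	 * values that are not PAGE_SIZE aligned are rounded down.
	 */
	snprintf(path, sizeof(path), "%s/misc.max", cg);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return 1;
	dprintf(fd, "sgx_epc %lu\n", 64UL << 20);
	close(fd);

	/* misc.current lists per-resource usage, e.g. "sgx_epc 1638400". */
	snprintf(path, sizeof(path), "%s/misc.current", cg);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 1;
	n = read(fd, buf, sizeof(buf) - 1);
	if (n > 0) {
		buf[n] = '\0';
		fputs(buf, stdout);
	}
	close(fd);

	return 0;
}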