From patchwork Fri Dec 2 18:36:37 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29059
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org,
	linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 01/18] x86/sgx: Call cond_resched() at the end of sgx_reclaim_pages()
Date: Fri, 2 Dec 2022 10:36:37 -0800
Message-Id: <20221202183655.3767674-2-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

From: Sean Christopherson

In order to avoid repetition of cond_resched() in ksgxd() and
sgx_alloc_epc_page(), move the invocation of post-reclaim cond_resched()
inside sgx_reclaim_pages().
Except in the case of sgx_reclaim_direct(), sgx_reclaim_pages() is
always called in a loop and is always followed by a call to
cond_resched(). This will hold true for the EPC cgroup as well, which
adds even more calls to sgx_reclaim_pages() and thus cond_resched().
Calls to sgx_reclaim_direct() may be performance sensitive. Allow
sgx_reclaim_direct() to avoid the cond_resched() call by moving the
original sgx_reclaim_pages() call to __sgx_reclaim_pages() and then have
sgx_reclaim_pages() become a wrapper around that call with a
cond_resched().

Signed-off-by: Sean Christopherson
Signed-off-by: Kristen Carlson Accardi
Cc: Sean Christopherson
---
 arch/x86/kernel/cpu/sgx/main.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 160c8dbee0ab..ffce6fc70a1f 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -287,7 +287,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
  * problematic as it would increase the lock contention too much, which would
  * halt forward progress.
  */
-static void sgx_reclaim_pages(void)
+static void __sgx_reclaim_pages(void)
 {
 	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
 	struct sgx_backing backing[SGX_NR_TO_SCAN];
@@ -369,6 +369,12 @@ static void sgx_reclaim_pages(void)
 	}
 }
 
+static void sgx_reclaim_pages(void)
+{
+	__sgx_reclaim_pages();
+	cond_resched();
+}
+
 static bool sgx_should_reclaim(unsigned long watermark)
 {
 	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
@@ -378,12 +384,14 @@ static bool sgx_should_reclaim(unsigned long watermark)
 /*
  * sgx_reclaim_direct() should be called (without enclave's mutex held)
  * in locations where SGX memory resources might be low and might be
- * needed in order to make forward progress.
+ * needed in order to make forward progress. This call to
+ * __sgx_reclaim_pages() avoids the cond_resched() in sgx_reclaim_pages()
+ * to improve performance.
  */
 void sgx_reclaim_direct(void)
 {
 	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
-		sgx_reclaim_pages();
+		__sgx_reclaim_pages();
 }
 
 static int ksgxd(void *p)
@@ -410,8 +418,6 @@ static int ksgxd(void *p)
 
 		if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
 			sgx_reclaim_pages();
-
-		cond_resched();
 	}
 
 	return 0;
@@ -582,7 +588,6 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 		}
 
 		sgx_reclaim_pages();
-		cond_resched();
 	}
 
 	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
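The split above is a common kernel wrapper pattern: looping callers get the scheduling point for free via the wrapper, while a latency-sensitive path can call the bare `__`-prefixed variant and skip it. A minimal user-space sketch of that pattern, with counters standing in for real work and all names being illustrative stand-ins rather than the kernel implementation:

```c
#include <assert.h>

/* Counters stand in for actual reclaim work and for cond_resched(). */
static int reclaim_passes;
static int resched_points;

/* Stand-in for the kernel's cond_resched(). */
static void fake_cond_resched(void)
{
	resched_points++;
}

/* Bare reclaim pass: no scheduling point, suitable for hot paths. */
static void __reclaim_pages(void)
{
	reclaim_passes++;
}

/* Wrapper used by looping callers: one pass, then yield the CPU. */
static void reclaim_pages(void)
{
	__reclaim_pages();
	fake_cond_resched();
}
```

The benefit is that every looping call site drops its trailing `cond_resched()` without the performance-sensitive direct-reclaim path paying for one.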
From patchwork Fri Dec 2 18:36:38 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29065
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org,
	linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 02/18] x86/sgx: Store struct sgx_encl when allocating new VA pages
Date: Fri, 2 Dec 2022 10:36:38 -0800
Message-Id: <20221202183655.3767674-3-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

From: Sean Christopherson

When allocating new Version Array (VA) pages, pass the struct sgx_encl
of the enclave that is allocating the page. sgx_alloc_epc_page() will
store this value in the encl_owner field of the struct sgx_epc_page.

In a later patch, VA pages will be placed in an unreclaimable queue.
When the cgroup max limit is reached, no reclaimable pages remain, and
the enclave must be OOM-killed, all the VA pages associated with that
enclave can then be uncharged and freed.
Signed-off-by: Sean Christopherson
Signed-off-by: Kristen Carlson Accardi
Cc: Sean Christopherson
---
 arch/x86/kernel/cpu/sgx/encl.c  | 5 +++--
 arch/x86/kernel/cpu/sgx/encl.h  | 2 +-
 arch/x86/kernel/cpu/sgx/ioctl.c | 2 +-
 arch/x86/kernel/cpu/sgx/sgx.h   | 1 +
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index f40d64206ded..4eaf9d21e71b 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -1193,6 +1193,7 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
 
 /**
  * sgx_alloc_va_page() - Allocate a Version Array (VA) page
+ * @encl:    The enclave that this page is allocated to.
  * @reclaim: Reclaim EPC pages directly if none available. Enclave
  *           mutex should not be held if this is set.
  *
@@ -1202,12 +1203,12 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
  *   a VA page,
  *   -errno otherwise
  */
-struct sgx_epc_page *sgx_alloc_va_page(bool reclaim)
+struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim)
 {
 	struct sgx_epc_page *epc_page;
 	int ret;
 
-	epc_page = sgx_alloc_epc_page(NULL, reclaim);
+	epc_page = sgx_alloc_epc_page(encl, reclaim);
 	if (IS_ERR(epc_page))
 		return ERR_CAST(epc_page);
 
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index f94ff14c9486..831d63f80f5a 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -116,7 +116,7 @@ struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 					  unsigned long offset,
 					  u64 secinfo_flags);
 void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
-struct sgx_epc_page *sgx_alloc_va_page(bool reclaim);
+struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
 bool sgx_va_page_full(struct sgx_va_page *va_page);
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index ebe79d60619f..9a1bb3c3211a 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -30,7 +30,7 @@ struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim)
 		if (!va_page)
 			return ERR_PTR(-ENOMEM);
 
-		va_page->epc_page = sgx_alloc_va_page(reclaim);
+		va_page->epc_page = sgx_alloc_va_page(encl, reclaim);
 		if (IS_ERR(va_page->epc_page)) {
 			err = ERR_CAST(va_page->epc_page);
 			kfree(va_page);
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index d16a8baa28d4..39cb15a8abcb 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -39,6 +39,7 @@ struct sgx_epc_page {
 		struct sgx_encl_page *encl_owner;
 		/* Use when SGX_EPC_PAGE_KVM_GUEST set in ->flags: */
 		void __user *vepc_vaddr;
+		struct sgx_encl *encl;
 	};
 	struct list_head list;
 };
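The union member added above lets a page record which enclave allocated it, so teardown can later walk pages and release the ones belonging to a dying enclave. A minimal user-space sketch of that owner-tracking idea; all types and names here are illustrative stand-ins for the kernel's `struct sgx_encl`/`struct sgx_epc_page`, not the real implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enclave handle; the kernel uses struct sgx_encl. */
struct encl {
	int id;
};

/* A page records either its owning enclave or a guest mapping,
 * depending on how it is being used (mirroring the union above). */
struct epc_page {
	union {
		struct encl *encl;  /* VA page: owning enclave */
		void *vepc_vaddr;   /* KVM guest page: user mapping */
	};
	int in_use;
};

/* Release every in-use page owned by @owner; returns how many were
 * freed. This mirrors the OOM-teardown walk the commit message
 * describes, where all of an enclave's VA pages are uncharged. */
static int release_pages_of(struct epc_page *pages, size_t n,
			    struct encl *owner)
{
	int freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].in_use && pages[i].encl == owner) {
			pages[i].in_use = 0;
			freed++;
		}
	}
	return freed;
}
```

Storing the owner in the existing union costs no extra space per page, which matters when there is one `struct sgx_epc_page` per EPC page on the system.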
From patchwork Fri Dec 2 18:36:39 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29060
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org,
	linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 03/18] x86/sgx: Add 'struct sgx_epc_lru_lists' to encapsulate lru list(s)
Date: Fri, 2 Dec 2022 10:36:39 -0800
Message-Id: <20221202183655.3767674-4-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

Introduce a data structure that wraps the existing reclaimable list and
its spinlock in a struct. This minimizes the code changes needed to
handle multiple LRUs, as well as the reclaimable and non-reclaimable
lists that will be introduced and used by SGX EPC cgroups.

Signed-off-by: Sean Christopherson
Signed-off-by: Kristen Carlson Accardi
Cc: Sean Christopherson
---
 arch/x86/kernel/cpu/sgx/sgx.h | 65 +++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 39cb15a8abcb..5e6d88438fae 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -90,6 +90,71 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page)
 	return section->virt_addr + index * PAGE_SIZE;
 }
 
+/*
+ * This data structure wraps a list of reclaimable EPC pages, and a list of
+ * non-reclaimable EPC pages and is used to implement a LRU policy during
+ * reclamation.
+ */
+struct sgx_epc_lru_lists {
+	spinlock_t lock;
+	struct list_head reclaimable;
+	struct list_head unreclaimable;
+};
+
+static inline void sgx_lru_init(struct sgx_epc_lru_lists *lrus)
+{
+	spin_lock_init(&lrus->lock);
+	INIT_LIST_HEAD(&lrus->reclaimable);
+	INIT_LIST_HEAD(&lrus->unreclaimable);
+}
+
+/*
+ * Must be called with queue lock acquired
+ */
+static inline void __sgx_epc_page_list_push(struct list_head *list, struct sgx_epc_page *page)
+{
+	list_add_tail(&page->list, list);
+}
+
+/*
+ * Must be called with queue lock acquired
+ */
+static inline struct sgx_epc_page * __sgx_epc_page_list_pop(struct list_head *list)
+{
+	struct sgx_epc_page *epc_page;
+
+	if (list_empty(list))
+		return NULL;
+
+	epc_page = list_first_entry(list, struct sgx_epc_page, list);
+	list_del_init(&epc_page->list);
+	return epc_page;
+}
+
+static inline struct sgx_epc_page *
+sgx_epc_pop_reclaimable(struct sgx_epc_lru_lists *lrus)
+{
+	return __sgx_epc_page_list_pop(&(lrus)->reclaimable);
+}
+
+static inline void sgx_epc_push_reclaimable(struct sgx_epc_lru_lists *lrus,
+					    struct sgx_epc_page *page)
+{
+	__sgx_epc_page_list_push(&(lrus)->reclaimable, page);
+}
+
+static inline struct sgx_epc_page *
+sgx_epc_pop_unreclaimable(struct sgx_epc_lru_lists *lrus)
+{
+	return __sgx_epc_page_list_pop(&(lrus)->unreclaimable);
+}
+
+static inline void sgx_epc_push_unreclaimable(struct sgx_epc_lru_lists *lrus,
+					      struct sgx_epc_page *page)
+{
+	__sgx_epc_page_list_push(&(lrus)->unreclaimable, page);
+}
+
 struct sgx_epc_page *__sgx_alloc_epc_page(void);
 void sgx_free_epc_page(struct sgx_epc_page *page);
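The push/pop helpers in this patch reduce to plain tail-insert/head-remove queue operations under the LRU lock: pushing appends at the tail, popping takes the oldest entry from the head, which is what makes the policy LRU. A minimal user-space sketch of that FIFO discipline, using a tiny singly linked list in place of the kernel's `struct list_head` and omitting locking; names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/* Tiny intrusive node standing in for the kernel's struct list_head. */
struct node {
	struct node *next;
};

struct lru_list {
	struct node *head;
	struct node *tail;
};

/* Append at the tail, like __sgx_epc_page_list_push()/list_add_tail(). */
static void lru_push(struct lru_list *l, struct node *n)
{
	n->next = NULL;
	if (l->tail)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;
}

/* Remove from the head, returning NULL when the queue is empty, like
 * __sgx_epc_page_list_pop(). The head is the least-recently-added
 * entry, so it is the first reclaim candidate. */
static struct node *lru_pop(struct lru_list *l)
{
	struct node *n = l->head;

	if (!n)
		return NULL;
	l->head = n->next;
	if (!l->head)
		l->tail = NULL;
	return n;
}
```

Keeping the lock inside the same struct as the lists is what lets later patches instantiate one `sgx_epc_lru_lists` per cgroup without touching the helper code.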
From patchwork Fri Dec 2 18:36:40 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29062
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org,
	linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
	cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 04/18] x86/sgx: Use sgx_epc_lru_lists for existing active page list
Date: Fri, 2 Dec 2022 10:36:40 -0800
Message-Id: <20221202183655.3767674-5-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

Replace the existing sgx_active_page_list and its spinlock with
a global sgx_epc_lru_lists struct.
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 39 +++++++++++++++++----------------- 1 file changed, 19 insertions(+), 20 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index ffce6fc70a1f..447cf4b8580c 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -26,10 +26,9 @@ static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed - * with sgx_reclaimer_lock acquired. + * with sgx_global_lru.lock acquired. */ -static LIST_HEAD(sgx_active_page_list); -static DEFINE_SPINLOCK(sgx_reclaimer_lock); +static struct sgx_epc_lru_lists sgx_global_lru; static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); @@ -298,14 +297,12 @@ static void __sgx_reclaim_pages(void) int ret; int i; - spin_lock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); for (i = 0; i < SGX_NR_TO_SCAN; i++) { - if (list_empty(&sgx_active_page_list)) + epc_page = sgx_epc_pop_reclaimable(&sgx_global_lru); + if (!epc_page) break; - epc_page = list_first_entry(&sgx_active_page_list, - struct sgx_epc_page, list); - list_del_init(&epc_page->list); encl_page = epc_page->encl_owner; if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) @@ -316,7 +313,7 @@ static void __sgx_reclaim_pages(void) */ epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; } - spin_unlock(&sgx_reclaimer_lock); + spin_unlock(&sgx_global_lru.lock); for (i = 0; i < cnt; i++) { epc_page = chunk[i]; @@ -339,9 +336,9 @@ static void __sgx_reclaim_pages(void) continue; skip: - spin_lock(&sgx_reclaimer_lock); - list_add_tail(&epc_page->list, &sgx_active_page_list); - spin_unlock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); + sgx_epc_push_reclaimable(&sgx_global_lru, epc_page); + spin_unlock(&sgx_global_lru.lock); kref_put(&encl_page->encl->refcount, sgx_encl_release); @@ -378,7 +375,7 @@ static 
void sgx_reclaim_pages(void) static bool sgx_should_reclaim(unsigned long watermark) { return atomic_long_read(&sgx_nr_free_pages) < watermark && - !list_empty(&sgx_active_page_list); + !list_empty(&sgx_global_lru.reclaimable); } /* @@ -433,6 +430,8 @@ static bool __init sgx_page_reclaimer_init(void) ksgxd_tsk = tsk; + sgx_lru_init(&sgx_global_lru); + return true; } @@ -508,10 +507,10 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) */ void sgx_mark_page_reclaimable(struct sgx_epc_page *page) { - spin_lock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED; - list_add_tail(&page->list, &sgx_active_page_list); - spin_unlock(&sgx_reclaimer_lock); + sgx_epc_push_reclaimable(&sgx_global_lru, page); + spin_unlock(&sgx_global_lru.lock); } /** @@ -526,18 +525,18 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page) */ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) { - spin_lock(&sgx_reclaimer_lock); + spin_lock(&sgx_global_lru.lock); if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { /* The page is being reclaimed. 
*/ if (list_empty(&page->list)) { - spin_unlock(&sgx_reclaimer_lock); + spin_unlock(&sgx_global_lru.lock); return -EBUSY; } list_del(&page->list); page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; } - spin_unlock(&sgx_reclaimer_lock); + spin_unlock(&sgx_global_lru.lock); return 0; } @@ -574,7 +573,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - if (list_empty(&sgx_active_page_list)) + if (list_empty(&sgx_global_lru.reclaimable)) return ERR_PTR(-ENOMEM); if (!reclaim) {

From patchwork Fri Dec 2 18:36:41 2022
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 05/18] x86/sgx: Track epc pages on reclaimable or unreclaimable lists
Date: Fri, 2 Dec 2022 10:36:41 -0800
Message-Id: <20221202183655.3767674-6-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

Replace the functions sgx_mark_page_reclaimable() and
sgx_unmark_page_reclaimable() with sgx_record_epc_page() and
sgx_drop_epc_page(). sgx_record_epc_page() will add the epc_page to the
correct "reclaimable" or "unreclaimable" list in the sgx_epc_lru_lists
struct, and sgx_drop_epc_page() will delete the page from its LRU list.

Tracking pages that are not tracked by the reclaimer on the
sgx_epc_lru_lists "unreclaimable" list allows an OOM event to free all
the pages in use by an enclave, regardless of whether they were
reclaimable pages or not.
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/encl.c | 10 +++++++--- arch/x86/kernel/cpu/sgx/ioctl.c | 11 +++++++---- arch/x86/kernel/cpu/sgx/main.c | 26 +++++++++++++++----------- arch/x86/kernel/cpu/sgx/sgx.h | 4 ++-- arch/x86/kernel/cpu/sgx/virt.c | 28 ++++++++++++++++++++-------- 5 files changed, 51 insertions(+), 28 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 4eaf9d21e71b..4683da9ef4f1 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -252,6 +252,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, epc_page = sgx_encl_eldu(&encl->secs, NULL); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); + sgx_record_epc_page(epc_page, 0); } epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); @@ -259,7 +260,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, return ERR_CAST(epc_page); encl->secs_child_cnt++; - sgx_mark_page_reclaimable(entry->epc_page); + sgx_record_epc_page(entry->epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED); return entry; } @@ -375,7 +376,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, encl_page->type = SGX_PAGE_TYPE_REG; encl->secs_child_cnt++; - sgx_mark_page_reclaimable(encl_page->epc_page); + sgx_record_epc_page(encl_page->epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED); phys_addr = sgx_get_epc_phys_addr(epc_page); /* @@ -687,7 +688,7 @@ void sgx_encl_release(struct kref *ref) * The page and its radix tree entry cannot be freed * if the page is being held by the reclaimer. 
*/ - if (sgx_unmark_page_reclaimable(entry->epc_page)) + if (sgx_drop_epc_page(entry->epc_page)) continue; sgx_encl_free_epc_page(entry->epc_page); @@ -703,6 +704,7 @@ void sgx_encl_release(struct kref *ref) xa_destroy(&encl->page_array); if (!encl->secs_child_cnt && encl->secs.epc_page) { + sgx_drop_epc_page(encl->secs.epc_page); sgx_encl_free_epc_page(encl->secs.epc_page); encl->secs.epc_page = NULL; } @@ -711,6 +713,7 @@ void sgx_encl_release(struct kref *ref) va_page = list_first_entry(&encl->va_pages, struct sgx_va_page, list); list_del(&va_page->list); + sgx_drop_epc_page(va_page->epc_page); sgx_encl_free_epc_page(va_page->epc_page); kfree(va_page); } @@ -1218,6 +1221,7 @@ struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) sgx_encl_free_epc_page(epc_page); return ERR_PTR(-EFAULT); } + sgx_record_epc_page(epc_page, 0); return epc_page; } diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 9a1bb3c3211a..aca80a3f38a1 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -48,6 +48,7 @@ void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page) encl->page_cnt--; if (va_page) { + sgx_drop_epc_page(va_page->epc_page); sgx_encl_free_epc_page(va_page->epc_page); list_del(&va_page->list); kfree(va_page); @@ -113,6 +114,8 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) encl->attributes = secs->attributes; encl->attributes_mask = SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT | SGX_ATTR_KSS; + sgx_record_epc_page(encl->secs.epc_page, 0); + /* Set only after completion, as encl->lock has not been taken. 
*/ set_bit(SGX_ENCL_CREATED, &encl->flags); @@ -322,7 +325,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, goto err_out; } - sgx_mark_page_reclaimable(encl_page->epc_page); + sgx_record_epc_page(encl_page->epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED); mutex_unlock(&encl->lock); mmap_read_unlock(current->mm); return ret; @@ -958,7 +961,7 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl, * Prevent page from being reclaimed while mutex * is released. */ - if (sgx_unmark_page_reclaimable(entry->epc_page)) { + if (sgx_drop_epc_page(entry->epc_page)) { ret = -EAGAIN; goto out_entry_changed; } @@ -973,7 +976,7 @@ static long sgx_enclave_modify_types(struct sgx_encl *encl, mutex_lock(&encl->lock); - sgx_mark_page_reclaimable(entry->epc_page); + sgx_record_epc_page(entry->epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED); } /* Change EPC type */ @@ -1130,7 +1133,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl, goto out_unlock; } - if (sgx_unmark_page_reclaimable(entry->epc_page)) { + if (sgx_drop_epc_page(entry->epc_page)) { ret = -EBUSY; goto out_unlock; } diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 447cf4b8580c..ecd7f8e704cc 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -262,7 +262,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, goto out; sgx_encl_ewb(encl->secs.epc_page, &secs_backing); - + sgx_drop_epc_page(encl->secs.epc_page); sgx_encl_free_epc_page(encl->secs.epc_page); encl->secs.epc_page = NULL; @@ -499,31 +499,35 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) } /** - * sgx_mark_page_reclaimable() - Mark a page as reclaimable + * sgx_record_epc_page() - Add a page to the LRU tracking * @page: EPC page * - * Mark a page as reclaimable and add it to the active page list. Pages - * are automatically removed from the active list when freed. 
+ * Mark a page with the specified flags and add it to the appropriate + * (un)reclaimable list. */ -void sgx_mark_page_reclaimable(struct sgx_epc_page *page) +void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) { spin_lock(&sgx_global_lru.lock); - page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED; - sgx_epc_push_reclaimable(&sgx_global_lru, page); + WARN_ON(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); + page->flags |= flags; + if (flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) + sgx_epc_push_reclaimable(&sgx_global_lru, page); + else + sgx_epc_push_unreclaimable(&sgx_global_lru, page); spin_unlock(&sgx_global_lru.lock); } /** - * sgx_unmark_page_reclaimable() - Remove a page from the reclaim list + * sgx_drop_epc_page() - Remove a page from a LRU list * @page: EPC page * - * Clear the reclaimable flag and remove the page from the active page list. + * Clear the reclaimable flag if set and remove the page from its LRU. * * Return: * 0 on success, * -EBUSY if the page is in the process of being reclaimed */ -int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) +int sgx_drop_epc_page(struct sgx_epc_page *page) { spin_lock(&sgx_global_lru.lock); if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { @@ -533,9 +537,9 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) return -EBUSY; } - list_del(&page->list); page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; } + list_del(&page->list); spin_unlock(&sgx_global_lru.lock); return 0; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 5e6d88438fae..ba4338b7303f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -159,8 +159,8 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void); void sgx_free_epc_page(struct sgx_epc_page *page); void sgx_reclaim_direct(void); -void sgx_mark_page_reclaimable(struct sgx_epc_page *page); -int sgx_unmark_page_reclaimable(struct sgx_epc_page *page); +void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long 
flags); +int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); void sgx_ipi_cb(void *info); diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c index 776ae5c1c032..0eabc4db91d0 100644 --- a/arch/x86/kernel/cpu/sgx/virt.c +++ b/arch/x86/kernel/cpu/sgx/virt.c @@ -64,6 +64,8 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc, goto err_delete; } + sgx_record_epc_page(epc_page, 0); + return 0; err_delete: @@ -148,6 +150,7 @@ static int sgx_vepc_free_page(struct sgx_epc_page *epc_page) return ret; } + sgx_drop_epc_page(epc_page); sgx_free_epc_page(epc_page); return 0; } @@ -220,8 +223,15 @@ static int sgx_vepc_release(struct inode *inode, struct file *file) * have been removed, the SECS page must have a child on * another instance. */ - if (sgx_vepc_free_page(epc_page)) + if (sgx_vepc_free_page(epc_page)) { + /* + * Drop the page before adding it to the list of SECS + * pages. Moving the page off the unreclaimable list + * needs to be done under the LRU's spinlock. + */ + sgx_drop_epc_page(epc_page); list_add_tail(&epc_page->list, &secs_pages); + } xa_erase(&vepc->page_array, index); } @@ -236,15 +246,17 @@ static int sgx_vepc_release(struct inode *inode, struct file *file) mutex_lock(&zombie_secs_pages_lock); list_for_each_entry_safe(epc_page, tmp, &zombie_secs_pages, list) { /* - * Speculatively remove the page from the list of zombies, - * if the page is successfully EREMOVE'd it will be added to - * the list of free pages. If EREMOVE fails, throw the page - * on the local list, which will be spliced on at the end. + * If EREMOVE fails, throw the page on the local list, which + * will be spliced on at the end. + * + * Note, this abuses sgx_drop_epc_page() to delete the page off + * the list of zombies, but this is a very rare path (probably + * never hit in production). 
It's not worth special casing the + free path for this super rare case just to avoid taking the + LRU's spinlock. */ - list_del(&epc_page->list); - if (sgx_vepc_free_page(epc_page)) - list_add_tail(&epc_page->list, &secs_pages); + list_move_tail(&epc_page->list, &secs_pages); } if (!list_empty(&secs_pages))

From patchwork Fri Dec 2 18:36:42 2022
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 06/18] x86/sgx: Introduce RECLAIM_IN_PROGRESS flag for EPC pages
Date: Fri, 2 Dec 2022 10:36:42 -0800
Message-Id: <20221202183655.3767674-7-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

From: Sean Christopherson

When selecting pages to be reclaimed from the page pool (sgx_global_lru),
the list of reclaimable pages is walked, and any page that is both
reclaimable and not in the process of being freed is added to a list of
potential candidates to be reclaimed. After that, this separate list is
further examined and may or may not ultimately be reclaimed.

To prevent a page on that candidate list from being removed from the
sgx_epc_lru_lists struct by a concurrent call to sgx_drop_epc_page(),
track whether the EPC page is in the middle of being reclaimed with the
addition of a RECLAIM_IN_PROGRESS flag, and do not delete the page off
the LRU in sgx_drop_epc_page() until reclaim has finished.
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 15 ++++++++++----- arch/x86/kernel/cpu/sgx/sgx.h | 2 ++ 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index ecd7f8e704cc..bad72498b0a7 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -305,13 +305,15 @@ static void __sgx_reclaim_pages(void) encl_page = epc_page->encl_owner; - if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) + if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) { + epc_page->flags |= SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; chunk[cnt++] = epc_page; - else + } else { /* The owner is freeing the page. No need to add the * page back to the list of reclaimable pages. */ epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + } } spin_unlock(&sgx_global_lru.lock); @@ -337,6 +339,7 @@ static void __sgx_reclaim_pages(void) skip: spin_lock(&sgx_global_lru.lock); + epc_page->flags &= ~SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; sgx_epc_push_reclaimable(&sgx_global_lru, epc_page); spin_unlock(&sgx_global_lru.lock); @@ -360,7 +363,8 @@ static void __sgx_reclaim_pages(void) sgx_reclaimer_write(epc_page, &backing[i]); kref_put(&encl_page->encl->refcount, sgx_encl_release); - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + epc_page->flags &= ~(SGX_EPC_PAGE_RECLAIMER_TRACKED | + SGX_EPC_PAGE_RECLAIM_IN_PROGRESS); sgx_free_epc_page(epc_page); } @@ -508,7 +512,8 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) { spin_lock(&sgx_global_lru.lock); - WARN_ON(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED); + WARN_ON(page->flags & (SGX_EPC_PAGE_RECLAIMER_TRACKED | + SGX_EPC_PAGE_RECLAIM_IN_PROGRESS)); page->flags |= flags; if (flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) sgx_epc_push_reclaimable(&sgx_global_lru, page); @@ -532,7 +537,7 @@ int 
sgx_drop_epc_page(struct sgx_epc_page *page) spin_lock(&sgx_global_lru.lock); if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { /* The page is being reclaimed. */ - if (list_empty(&page->list)) { + if (page->flags & SGX_EPC_PAGE_RECLAIM_IN_PROGRESS) { spin_unlock(&sgx_global_lru.lock); return -EBUSY; } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index ba4338b7303f..37d66bc6ca27 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -30,6 +30,8 @@ #define SGX_EPC_PAGE_IS_FREE BIT(1) /* Pages allocated for KVM guest */ #define SGX_EPC_PAGE_KVM_GUEST BIT(2) +/* page flag to indicate reclaim is in progress */ +#define SGX_EPC_PAGE_RECLAIM_IN_PROGRESS BIT(3) struct sgx_epc_page { unsigned int section;

From patchwork Fri Dec 2 18:36:43 2022
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 07/18] x86/sgx: Use a list to track to-be-reclaimed pages during reclaim
Date: Fri, 2 Dec 2022 10:36:43 -0800
Message-Id: <20221202183655.3767674-8-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

From: Sean Christopherson

Change sgx_reclaim_pages() to use a list rather than an array for
storing the epc_pages which will be reclaimed. This change is needed
to transition to the LRU implementation for EPC cgroup support, and it
requires keeping track of whether newly recorded EPC pages are Version
Array (VA) pages or enclave data pages.

In addition, helper functions are added to move pages from one list to
another and to enforce a consistent queue-like behavior for the LRU
lists.
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/encl.c | 7 ++-- arch/x86/kernel/cpu/sgx/ioctl.c | 5 ++- arch/x86/kernel/cpu/sgx/main.c | 69 +++++++++++++++++---------------- arch/x86/kernel/cpu/sgx/sgx.h | 42 ++++++++++++++++++++ 4 files changed, 85 insertions(+), 38 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 4683da9ef4f1..9ee306ac2a8e 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -252,7 +252,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, epc_page = sgx_encl_eldu(&encl->secs, NULL); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); - sgx_record_epc_page(epc_page, 0); + sgx_record_epc_page(epc_page, SGX_EPC_PAGE_ENCLAVE); } epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); @@ -260,7 +260,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, return ERR_CAST(epc_page); encl->secs_child_cnt++; - sgx_record_epc_page(entry->epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED); + sgx_record_epc_page(entry->epc_page, + (SGX_EPC_PAGE_ENCLAVE | SGX_EPC_PAGE_RECLAIMER_TRACKED)); return entry; } @@ -1221,7 +1222,7 @@ struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim) sgx_encl_free_epc_page(epc_page); return ERR_PTR(-EFAULT); } - sgx_record_epc_page(epc_page, 0); + sgx_record_epc_page(epc_page, SGX_EPC_PAGE_VERSION_ARRAY); return epc_page; } diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index aca80a3f38a1..c3a9bffbc37e 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -114,7 +114,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs) encl->attributes = secs->attributes; encl->attributes_mask = SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT | SGX_ATTR_KSS; - sgx_record_epc_page(encl->secs.epc_page, 0); + sgx_record_epc_page(encl->secs.epc_page, 
SGX_EPC_PAGE_ENCLAVE); /* Set only after completion, as encl->lock has not been taken. */ set_bit(SGX_ENCL_CREATED, &encl->flags); @@ -325,7 +325,8 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src, goto err_out; } - sgx_record_epc_page(encl_page->epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED); + sgx_record_epc_page(encl_page->epc_page, + (SGX_EPC_PAGE_ENCLAVE | SGX_EPC_PAGE_RECLAIMER_TRACKED)); mutex_unlock(&encl->lock); mmap_read_unlock(current->mm); return ret; diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index bad72498b0a7..83aaf5cea7b9 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -288,37 +288,43 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, */ static void __sgx_reclaim_pages(void) { - struct sgx_epc_page *chunk[SGX_NR_TO_SCAN]; struct sgx_backing backing[SGX_NR_TO_SCAN]; + struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; - struct sgx_epc_page *epc_page; pgoff_t page_index; - int cnt = 0; + LIST_HEAD(iso); int ret; int i; spin_lock(&sgx_global_lru.lock); for (i = 0; i < SGX_NR_TO_SCAN; i++) { - epc_page = sgx_epc_pop_reclaimable(&sgx_global_lru); + epc_page = sgx_epc_peek_reclaimable(&sgx_global_lru); if (!epc_page) break; encl_page = epc_page->encl_owner; + if (WARN_ON_ONCE(!(epc_page->flags & SGX_EPC_PAGE_ENCLAVE))) + continue; + if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) { epc_page->flags |= SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; - chunk[cnt++] = epc_page; + list_move_tail(&epc_page->list, &iso); } else { - /* The owner is freeing the page. No need to add the - * page back to the list of reclaimable pages. 
+ /* The owner is freeing the page, remove it from the + * LRU list */ epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + list_del_init(&epc_page->list); } } spin_unlock(&sgx_global_lru.lock); - for (i = 0; i < cnt; i++) { - epc_page = chunk[i]; + if (list_empty(&iso)) + return; + + i = 0; + list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_owner; if (!sgx_reclaimer_age(epc_page)) @@ -333,6 +339,7 @@ static void __sgx_reclaim_pages(void) goto skip; } + i++; encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED; mutex_unlock(&encl_page->encl->lock); continue; @@ -340,31 +347,25 @@ static void __sgx_reclaim_pages(void) skip: spin_lock(&sgx_global_lru.lock); epc_page->flags &= ~SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; - sgx_epc_push_reclaimable(&sgx_global_lru, epc_page); + sgx_epc_move_reclaimable(&sgx_global_lru, epc_page); spin_unlock(&sgx_global_lru.lock); kref_put(&encl_page->encl->refcount, sgx_encl_release); - - chunk[i] = NULL; } - for (i = 0; i < cnt; i++) { - epc_page = chunk[i]; - if (epc_page) - sgx_reclaimer_block(epc_page); - } - - for (i = 0; i < cnt; i++) { - epc_page = chunk[i]; - if (!epc_page) - continue; - + list_for_each_entry(epc_page, &iso, list) + sgx_reclaimer_block(epc_page); + + i = 0; + list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_owner; - sgx_reclaimer_write(epc_page, &backing[i]); + sgx_reclaimer_write(epc_page, &backing[i++]); kref_put(&encl_page->encl->refcount, sgx_encl_release); epc_page->flags &= ~(SGX_EPC_PAGE_RECLAIMER_TRACKED | - SGX_EPC_PAGE_RECLAIM_IN_PROGRESS); + SGX_EPC_PAGE_RECLAIM_IN_PROGRESS | + SGX_EPC_PAGE_ENCLAVE | + SGX_EPC_PAGE_VERSION_ARRAY); sgx_free_epc_page(epc_page); } @@ -505,6 +506,7 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) /** * sgx_record_epc_page() - Add a page to the LRU tracking * @page: EPC page + * @flags: Reclaim flags for the page. * * Mark a page with the specified flags and add it to the appropriate * (un)reclaimable list. 
@@ -535,18 +537,19 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) int sgx_drop_epc_page(struct sgx_epc_page *page) { spin_lock(&sgx_global_lru.lock); - if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) { - /* The page is being reclaimed. */ - if (page->flags & SGX_EPC_PAGE_RECLAIM_IN_PROGRESS) { - spin_unlock(&sgx_global_lru.lock); - return -EBUSY; - } - - page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + if ((page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) && + (page->flags & SGX_EPC_PAGE_RECLAIM_IN_PROGRESS)) { + spin_unlock(&sgx_global_lru.lock); + return -EBUSY; } list_del(&page->list); spin_unlock(&sgx_global_lru.lock); + page->flags &= ~(SGX_EPC_PAGE_RECLAIMER_TRACKED | + SGX_EPC_PAGE_RECLAIM_IN_PROGRESS | + SGX_EPC_PAGE_ENCLAVE | + SGX_EPC_PAGE_VERSION_ARRAY); + return 0; } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 37d66bc6ca27..ec8d567cd975 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -32,6 +32,8 @@ #define SGX_EPC_PAGE_KVM_GUEST BIT(2) /* page flag to indicate reclaim is in progress */ #define SGX_EPC_PAGE_RECLAIM_IN_PROGRESS BIT(3) +#define SGX_EPC_PAGE_ENCLAVE BIT(4) +#define SGX_EPC_PAGE_VERSION_ARRAY BIT(5) struct sgx_epc_page { unsigned int section; @@ -118,6 +120,14 @@ static inline void __sgx_epc_page_list_push(struct list_head *list, struct sgx_e list_add_tail(&page->list, list); } +/* + * Must be called with queue lock acquired + */ +static inline void __sgx_epc_page_list_move(struct list_head *list, struct sgx_epc_page *page) +{ + list_move_tail(&page->list, list); +} + /* * Must be called with queue lock acquired */ @@ -157,6 +167,38 @@ static inline void sgx_epc_push_unreclaimable(struct sgx_epc_lru_lists *lrus, __sgx_epc_page_list_push(&(lrus)->unreclaimable, page); } +/* + * Must be called with queue lock acquired + */ +static inline struct sgx_epc_page * __sgx_epc_page_list_peek(struct list_head *list) +{ + struct sgx_epc_page *epc_page; + 
+ if (list_empty(list)) + return NULL; + + epc_page = list_first_entry(list, struct sgx_epc_page, list); + return epc_page; +} + +static inline struct sgx_epc_page * +sgx_epc_peek_reclaimable(struct sgx_epc_lru_lists *lrus) +{ + return __sgx_epc_page_list_peek(&(lrus)->reclaimable); +} + +static inline void sgx_epc_move_reclaimable(struct sgx_epc_lru_lists *lru, + struct sgx_epc_page *page) +{ + __sgx_epc_page_list_move(&(lru)->reclaimable, page); +} + +static inline struct sgx_epc_page * +sgx_epc_peek_unreclaimable(struct sgx_epc_lru_lists *lrus) +{ + return __sgx_epc_page_list_peek(&(lrus)->unreclaimable); +} + struct sgx_epc_page *__sgx_alloc_epc_page(void); void sgx_free_epc_page(struct sgx_epc_page *page);

From patchwork Fri Dec 2 18:36:44 2022
From: Kristen Carlson Accardi
Subject: [PATCH v2 08/18] x86/sgx: Allow reclaiming up to 32 pages, but scan 16 by default
Date: Fri, 2 Dec 2022 10:36:44 -0800
Message-Id: <20221202183655.3767674-9-kristen@linux.intel.com>

From: Sean Christopherson

Modify sgx_reclaim_pages() to take a parameter that specifies the number of pages to scan for reclaim. Allow a maximum of 32, but scan 16 in the usual case. This lets the caller specify how many pages sgx_reclaim_pages() scans, and lets future patches adjust it.
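The two constants interact in a specific way: the scan count is now caller-chosen, but the number of pages actually written back in one pass is bounded by the on-stack `backing[]` array. The helper below, `reclaim_budget()`, is not in the patch; it is a hypothetical one-liner that states the invariant the patch enforces with its `i == SGX_MAX_NR_TO_RECLAIM` check.

```c
#include <assert.h>

#define SGX_NR_TO_SCAN        16  /* pages scanned per pass in the usual case */
#define SGX_MAX_NR_TO_RECLAIM 32  /* upper bound: sizes the on-stack backing[] array */

/* Hypothetical helper (not in the patch): however many pages a caller asks
 * to scan, at most SGX_MAX_NR_TO_RECLAIM can be reclaimed in one pass,
 * because the sgx_backing state for each page lives in a fixed array. */
static int reclaim_budget(int nr_to_scan)
{
    return nr_to_scan < SGX_MAX_NR_TO_RECLAIM ? nr_to_scan
                                              : SGX_MAX_NR_TO_RECLAIM;
}
```

In other words, raising `nr_to_scan` past 32 can find more candidate pages, but the extra candidates are skipped back onto the LRU rather than reclaimed.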
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 83aaf5cea7b9..f201ca85212f 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -18,6 +18,8 @@ #include "encl.h" #include "encls.h" +#define SGX_MAX_NR_TO_RECLAIM 32 + struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; @@ -273,7 +275,10 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, mutex_unlock(&encl->lock); } -/* +/** + * sgx_reclaim_pages() - Reclaim EPC pages from the consumers + * @nr_to_scan: Number of EPC pages to scan for reclaim + * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have * been accessed since the last scan. Move those pages to the tail of active @@ -286,9 +291,9 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
*/ -static void __sgx_reclaim_pages(void) +static void __sgx_reclaim_pages(int nr_to_scan) { - struct sgx_backing backing[SGX_NR_TO_SCAN]; + struct sgx_backing backing[SGX_MAX_NR_TO_RECLAIM]; struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; pgoff_t page_index; @@ -297,7 +302,7 @@ static void __sgx_reclaim_pages(void) int i; spin_lock(&sgx_global_lru.lock); - for (i = 0; i < SGX_NR_TO_SCAN; i++) { + for (i = 0; i < nr_to_scan; i++) { epc_page = sgx_epc_peek_reclaimable(&sgx_global_lru); if (!epc_page) break; @@ -327,7 +332,7 @@ static void __sgx_reclaim_pages(void) list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_owner; - if (!sgx_reclaimer_age(epc_page)) + if (i == SGX_MAX_NR_TO_RECLAIM || !sgx_reclaimer_age(epc_page)) goto skip; page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); @@ -371,9 +376,9 @@ static void __sgx_reclaim_pages(void) } } -static void sgx_reclaim_pages(void) +static void sgx_reclaim_pages(int nr_to_scan) { - __sgx_reclaim_pages(); + __sgx_reclaim_pages(nr_to_scan); cond_resched(); } @@ -393,7 +398,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - __sgx_reclaim_pages(); + __sgx_reclaim_pages(SGX_NR_TO_SCAN); } static int ksgxd(void *p) @@ -419,7 +424,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_pages(); + sgx_reclaim_pages(SGX_NR_TO_SCAN); } return 0; @@ -598,7 +603,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_pages(); + sgx_reclaim_pages(SGX_NR_TO_SCAN); } if (sgx_should_reclaim(SGX_NR_LOW_PAGES))

From patchwork Fri Dec 2 18:36:45 2022
From: Kristen Carlson Accardi
Subject: [PATCH v2 09/18] x86/sgx: Return the number of EPC pages that were successfully reclaimed
Date: Fri, 2 Dec 2022 10:36:45 -0800
Message-Id: <20221202183655.3767674-10-kristen@linux.intel.com>

From: Sean Christopherson

Return the number of reclaimed pages from sgx_reclaim_pages(). The EPC cgroup will use the result to track the success rate of its reclaim calls, e.g. to escalate to a more forceful reclaim mode if necessary.
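A return value turns the reclaimer into something a caller can build policy on. The loop below is a hypothetical cgroup-side sketch (not from the patch): `reclaim_fn` models `sgx_reclaim_pages()`, and the caller keeps making passes until it has freed enough pages or a pass makes no progress, at which point it would escalate or give up.

```c
#include <assert.h>

/* Hypothetical caller-side loop: use the pass's return value (pages
 * actually reclaimed) to decide whether another pass is worthwhile. */
static int reclaim_until(int (*reclaim_fn)(int), int nr_needed, int max_passes)
{
    int total = 0;

    while (total < nr_needed && max_passes-- > 0) {
        int got = reclaim_fn(16);   /* one SGX_NR_TO_SCAN-sized pass */
        if (got == 0)
            break;                  /* no progress: caller would escalate */
        total += got;
    }
    return total;
}

/* Toy stand-in for the reclaimer: yields 4 pages on each of the first two
 * passes, then runs dry. */
static int fake_calls;
static int fake_reclaim(int nr_to_scan)
{
    (void)nr_to_scan;
    return fake_calls++ < 2 ? 4 : 0;
}
```

The "success rate" framing in the commit message is exactly this: a string of zero-page passes tells the cgroup that polite reclaim has stalled.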
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index f201ca85212f..a4a65eadfb79 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -291,7 +291,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. */ -static void __sgx_reclaim_pages(int nr_to_scan) +static int __sgx_reclaim_pages(int nr_to_scan) { struct sgx_backing backing[SGX_MAX_NR_TO_RECLAIM]; struct sgx_epc_page *epc_page, *tmp; @@ -326,7 +326,7 @@ static void __sgx_reclaim_pages(int nr_to_scan) spin_unlock(&sgx_global_lru.lock); if (list_empty(&iso)) - return; + return 0; i = 0; list_for_each_entry_safe(epc_page, tmp, &iso, list) { @@ -374,12 +374,16 @@ static void __sgx_reclaim_pages(int nr_to_scan) sgx_free_epc_page(epc_page); } + return i; } -static void sgx_reclaim_pages(int nr_to_scan) +static int sgx_reclaim_pages(int nr_to_scan) { - __sgx_reclaim_pages(nr_to_scan); + int ret; + + ret = __sgx_reclaim_pages(nr_to_scan); cond_resched(); + return ret; } static bool sgx_should_reclaim(unsigned long watermark)

From patchwork Fri Dec 2 18:36:46 2022
From: Kristen Carlson Accardi
Subject: [PATCH v2 10/18] x86/sgx: Add option to ignore age of page during EPC reclaim
Date: Fri, 2 Dec 2022 10:36:46 -0800
Message-Id: <20221202183655.3767674-11-kristen@linux.intel.com>

From: Sean Christopherson

Add a flag to sgx_reclaim_pages() that instructs it to ignore the age of a page, i.e. reclaim the page even if it is young. The EPC cgroup will use the flag to enforce its limits by draining the reclaimable lists before resorting to other measures, e.g. forcefully reclaiming "unreclaimable" pages by killing enclaves.
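The effect of the new flag on a single pass can be shown with a toy model. `reclaim_pass()` below is illustrative only (the real code also blocks, ages, and writes pages back): each entry represents one isolated page, `true` meaning its accessed bit was set since the last scan, i.e. the page is "young".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the age check added by the patch: in normal mode young
 * pages are skipped (put back on the LRU); with ignore_age set, every
 * isolated page is reclaimed regardless of recent use. */
static int reclaim_pass(const bool young[], int n, bool ignore_age)
{
    int reclaimed = 0;

    for (int i = 0; i < n; i++) {
        if (!ignore_age && young[i])
            continue;   /* normal mode: skip recently used pages */
        reclaimed++;    /* here the real code would EWB and free the page */
    }
    return reclaimed;
}
```

This mirrors the patched condition `(!ignore_age && !sgx_reclaimer_age(epc_page))` leading to `goto skip`: `ignore_age` short-circuits the age test, so a cgroup at its limit can drain the reclaimable list even when every page on it is hot.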
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 46 +++++++++++++++++++++------------- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index a4a65eadfb79..db96483e2e74 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -31,6 +31,10 @@ static DEFINE_XARRAY(sgx_epc_address_space); * with sgx_global_lru.lock acquired. */ static struct sgx_epc_lru_lists sgx_global_lru; +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page) +{ + return &sgx_global_lru; +} static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); @@ -278,6 +282,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, /** * sgx_reclaim_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim + * @ignore_age: Reclaim a page even if it is young * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have @@ -291,11 +296,12 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
*/ -static int __sgx_reclaim_pages(int nr_to_scan) +static int __sgx_reclaim_pages(int nr_to_scan, bool ignore_age) { struct sgx_backing backing[SGX_MAX_NR_TO_RECLAIM]; struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; + struct sgx_epc_lru_lists *lru; pgoff_t page_index; LIST_HEAD(iso); int ret; @@ -332,7 +338,8 @@ static int __sgx_reclaim_pages(int nr_to_scan) list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_owner; - if (i == SGX_MAX_NR_TO_RECLAIM || !sgx_reclaimer_age(epc_page)) + if (i == SGX_MAX_NR_TO_RECLAIM || + (!ignore_age && !sgx_reclaimer_age(epc_page))) goto skip; page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); @@ -350,10 +357,11 @@ static int __sgx_reclaim_pages(int nr_to_scan) continue; skip: - spin_lock(&sgx_global_lru.lock); + lru = sgx_lru_lists(epc_page); + spin_lock(&lru->lock); epc_page->flags &= ~SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; - sgx_epc_move_reclaimable(&sgx_global_lru, epc_page); - spin_unlock(&sgx_global_lru.lock); + sgx_epc_move_reclaimable(lru, epc_page); + spin_unlock(&lru->lock); kref_put(&encl_page->encl->refcount, sgx_encl_release); } @@ -377,11 +385,11 @@ static int __sgx_reclaim_pages(int nr_to_scan) return i; } -static int sgx_reclaim_pages(int nr_to_scan) +static int sgx_reclaim_pages(int nr_to_scan, bool ignore_age) { int ret; - ret = __sgx_reclaim_pages(nr_to_scan); + ret = __sgx_reclaim_pages(nr_to_scan, ignore_age); cond_resched(); return ret; } @@ -402,7 +410,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - __sgx_reclaim_pages(SGX_NR_TO_SCAN); + __sgx_reclaim_pages(SGX_NR_TO_SCAN, false); } static int ksgxd(void *p) @@ -428,7 +436,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_pages(SGX_NR_TO_SCAN); + sgx_reclaim_pages(SGX_NR_TO_SCAN, false); } return 0; @@ -522,15 +530,17 @@ struct 
sgx_epc_page *__sgx_alloc_epc_page(void) */ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) { - spin_lock(&sgx_global_lru.lock); + struct sgx_epc_lru_lists *lru = sgx_lru_lists(page); + + spin_lock(&lru->lock); WARN_ON(page->flags & (SGX_EPC_PAGE_RECLAIMER_TRACKED | SGX_EPC_PAGE_RECLAIM_IN_PROGRESS)); page->flags |= flags; if (flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) - sgx_epc_push_reclaimable(&sgx_global_lru, page); + sgx_epc_push_reclaimable(lru, page); else - sgx_epc_push_unreclaimable(&sgx_global_lru, page); - spin_unlock(&sgx_global_lru.lock); + sgx_epc_push_unreclaimable(lru, page); + spin_unlock(&lru->lock); } /** @@ -545,14 +555,16 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags) */ int sgx_drop_epc_page(struct sgx_epc_page *page) { - spin_lock(&sgx_global_lru.lock); + struct sgx_epc_lru_lists *lru = sgx_lru_lists(page); + + spin_lock(&lru->lock); if ((page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) && (page->flags & SGX_EPC_PAGE_RECLAIM_IN_PROGRESS)) { - spin_unlock(&sgx_global_lru.lock); + spin_unlock(&lru->lock); return -EBUSY; } list_del(&page->list); - spin_unlock(&sgx_global_lru.lock); + spin_unlock(&lru->lock); page->flags &= ~(SGX_EPC_PAGE_RECLAIMER_TRACKED | SGX_EPC_PAGE_RECLAIM_IN_PROGRESS | @@ -607,7 +619,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_pages(SGX_NR_TO_SCAN); + sgx_reclaim_pages(SGX_NR_TO_SCAN, false); } if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
From patchwork Fri Dec 2 18:36:47 2022
X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 29069
From: Kristen Carlson Accardi To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi , Sean Christopherson
Subject: [PATCH v2 11/18] x86/sgx: Prepare for multiple LRUs Date: Fri, 2 Dec 2022 10:36:47 -0800 Message-Id: <20221202183655.3767674-12-kristen@linux.intel.com> In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com> References: <20221202183655.3767674-1-kristen@linux.intel.com>
From: Sean Christopherson
Add an sgx_can_reclaim() wrapper so that, in a subsequent patch, multiple LRUs can be used cleanly.
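The refactor's point is that call sites stop peeking at `sgx_global_lru` directly and instead ask a single predicate, so a later patch can change what "reclaimable pages exist" means without touching those call sites. A hedged userspace sketch of that shape (names are illustrative, not the kernel's):

```c
#include <stdbool.h>

/* Toy stand-in for the global LRU; nr_reclaimable models
 * !list_empty(&sgx_global_lru.reclaimable). */
struct toy_lru {
	int nr_reclaimable;
};

struct toy_lru toy_global_lru;

/* Single predicate behind which a later patch can hide a walk over
 * per-cgroup LRUs; today it only consults the global LRU. */
bool toy_can_reclaim(void)
{
	return toy_global_lru.nr_reclaimable > 0;
}

/* Mirrors sgx_should_reclaim(): below the watermark AND something
 * is actually reclaimable. */
bool toy_should_reclaim(long nr_free, long watermark)
{
	return nr_free < watermark && toy_can_reclaim();
}
```

The wrapper costs nothing now but keeps the multiple-LRU change in one place later.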
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index db96483e2e74..96399e2016a8 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -394,10 +394,15 @@ static int sgx_reclaim_pages(int nr_to_scan, bool ignore_age) return ret; } +static bool sgx_can_reclaim(void) +{ + return !list_empty(&sgx_global_lru.reclaimable); +} + static bool sgx_should_reclaim(unsigned long watermark) { return atomic_long_read(&sgx_nr_free_pages) < watermark && - !list_empty(&sgx_global_lru.reclaimable); + sgx_can_reclaim(); } /* @@ -606,7 +611,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - if (list_empty(&sgx_global_lru.reclaimable)) + if (!sgx_can_reclaim()) return ERR_PTR(-ENOMEM); if (!reclaim) {
From patchwork Fri Dec 2 18:36:48 2022
X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 29070
From: Kristen Carlson Accardi To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi , Sean Christopherson
Subject: [PATCH v2 12/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup Date: Fri, 2 Dec 2022 10:36:48 -0800 Message-Id: <20221202183655.3767674-13-kristen@linux.intel.com> In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com> References: <20221202183655.3767674-1-kristen@linux.intel.com>
From: Sean Christopherson
Expose the top-level reclaim function as sgx_reclaim_epc_pages() for use by the upcoming EPC cgroup, which will initiate reclaim to enforce changes to high/max limits.
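The export makes one entry point serve two callers with different policies: the opportunistic background reclaimer and the cgroup's limit enforcement. A toy userspace sketch of that split (the halving heuristic below is invented purely for illustration, not the kernel's behavior):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the exported sgx_reclaim_epc_pages():
 * pretend every scanned page is reclaimed when ages are ignored, and
 * only half otherwise. Illustrative only. */
int toy_reclaim_epc_pages(int nr_to_scan, bool ignore_age)
{
	return ignore_age ? nr_to_scan : nr_to_scan / 2;
}

/* Background reclaimer pass (ksgxd-like): opportunistic, respects age. */
int toy_ksgxd_pass(int nr_to_scan)
{
	return toy_reclaim_epc_pages(nr_to_scan, false);
}

/* Cgroup limit enforcement: must make progress, so ignore age. */
int toy_epc_cgroup_enforce(int nr_to_scan)
{
	return toy_reclaim_epc_pages(nr_to_scan, true);
}
```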
Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 7 ++++--- arch/x86/kernel/cpu/sgx/sgx.h | 1 + 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 96399e2016a8..c947b4ae06f3 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -281,6 +281,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, /** * sgx_reclaim_pages() - Reclaim EPC pages from the consumers + * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim * @ignore_age: Reclaim a page even if it is young * @@ -385,7 +386,7 @@ static int __sgx_reclaim_pages(int nr_to_scan, bool ignore_age) return i; } -static int sgx_reclaim_pages(int nr_to_scan, bool ignore_age) +int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age) { int ret; @@ -441,7 +442,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); } return 0; @@ -624,7 +625,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); } if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index ec8d567cd975..ce859331ddf5 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -206,6 +206,7 @@ void sgx_reclaim_direct(void); void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags); int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); +int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age); void sgx_ipi_cb(void *info);
From patchwork Fri Dec 2 18:36:49 2022
X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 29072
From: Kristen Carlson Accardi To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi , Sean Christopherson
Subject: [PATCH v2 13/18] x86/sgx: Add helper to grab pages from an arbitrary EPC LRU Date: Fri, 2 Dec 2022 10:36:49 -0800 Message-Id: <20221202183655.3767674-14-kristen@linux.intel.com> In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com> References: <20221202183655.3767674-1-kristen@linux.intel.com>
From: Sean Christopherson
Move the isolation loop into a standalone helper, sgx_isolate_epc_pages(), in preparation for the existence of multiple LRUs. Expose the helper to other SGX code so that it can be called from the EPC cgroup code, e.g. to isolate pages from a single cgroup LRU.
Exposing the isolation loop allows the cgroup iteration logic to be wholly encapsulated within the cgroup code. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson --- arch/x86/kernel/cpu/sgx/main.c | 68 +++++++++++++++++++++------------- arch/x86/kernel/cpu/sgx/sgx.h | 2 + 2 files changed, 44 insertions(+), 26 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index c947b4ae06f3..a59550fa150b 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -280,7 +280,46 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, } /** - * sgx_reclaim_pages() - Reclaim EPC pages from the consumers + * sgx_isolate_epc_pages() - Isolate pages from an LRU for reclaim + * @lru: LRU from which to reclaim + * @nr_to_scan: Number of pages to scan for reclaim + * @dst: Destination list to hold the isolated pages + */ +void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, int *nr_to_scan, + struct list_head *dst) +{ + struct sgx_encl_page *encl_page; + struct sgx_epc_page *epc_page; + + spin_lock(&lru->lock); + for (; *nr_to_scan > 0; --(*nr_to_scan)) { + if (list_empty(&lru->reclaimable)) + break; + + epc_page = sgx_epc_peek_reclaimable(lru); + if (!epc_page) + break; + + encl_page = epc_page->encl_owner; + + if (WARN_ON_ONCE(!(epc_page->flags & SGX_EPC_PAGE_ENCLAVE))) + continue; + + if (kref_get_unless_zero(&encl_page->encl->refcount)) { + epc_page->flags |= SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; + list_move_tail(&epc_page->list, dst); + } else { + /* The owner is freeing the page, remove it from the + * LRU list + */ + epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; + list_del_init(&epc_page->list); + } + } + spin_unlock(&lru->lock); +} + +/** * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim * @ignore_age: Reclaim a page even if it is young @@ -305,37 +344,14 @@ static int 
__sgx_reclaim_pages(int nr_to_scan, bool ignore_age) struct sgx_epc_lru_lists *lru; pgoff_t page_index; LIST_HEAD(iso); + int i = 0; int ret; - int i; - - spin_lock(&sgx_global_lru.lock); - for (i = 0; i < nr_to_scan; i++) { - epc_page = sgx_epc_peek_reclaimable(&sgx_global_lru); - if (!epc_page) - break; - - encl_page = epc_page->encl_owner; - if (WARN_ON_ONCE(!(epc_page->flags & SGX_EPC_PAGE_ENCLAVE))) - continue; - - if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) { - epc_page->flags |= SGX_EPC_PAGE_RECLAIM_IN_PROGRESS; - list_move_tail(&epc_page->list, &iso); - } else { - /* The owner is freeing the page, remove it from the - * LRU list - */ - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED; - list_del_init(&epc_page->list); - } - } - spin_unlock(&sgx_global_lru.lock); + sgx_isolate_epc_pages(&sgx_global_lru, &nr_to_scan, &iso); if (list_empty(&iso)) return 0; - i = 0; list_for_each_entry_safe(epc_page, tmp, &iso, list) { encl_page = epc_page->encl_owner; diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index ce859331ddf5..4499a5d5547d 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -207,6 +207,8 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags); int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age); +void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, int *nr_to_scan, + struct list_head *dst); void sgx_ipi_cb(void *info);
From patchwork Fri Dec 2 18:36:50 2022
X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 29071
From: Kristen Carlson Accardi To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi , Sean Christopherson
Subject: [PATCH v2 14/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC Date: Fri, 2 Dec 2022 10:36:50 -0800 Message-Id: <20221202183655.3767674-15-kristen@linux.intel.com> In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com> References: <20221202183655.3767674-1-kristen@linux.intel.com>
From: Sean Christopherson
Introduce the OOM path for killing an enclave when the reclaimer is no longer able to reclaim enough EPC pages. Find a victim enclave, which will be an enclave with EPC pages remaining that are not accessible to the reclaimer ("unreclaimable"). Once a victim is identified, mark the enclave as OOM and zap the enclave's entire page range.
Release all the enclave's resources except for the struct sgx_encl memory itself.

Signed-off-by: Sean Christopherson
Signed-off-by: Kristen Carlson Accardi
Cc: Sean Christopherson
---
 arch/x86/kernel/cpu/sgx/encl.c | 74 +++++++++++++++---
 arch/x86/kernel/cpu/sgx/encl.h | 2 +
 arch/x86/kernel/cpu/sgx/main.c | 135 +++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/sgx.h | 1 +
 4 files changed, 201 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 9ee306ac2a8e..ba350b2961d1 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -623,7 +623,8 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, if (!encl) return -EFAULT; - if (!test_bit(SGX_ENCL_DEBUG, &encl->flags)) + if (!test_bit(SGX_ENCL_DEBUG, &encl->flags) || + test_bit(SGX_ENCL_OOM, &encl->flags)) return -EFAULT; for (i = 0; i < len; i += cnt) {
@@ -669,16 +670,8 @@ const struct vm_operations_struct sgx_vm_ops = { .access = sgx_vma_access, }; -/** - * sgx_encl_release - Destroy an enclave instance - * @ref: address of a kref inside &sgx_encl - * - * Used together with kref_put(). Frees all the resources associated with the - * enclave and the instance itself. - */ -void sgx_encl_release(struct kref *ref) +static void __sgx_encl_release(struct sgx_encl *encl) { - struct sgx_encl *encl = container_of(ref, struct sgx_encl, refcount); struct sgx_va_page *va_page; struct sgx_encl_page *entry; unsigned long index;
@@ -713,7 +706,7 @@ void sgx_encl_release(struct kref *ref) while (!list_empty(&encl->va_pages)) { va_page = list_first_entry(&encl->va_pages, struct sgx_va_page, list); - list_del(&va_page->list); + list_del_init(&va_page->list); sgx_drop_epc_page(va_page->epc_page); sgx_encl_free_epc_page(va_page->epc_page); kfree(va_page);
@@ -729,10 +722,66 @@ void sgx_encl_release(struct kref *ref) /* Detect EPC page leak's.
*/ WARN_ON_ONCE(encl->secs_child_cnt); WARN_ON_ONCE(encl->secs.epc_page); +} + +/** + * sgx_encl_release - Destroy an enclave instance + * @ref: address of a kref inside &sgx_encl + * + * Used together with kref_put(). Frees all the resources associated with the + * enclave and the instance itself. + */ +void sgx_encl_release(struct kref *ref) +{ + struct sgx_encl *encl = container_of(ref, struct sgx_encl, refcount); + + /* if the enclave was OOM killed previously, it just needs to be freed */ + if (!test_bit(SGX_ENCL_OOM, &encl->flags)) + __sgx_encl_release(encl); kfree(encl); } +/** + * sgx_encl_destroy - prepare the enclave for release + * @encl: address of the sgx_encl to drain + * + * Used during oom kill to empty the mm_list entries after they have + * been zapped. Release the remaining enclave resources without freeing + * struct sgx_encl. + */ +void sgx_encl_destroy(struct sgx_encl *encl) +{ + struct sgx_encl_mm *encl_mm; + + for ( ; ; ) { + spin_lock(&encl->mm_lock); + + if (list_empty(&encl->mm_list)) { + encl_mm = NULL; + } else { + encl_mm = list_first_entry(&encl->mm_list, + struct sgx_encl_mm, list); + list_del_rcu(&encl_mm->list); + } + + spin_unlock(&encl->mm_lock); + + /* The enclave is no longer mapped by any mm. */ + if (!encl_mm) + break; + + synchronize_srcu(&encl->srcu); + mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm); + kfree(encl_mm); + + /* 'encl_mm' is gone, put encl_mm->encl reference: */ + kref_put(&encl->refcount, sgx_encl_release); + } + + __sgx_encl_release(encl); +} + /* * 'mm' is exiting and no longer needs mmu notifications. */ @@ -802,6 +851,9 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm) struct sgx_encl_mm *encl_mm; int ret; + if (test_bit(SGX_ENCL_OOM, &encl->flags)) + return -ENOMEM; + /* * Even though a single enclave may be mapped into an mm more than once, * each 'mm' only appears once on encl->mm_list. 
This is guaranteed by diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index 831d63f80f5a..f4935632e53a 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -39,6 +39,7 @@ enum sgx_encl_flags { SGX_ENCL_DEBUG = BIT(1), SGX_ENCL_CREATED = BIT(2), SGX_ENCL_INITIALIZED = BIT(3), + SGX_ENCL_OOM = BIT(4), }; struct sgx_encl_mm { @@ -125,5 +126,6 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, unsigned long addr); struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim); void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page); +void sgx_encl_destroy(struct sgx_encl *encl); #endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index a59550fa150b..70046c4e332a 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -677,6 +677,141 @@ void sgx_free_epc_page(struct sgx_epc_page *page) atomic_long_inc(&sgx_nr_free_pages); } +static bool sgx_oom_get_ref(struct sgx_epc_page *epc_page) +{ + struct sgx_encl *encl; + + if (epc_page->flags & SGX_EPC_PAGE_ENCLAVE) + encl = ((struct sgx_encl_page *)epc_page->encl_owner)->encl; + else if (epc_page->flags & SGX_EPC_PAGE_VERSION_ARRAY) + encl = epc_page->encl; + else + return false; + + return kref_get_unless_zero(&encl->refcount); +} + +static struct sgx_epc_page *sgx_oom_get_victim(struct sgx_epc_lru_lists *lru) +{ + struct sgx_epc_page *epc_page, *tmp; + + if (list_empty(&lru->unreclaimable)) + return NULL; + + list_for_each_entry_safe(epc_page, tmp, &lru->unreclaimable, list) { + list_del_init(&epc_page->list); + + if (sgx_oom_get_ref(epc_page)) + return epc_page; + } + return NULL; +} + +static void sgx_epc_oom_zap(void *owner, struct mm_struct *mm, unsigned long start, + unsigned long end, const struct vm_operations_struct *ops) +{ + struct vm_area_struct *vma, *tmp; + unsigned long vm_end; + + vma = find_vma(mm, start); + if (!vma || vma->vm_ops != 
ops || vma->vm_private_data != owner || + vma->vm_start >= end) + return; + + for (tmp = vma; tmp->vm_start < end; tmp = tmp->vm_next) { + do { + vm_end = tmp->vm_end; + tmp = tmp->vm_next; + } while (tmp && tmp->vm_ops == ops && + vma->vm_private_data == owner && tmp->vm_start < end); + + zap_page_range(vma, vma->vm_start, vm_end - vma->vm_start); + + if (!tmp) + break; + } +} + +static void sgx_oom_encl(struct sgx_encl *encl) +{ + unsigned long mm_list_version; + struct sgx_encl_mm *encl_mm; + int idx; + + set_bit(SGX_ENCL_OOM, &encl->flags); + + if (!test_bit(SGX_ENCL_CREATED, &encl->flags)) + goto out; + + do { + mm_list_version = encl->mm_list_version; + + /* Pairs with smp_rmb() in sgx_encl_mm_add(). */ + smp_rmb(); + + idx = srcu_read_lock(&encl->srcu); + + list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) { + if (!mmget_not_zero(encl_mm->mm)) + continue; + + mmap_read_lock(encl_mm->mm); + + sgx_epc_oom_zap(encl, encl_mm->mm, encl->base, + encl->base + encl->size, &sgx_vm_ops); + + mmap_read_unlock(encl_mm->mm); + + mmput_async(encl_mm->mm); + } + + srcu_read_unlock(&encl->srcu, idx); + } while (WARN_ON_ONCE(encl->mm_list_version != mm_list_version)); + + mutex_lock(&encl->lock); + sgx_encl_destroy(encl); + mutex_unlock(&encl->lock); + +out: + /* + * This puts the refcount we took when we identified this enclave as + * an OOM victim. + */ + kref_put(&encl->refcount, sgx_encl_release); +} + +static inline void sgx_oom_encl_page(struct sgx_encl_page *encl_page) +{ + return sgx_oom_encl(encl_page->encl); +} + +/** + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU + * @lru: LRU that is low + * + * Return: %true if a victim was found and kicked. 
+ */ +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) +{ + struct sgx_epc_page *victim; + + spin_lock(&lru->lock); + victim = sgx_oom_get_victim(lru); + spin_unlock(&lru->lock); + + if (!victim) + return false; + + if (victim->flags & SGX_EPC_PAGE_ENCLAVE) + sgx_oom_encl_page(victim->encl_owner); + else if (victim->flags & SGX_EPC_PAGE_VERSION_ARRAY) + sgx_oom_encl(victim->encl); + else + WARN_ON_ONCE(1); + + return true; +} + static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, unsigned long index, struct sgx_epc_section *section)
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 4499a5d5547d..1c666b25294b 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -209,6 +209,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age); void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, int *nr_to_scan, struct list_head *dst); +bool sgx_epc_oom(struct sgx_epc_lru_lists *lrus); void sgx_ipi_cb(void *info);

From patchwork Fri Dec 2 18:36:51 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29073
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Zefan Li, Johannes Weiner
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi
Subject: [PATCH v2 15/18] cgroup/misc: Add per resource callbacks for css events
Date: Fri, 2 Dec 2022 10:36:51 -0800
Message-Id: <20221202183655.3767674-16-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

Consumers of the misc cgroup controller might need to perform separate actions in the event of a cgroup alloc, free, or release call. In addition, writes to the max value may also need a separate action. Add the ability for downstream users to set up callbacks for these operations, and call the per-resource-type callback when appropriate. This code will be utilized by the SGX driver in a future patch.
Signed-off-by: Kristen Carlson Accardi --- include/linux/misc_cgroup.h | 6 +++++ kernel/cgroup/misc.c | 51 ++++++++++++++++++++++++++++++++++--- 2 files changed, 54 insertions(+), 3 deletions(-) diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h index c238207d1615..83620e7c4bb1 100644 --- a/include/linux/misc_cgroup.h +++ b/include/linux/misc_cgroup.h @@ -37,6 +37,12 @@ struct misc_res { unsigned long max; atomic_long_t usage; atomic_long_t events; + + /* per resource callback ops */ + int (*misc_cg_alloc)(struct misc_cg *cg); + void (*misc_cg_free)(struct misc_cg *cg); + void (*misc_cg_released)(struct misc_cg *cg); + void (*misc_cg_max_write)(struct misc_cg *cg); }; /** diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index fe3e8a0eb7ed..3d17afd5b7a8 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -278,10 +278,13 @@ static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf, cg = css_misc(of_css(of)); - if (READ_ONCE(misc_res_capacity[type])) + if (READ_ONCE(misc_res_capacity[type])) { WRITE_ONCE(cg->res[type].max, max); - else + if (cg->res[type].misc_cg_max_write) + cg->res[type].misc_cg_max_write(cg); + } else { ret = -EINVAL; + } return ret ? 
ret : nbytes; } @@ -385,23 +388,39 @@ static struct cftype misc_cg_files[] = { static struct cgroup_subsys_state * misc_cg_alloc(struct cgroup_subsys_state *parent_css) { + struct misc_cg *parent_cg; enum misc_res_type i; struct misc_cg *cg; + int ret; if (!parent_css) { cg = &root_cg; + parent_cg = &root_cg; } else { cg = kzalloc(sizeof(*cg), GFP_KERNEL); if (!cg) return ERR_PTR(-ENOMEM); + parent_cg = css_misc(parent_css); } for (i = 0; i < MISC_CG_RES_TYPES; i++) { WRITE_ONCE(cg->res[i].max, MAX_NUM); atomic_long_set(&cg->res[i].usage, 0); + if (parent_cg->res[i].misc_cg_alloc) { + ret = parent_cg->res[i].misc_cg_alloc(cg); + if (ret) + goto alloc_err; + } } return &cg->css; + +alloc_err: + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (parent_cg->res[i].misc_cg_free) + cg->res[i].misc_cg_free(cg); + kfree(cg); + return ERR_PTR(ret); } /** @@ -412,13 +431,39 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css) */ static void misc_cg_free(struct cgroup_subsys_state *css) { - kfree(css_misc(css)); + struct misc_cg *cg = css_misc(css); + enum misc_res_type i; + + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (cg->res[i].misc_cg_free) + cg->res[i].misc_cg_free(cg); + + kfree(cg); +} + +/** + * misc_cg_released() - Release the misc cgroup + * @css: cgroup subsys object. + * + * Call the misc_cg resource type released callbacks. + * + * Context: Any context. 
+ */ +static void misc_cg_released(struct cgroup_subsys_state *css) +{ + struct misc_cg *cg = css_misc(css); + enum misc_res_type i; + + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (cg->res[i].misc_cg_released) + cg->res[i].misc_cg_released(cg); } /* Cgroup controller callbacks */ struct cgroup_subsys misc_cgrp_subsys = { .css_alloc = misc_cg_alloc, .css_free = misc_cg_free, + .css_released = misc_cg_released, .legacy_cftypes = misc_cg_files, .dfl_cftypes = misc_cg_files, };

From patchwork Fri Dec 2 18:36:52 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29074
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Zefan Li, Johannes Weiner
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi
Subject: [PATCH v2 16/18] cgroup/misc: Prepare for SGX usage
Date: Fri, 2 Dec 2022 10:36:52 -0800
Message-Id: <20221202183655.3767674-17-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>
The SGX driver will need to get access to the root misc_cg object to do iterative walks and also to determine whether a charge will be towards the root cgroup or not. To manage the SGX EPC memory via the misc controller, the SGX driver will also need to be able to iterate over the misc cgroup hierarchy. Move parent_misc() into misc_cgroup.h, make it inline so that it is available to SGX, rename it to misc_cg_parent(), and update misc.c to use the new name. Add per-resource-type private data so that SGX can store additional per-cgroup data with the misc_cg struct. Allow SGX EPC memory to be a valid resource type for the misc controller.

Signed-off-by: Kristen Carlson Accardi
---
 include/linux/misc_cgroup.h | 29 +++++++++++++++++++++++++++++
 kernel/cgroup/misc.c | 25 ++++++++++++-------------
 2 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h
index 83620e7c4bb1..53a64d3bb6d7 100644
--- a/include/linux/misc_cgroup.h
+++ b/include/linux/misc_cgroup.h
@@ -17,6 +17,10 @@ enum misc_res_type { MISC_CG_RES_SEV, /* AMD SEV-ES ASIDs resource */ MISC_CG_RES_SEV_ES, +#endif +#ifdef CONFIG_CGROUP_SGX_EPC + /* SGX EPC memory resource */ + MISC_CG_RES_SGX_EPC, #endif MISC_CG_RES_TYPES };
@@ -37,6 +41,7 @@ struct misc_res { unsigned long max; atomic_long_t usage; atomic_long_t events; + void *priv; /* per resource callback ops */ int (*misc_cg_alloc)(struct misc_cg *cg);
@@ -59,6 +64,7 @@ struct misc_cg { struct misc_res res[MISC_CG_RES_TYPES]; }; +struct misc_cg *misc_cg_root(void); unsigned long misc_cg_res_total_usage(enum misc_res_type type); int misc_cg_set_capacity(enum misc_res_type type, unsigned long capacity); int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg,
@@ -80,6 +86,20 @@ static inline
struct misc_cg *css_misc(struct cgroup_subsys_state *css) return css ? container_of(css, struct misc_cg, css) : NULL; } +/** + * misc_cg_parent() - Get the parent of the passed misc cgroup. + * @cgroup: cgroup whose parent needs to be fetched. + * + * Context: Any context. + * Return: + * * struct misc_cg* - Parent of the @cgroup. + * * %NULL - If @cgroup is null or the passed cgroup does not have a parent. + */ +static inline struct misc_cg *misc_cg_parent(struct misc_cg *cgroup) +{ + return cgroup ? css_misc(cgroup->css.parent) : NULL; +} + /* * get_current_misc_cg() - Find and get the misc cgroup of the current task. * @@ -104,6 +124,15 @@ static inline void put_misc_cg(struct misc_cg *cg) } #else /* !CONFIG_CGROUP_MISC */ +static inline struct misc_cg *misc_cg_root(void) +{ + return NULL; +} + +static inline struct misc_cg *misc_cg_parent(struct misc_cg *cg) +{ + return NULL; +} static inline unsigned long misc_cg_res_total_usage(enum misc_res_type type) { diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index 3d17afd5b7a8..e1e506847dea 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -24,6 +24,10 @@ static const char *const misc_res_name[] = { /* AMD SEV-ES ASIDs resource */ "sev_es", #endif +#ifdef CONFIG_CGROUP_SGX_EPC + /* Intel SGX EPC memory bytes */ + "sgx_epc", +#endif }; /* Root misc cgroup */ @@ -40,18 +44,13 @@ static struct misc_cg root_cg; static unsigned long misc_res_capacity[MISC_CG_RES_TYPES]; /** - * parent_misc() - Get the parent of the passed misc cgroup. - * @cgroup: cgroup whose parent needs to be fetched. - * - * Context: Any context. - * Return: - * * struct misc_cg* - Parent of the @cgroup. - * * %NULL - If @cgroup is null or the passed cgroup does not have a parent. + * misc_cg_root() - Return the root misc cgroup. */ -static struct misc_cg *parent_misc(struct misc_cg *cgroup) +struct misc_cg *misc_cg_root(void) { - return cgroup ? 
css_misc(cgroup->css.parent) : NULL; + return &root_cg; } +EXPORT_SYMBOL_GPL(misc_cg_root); /** * valid_type() - Check if @type is valid or not.
@@ -151,7 +150,7 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, if (!amount) return 0; - for (i = cg; i; i = parent_misc(i)) { + for (i = cg; i; i = misc_cg_parent(i)) { res = &i->res[type]; new_usage = atomic_long_add_return(amount, &res->usage);
@@ -164,12 +163,12 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, return 0; err_charge: - for (j = i; j; j = parent_misc(j)) { + for (j = i; j; j = misc_cg_parent(j)) { atomic_long_inc(&j->res[type].events); cgroup_file_notify(&j->events_file); } - for (j = cg; j != i; j = parent_misc(j)) + for (j = cg; j != i; j = misc_cg_parent(j)) misc_cg_cancel_charge(type, j, amount); misc_cg_cancel_charge(type, i, amount); return ret;
@@ -192,7 +191,7 @@ void misc_cg_uncharge(enum misc_res_type type, struct misc_cg *cg, if (!(amount && valid_type(type) && cg)) return; - for (i = cg; i; i = parent_misc(i)) + for (i = cg; i; i = misc_cg_parent(i)) misc_cg_cancel_charge(type, i, amount); } EXPORT_SYMBOL_GPL(misc_cg_uncharge);

From patchwork Fri Dec 2 18:36:53 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29075
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi, Sean Christopherson
Subject: [PATCH v2 17/18] x86/sgx: Add support for misc cgroup controller
Date: Fri, 2 Dec 2022 10:36:53 -0800
Message-Id: <20221202183655.3767674-18-kristen@linux.intel.com>
In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com>
References: <20221202183655.3767674-1-kristen@linux.intel.com>

Implement support for cgroup control of SGX Enclave Page Cache (EPC)
memory using the misc cgroup controller. EPC memory is independent from
normal system memory, e.g. it must be reserved at boot from RAM and
cannot be converted between EPC and normal memory while the system is
running.
EPC is managed by the SGX subsystem and is not accounted by the memory
controller. Much like normal system memory, EPC memory can be
overcommitted via virtual memory techniques and pages can be swapped
out of the EPC to their backing store (normal system memory, e.g.
shmem). The SGX EPC subsystem is analogous to the memory subsystem,
and the SGX EPC controller is in turn analogous to the memory
controller; it implements limit and protection models for EPC memory.

The misc controller provides a mechanism to set a hard limit of EPC
usage via the "sgx_epc" resource in "misc.max". The total EPC memory
available on the system is reported via the "sgx_epc" resource in
"misc.capacity".

This patch was modified from its original version to use the misc
cgroup controller instead of a custom controller.

Signed-off-by: Sean Christopherson
Signed-off-by: Kristen Carlson Accardi
Cc: Sean Christopherson
---
 arch/x86/Kconfig                     |  13 +
 arch/x86/kernel/cpu/sgx/Makefile     |   1 +
 arch/x86/kernel/cpu/sgx/epc_cgroup.c | 539 +++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/epc_cgroup.h |  59 +++
 arch/x86/kernel/cpu/sgx/main.c       |  86 ++++-
 arch/x86/kernel/cpu/sgx/sgx.h        |   6 +-
 6 files changed, 688 insertions(+), 16 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c
 create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f9920f1341c8..0eeae4ebe1c3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1936,6 +1936,19 @@ config X86_SGX

 	  If unsure, say N.

+config CGROUP_SGX_EPC
+	bool "Miscellaneous Cgroup Controller for Enclave Page Cache (EPC) for Intel SGX"
+	depends on X86_SGX && CGROUP_MISC
+	help
+	  Provides control over the EPC footprint of tasks in a cgroup via
+	  the Miscellaneous cgroup controller.
+
+	  EPC is a subset of regular memory that is usable only by SGX
+	  enclaves and is very limited in quantity, e.g. less than 1%
+	  of total DRAM.
+
+	  Say N if unsure.
+ config EFI bool "EFI runtime service support" depends on ACPI diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile index 9c1656779b2a..12901a488da7 100644 --- a/arch/x86/kernel/cpu/sgx/Makefile +++ b/arch/x86/kernel/cpu/sgx/Makefile @@ -4,3 +4,4 @@ obj-y += \ ioctl.o \ main.o obj-$(CONFIG_X86_SGX_KVM) += virt.o +obj-$(CONFIG_CGROUP_SGX_EPC) += epc_cgroup.o diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c new file mode 100644 index 000000000000..d668a67fde84 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c @@ -0,0 +1,539 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright(c) 2022 Intel Corporation. + +#include +#include +#include +#include +#include +#include + +#include "epc_cgroup.h" + +#define SGX_EPC_RECLAIM_MIN_PAGES 16UL +#define SGX_EPC_RECLAIM_MAX_PAGES 64UL +#define SGX_EPC_RECLAIM_IGNORE_AGE_THRESHOLD 5 +#define SGX_EPC_RECLAIM_OOM_THRESHOLD 5 + +static struct workqueue_struct *sgx_epc_cg_wq; + +struct sgx_epc_reclaim_control { + struct sgx_epc_cgroup *epc_cg; + int nr_fails; + bool ignore_age; +}; + +static inline unsigned long sgx_epc_cgroup_page_counter_read(struct sgx_epc_cgroup *epc_cg) +{ + return atomic_long_read(&epc_cg->cg->res[MISC_CG_RES_SGX_EPC].usage) / PAGE_SIZE; +} + +static inline unsigned long sgx_epc_cgroup_max_pages(struct sgx_epc_cgroup *epc_cg) +{ + return READ_ONCE(epc_cg->cg->res[MISC_CG_RES_SGX_EPC].max) / PAGE_SIZE; +} + +static inline struct sgx_epc_cgroup *sgx_epc_cgroup_from_misc_cg(struct misc_cg *cg) +{ + if (cg) + return (struct sgx_epc_cgroup *)(cg->res[MISC_CG_RES_SGX_EPC].priv); + + return NULL; +} + +static inline struct sgx_epc_cgroup *parent_epc_cgroup(struct sgx_epc_cgroup *epc_cg) +{ + return sgx_epc_cgroup_from_misc_cg(misc_cg_parent(epc_cg->cg)); +} + +static inline bool sgx_epc_cgroup_disabled(void) +{ + return !cgroup_subsys_enabled(misc_cgrp_subsys); +} + +/** + * sgx_epc_cgroup_iter - iterate over the EPC cgroup hierarchy + * @root: 
hierarchy root + * @prev: previously returned epc_cg, NULL on first invocation + * @reclaim_epoch: epoch for shared reclaim walks, NULL for full walks + * + * Return: references to children of the hierarchy below @root, or + * @root itself, or %NULL after a full round-trip. + * + * Caller must pass the return value in @prev on subsequent invocations + * for reference counting, or use sgx_epc_cgroup_iter_break() to cancel + * a hierarchy walk before the round-trip is complete. + */ +static struct sgx_epc_cgroup *sgx_epc_cgroup_iter(struct sgx_epc_cgroup *prev, + struct sgx_epc_cgroup *root, + unsigned long *reclaim_epoch) +{ + struct cgroup_subsys_state *css = NULL; + struct sgx_epc_cgroup *epc_cg = NULL; + struct sgx_epc_cgroup *pos = NULL; + bool inc_epoch = false; + + if (sgx_epc_cgroup_disabled()) + return NULL; + + if (!root) + root = sgx_epc_cgroup_from_misc_cg(misc_cg_root()); + + if (prev && !reclaim_epoch) + pos = prev; + + rcu_read_lock(); + +start: + if (reclaim_epoch) { + /* + * Abort the walk if a reclaimer working from the same root has + * started a new walk after this reclaimer has already scanned + * at least one cgroup. + */ + if (prev && *reclaim_epoch != root->epoch) + goto out; + + while (1) { + pos = READ_ONCE(root->reclaim_iter); + if (!pos || css_tryget(&pos->cg->css)) + break; + + /* + * The css is dying, clear the reclaim_iter immediately + * instead of waiting for ->css_released to be called. + * Busy waiting serves no purpose and attempting to wait + * for ->css_released may actually block it from being + * called. + */ + (void)cmpxchg(&root->reclaim_iter, pos, NULL); + } + } + + if (pos) + css = &pos->cg->css; + + while (!epc_cg) { + struct misc_cg *cg; + + css = css_next_descendant_pre(css, &root->cg->css); + if (!css) { + /* + * Increment the epoch as we've reached the end of the + * tree and the next call to css_next_descendant_pre + * will restart at root. 
Do not update root->epoch + * directly as we should only do so if we update the + * reclaim_iter, i.e. a different thread may win the + * race and update the epoch for us. + */ + inc_epoch = true; + + /* + * Reclaimers share the hierarchy walk, and a new one + * might jump in at the end of the hierarchy. Restart + * at root so that we don't return NULL on a thread's + * initial call. + */ + if (!prev) + continue; + break; + } + + cg = css_misc(css); + /* + * Verify the css and acquire a reference. Don't take an + * extra reference to root as it's either the global root + * or is provided by the caller and so is guaranteed to be + * alive. Keep walking if this css is dying. + */ + if (cg != root->cg && !css_tryget(&cg->css)) + continue; + + epc_cg = sgx_epc_cgroup_from_misc_cg(cg); + } + + if (reclaim_epoch) { + /* + * reclaim_iter could have already been updated by a competing + * thread; check that the value hasn't changed since we read + * it to avoid reclaiming from the same cgroup twice. If the + * value did change, put all of our references and restart the + * entire process, for all intents and purposes we're making a + * new call. 
+ */ + if (cmpxchg(&root->reclaim_iter, pos, epc_cg) != pos) { + if (epc_cg && epc_cg != root) + put_misc_cg(epc_cg->cg); + if (pos) + put_misc_cg(pos->cg); + css = NULL; + epc_cg = NULL; + inc_epoch = false; + goto start; + } + + if (inc_epoch) + root->epoch++; + if (!prev) + *reclaim_epoch = root->epoch; + + if (pos) + put_misc_cg(pos->cg); + } + +out: + rcu_read_unlock(); + if (prev && prev != root) + put_misc_cg(prev->cg); + + return epc_cg; +} + +/** + * sgx_epc_cgroup_iter_break - abort a hierarchy walk prematurely + * @prev: last visited cgroup as returned by sgx_epc_cgroup_iter() + * @root: hierarchy root + */ +static void sgx_epc_cgroup_iter_break(struct sgx_epc_cgroup *prev, + struct sgx_epc_cgroup *root) +{ + if (!root) + root = sgx_epc_cgroup_from_misc_cg(misc_cg_root()); + if (prev && prev != root) + put_misc_cg(prev->cg); +} + +/** + * sgx_epc_cgroup_lru_empty - check if a cgroup tree has no pages on its lrus + * @root: root of the tree to check + * + * Return: %true if all cgroups under the specified root have empty LRU lists. + * Used to avoid livelocks due to a cgroup having a non-zero charge count but + * no pages on its LRUs, e.g. due to a dead enclave waiting to be released or + * because all pages in the cgroup are unreclaimable. + */ +bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root) +{ + struct sgx_epc_cgroup *epc_cg; + + for (epc_cg = sgx_epc_cgroup_iter(NULL, root, NULL); + epc_cg; + epc_cg = sgx_epc_cgroup_iter(epc_cg, root, NULL)) { + if (!list_empty(&epc_cg->lru.reclaimable)) { + sgx_epc_cgroup_iter_break(epc_cg, root); + return false; + } + } + return true; +} + +/** + * sgx_epc_cgroup_isolate_pages - walk a cgroup tree and separate pages + * @root: root of the tree to start walking + * @nr_to_scan: The number of pages that need to be isolated + * @dst: Destination list to hold the isolated pages + * + * Walk the cgroup tree and isolate the pages in the hierarchy + * for reclaiming. 
+ */
+void sgx_epc_cgroup_isolate_pages(struct sgx_epc_cgroup *root,
+				  int *nr_to_scan, struct list_head *dst)
+{
+	struct sgx_epc_cgroup *epc_cg;
+	unsigned long epoch;
+
+	if (!*nr_to_scan)
+		return;
+
+	for (epc_cg = sgx_epc_cgroup_iter(NULL, root, &epoch);
+	     epc_cg;
+	     epc_cg = sgx_epc_cgroup_iter(epc_cg, root, &epoch)) {
+		sgx_isolate_epc_pages(&epc_cg->lru, nr_to_scan, dst);
+		if (!*nr_to_scan) {
+			sgx_epc_cgroup_iter_break(epc_cg, root);
+			break;
+		}
+	}
+}
+
+static int sgx_epc_cgroup_reclaim_pages(unsigned long nr_pages,
+					struct sgx_epc_reclaim_control *rc)
+{
+	/*
+	 * Ensure sgx_reclaim_pages is called with a minimum and maximum
+	 * number of pages.  Attempting to reclaim only a few pages will
+	 * often fail and is inefficient, while reclaiming a huge number
+	 * of pages can result in soft lockups due to holding various
+	 * locks for an extended duration.
+	 */
+	nr_pages = max(nr_pages, SGX_EPC_RECLAIM_MIN_PAGES);
+	nr_pages = min(nr_pages, SGX_EPC_RECLAIM_MAX_PAGES);
+
+	return sgx_reclaim_epc_pages(nr_pages, rc->ignore_age, rc->epc_cg);
+}
+
+static int sgx_epc_cgroup_reclaim_failed(struct sgx_epc_reclaim_control *rc)
+{
+	if (sgx_epc_cgroup_lru_empty(rc->epc_cg))
+		return -ENOMEM;
+
+	++rc->nr_fails;
+	if (rc->nr_fails > SGX_EPC_RECLAIM_IGNORE_AGE_THRESHOLD)
+		rc->ignore_age = true;
+
+	return 0;
+}
+
+static inline
+void sgx_epc_reclaim_control_init(struct sgx_epc_reclaim_control *rc,
+				  struct sgx_epc_cgroup *epc_cg)
+{
+	rc->epc_cg = epc_cg;
+	rc->nr_fails = 0;
+	rc->ignore_age = false;
+}
+
+/*
+ * Scheduled by sgx_epc_cgroup_try_charge() to reclaim pages from the
+ * cgroup when the cgroup is at/near its maximum capacity.
+ */
+static void sgx_epc_cgroup_reclaim_work_func(struct work_struct *work)
+{
+	struct sgx_epc_reclaim_control rc;
+	struct sgx_epc_cgroup *epc_cg;
+	unsigned long cur, max;
+
+	epc_cg = container_of(work, struct sgx_epc_cgroup, reclaim_work);
+
+	sgx_epc_reclaim_control_init(&rc, epc_cg);
+
+	for (;;)
{ + max = sgx_epc_cgroup_max_pages(epc_cg); + + /* + * Adjust the limit down by one page, the goal is to free up + * pages for fault allocations, not to simply obey the limit. + * Conditionally decrementing max also means the cur vs. max + * check will correctly handle the case where both are zero. + */ + if (max) + max--; + + /* + * Unless the limit is extremely low, in which case forcing + * reclaim will likely cause thrashing, force the cgroup to + * reclaim at least once if it's operating *near* its maximum + * limit by adjusting @max down by half the min reclaim size. + * This work func is scheduled by sgx_epc_cgroup_try_charge + * when it cannot directly reclaim due to being in an atomic + * context, e.g. EPC allocation in a fault handler. Waiting + * to reclaim until the cgroup is actually at its limit is less + * performant as it means the faulting task is effectively + * blocked until a worker makes its way through the global work + * queue. + */ + if (max > SGX_EPC_RECLAIM_MAX_PAGES) + max -= (SGX_EPC_RECLAIM_MIN_PAGES/2); + + cur = sgx_epc_cgroup_page_counter_read(epc_cg); + if (cur <= max) + break; + + if (!sgx_epc_cgroup_reclaim_pages(cur - max, &rc)) { + if (sgx_epc_cgroup_reclaim_failed(&rc)) + break; + } + } +} + +static int __sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg, + unsigned long nr_pages, bool reclaim) +{ + struct sgx_epc_reclaim_control rc; + unsigned long cur, max, over; + unsigned int nr_empty = 0; + + if (epc_cg == sgx_epc_cgroup_from_misc_cg(misc_cg_root())) { + misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, + nr_pages * PAGE_SIZE); + return 0; + } + + sgx_epc_reclaim_control_init(&rc, NULL); + + for (;;) { + if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, + nr_pages * PAGE_SIZE)) + break; + + rc.epc_cg = epc_cg; + max = sgx_epc_cgroup_max_pages(rc.epc_cg); + if (nr_pages > max) + return -ENOMEM; + + if (signal_pending(current)) + return -ERESTARTSYS; + + if (!reclaim) { + queue_work(sgx_epc_cg_wq, 
&rc.epc_cg->reclaim_work); + return -EBUSY; + } + + cur = sgx_epc_cgroup_page_counter_read(rc.epc_cg); + over = ((cur + nr_pages) > max) ? + (cur + nr_pages) - max : SGX_EPC_RECLAIM_MIN_PAGES; + + if (!sgx_epc_cgroup_reclaim_pages(over, &rc)) { + if (sgx_epc_cgroup_reclaim_failed(&rc)) { + if (++nr_empty > SGX_EPC_RECLAIM_OOM_THRESHOLD) + return -ENOMEM; + schedule(); + } + } + } + + css_get_many(&epc_cg->cg->css, nr_pages); + + return 0; +} + + +/** + * sgx_epc_cgroup_try_charge - hierarchically try to charge a single EPC page + * @mm: the mm_struct of the process to charge + * @reclaim: whether or not synchronous reclaim is allowed + * + * Returns EPC cgroup or NULL on success, -errno on failure. + */ +struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(struct mm_struct *mm, + bool reclaim) +{ + struct sgx_epc_cgroup *epc_cg; + int ret; + + if (sgx_epc_cgroup_disabled()) + return NULL; + + epc_cg = sgx_epc_cgroup_from_misc_cg(get_current_misc_cg()); + ret = __sgx_epc_cgroup_try_charge(epc_cg, 1, reclaim); + put_misc_cg(epc_cg->cg); + + if (ret) + return ERR_PTR(ret); + + return epc_cg; +} + +/** + * sgx_epc_cgroup_uncharge - hierarchically uncharge EPC pages + * @epc_cg: the charged epc cgroup + */ +void sgx_epc_cgroup_uncharge(struct sgx_epc_cgroup *epc_cg) +{ + if (sgx_epc_cgroup_disabled()) + return; + + misc_cg_uncharge(MISC_CG_RES_SGX_EPC, epc_cg->cg, PAGE_SIZE); + + if (epc_cg->cg != misc_cg_root()) + put_misc_cg(epc_cg->cg); +} + +static void sgx_epc_cgroup_oom(struct sgx_epc_cgroup *root) +{ + struct sgx_epc_cgroup *epc_cg; + + for (epc_cg = sgx_epc_cgroup_iter(NULL, root, NULL); + epc_cg; + epc_cg = sgx_epc_cgroup_iter(epc_cg, root, NULL)) { + if (sgx_epc_oom(&epc_cg->lru)) { + sgx_epc_cgroup_iter_break(epc_cg, root); + return; + } + } +} + +static void sgx_epc_cgroup_released(struct misc_cg *cg) +{ + struct sgx_epc_cgroup *dead_cg; + struct sgx_epc_cgroup *epc_cg; + + epc_cg = sgx_epc_cgroup_from_misc_cg(cg); + dead_cg = epc_cg; + + while ((epc_cg = 
parent_epc_cgroup(epc_cg))) + cmpxchg(&epc_cg->reclaim_iter, dead_cg, NULL); +} + +static void sgx_epc_cgroup_free(struct misc_cg *cg) +{ + struct sgx_epc_cgroup *epc_cg; + + epc_cg = sgx_epc_cgroup_from_misc_cg(cg); + cancel_work_sync(&epc_cg->reclaim_work); + kfree(epc_cg); +} + +static void sgx_epc_cgroup_max_write(struct misc_cg *cg) +{ + struct sgx_epc_reclaim_control rc; + struct sgx_epc_cgroup *epc_cg; + unsigned int nr_empty = 0; + unsigned long cur, max; + + epc_cg = sgx_epc_cgroup_from_misc_cg(cg); + + sgx_epc_reclaim_control_init(&rc, epc_cg); + + max = sgx_epc_cgroup_max_pages(epc_cg); + + for (;;) { + cur = sgx_epc_cgroup_page_counter_read(epc_cg); + if (cur <= max) + break; + + if (signal_pending(current)) + break; + + if (!sgx_epc_cgroup_reclaim_pages(cur - max, &rc)) { + if (sgx_epc_cgroup_reclaim_failed(&rc)) { + if (++nr_empty > SGX_EPC_RECLAIM_OOM_THRESHOLD) + sgx_epc_cgroup_oom(epc_cg); + schedule(); + } + } + } +} + +static int sgx_epc_cgroup_alloc(struct misc_cg *cg) +{ + struct sgx_epc_cgroup *epc_cg; + + epc_cg = kzalloc(sizeof(struct sgx_epc_cgroup), GFP_KERNEL); + if (!epc_cg) + return -ENOMEM; + + sgx_lru_init(&epc_cg->lru); + INIT_WORK(&epc_cg->reclaim_work, sgx_epc_cgroup_reclaim_work_func); + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_alloc = sgx_epc_cgroup_alloc; + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_free = sgx_epc_cgroup_free; + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_released = sgx_epc_cgroup_released; + cg->res[MISC_CG_RES_SGX_EPC].misc_cg_max_write = sgx_epc_cgroup_max_write; + cg->res[MISC_CG_RES_SGX_EPC].priv = epc_cg; + epc_cg->cg = cg; + return 0; +} + +static int __init sgx_epc_cgroup_init(void) +{ + if (!boot_cpu_has(X86_FEATURE_SGX)) + return 0; + + sgx_epc_cg_wq = alloc_workqueue("sgx_epc_cg_wq", + WQ_UNBOUND | WQ_FREEZABLE, + WQ_UNBOUND_MAX_ACTIVE); + BUG_ON(!sgx_epc_cg_wq); + + return sgx_epc_cgroup_alloc(misc_cg_root()); +} +subsys_initcall(sgx_epc_cgroup_init); diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.h 
b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
new file mode 100644
index 000000000000..bc358934dbe2
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2022 Intel Corporation. */
+#ifndef _INTEL_SGX_EPC_CGROUP_H_
+#define _INTEL_SGX_EPC_CGROUP_H_
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "sgx.h"
+
+#ifndef CONFIG_CGROUP_SGX_EPC
+#define MISC_CG_RES_SGX_EPC MISC_CG_RES_TYPES
+struct sgx_epc_cgroup;
+
+static inline struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(struct mm_struct *mm,
+							       bool reclaim)
+{
+	return NULL;
+}
+static inline void sgx_epc_cgroup_uncharge(struct sgx_epc_cgroup *epc_cg) { }
+static inline void sgx_epc_cgroup_isolate_pages(struct sgx_epc_cgroup *root,
+						int *nr_to_scan,
+						struct list_head *dst) { }
+static inline struct sgx_epc_lru_lists *epc_cg_lru(struct sgx_epc_cgroup *epc_cg)
+{
+	return NULL;
+}
+static inline bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root)
+{
+	return true;
+}
+#else
+struct sgx_epc_cgroup {
+	struct misc_cg *cg;
+	struct sgx_epc_lru_lists lru;
+	struct sgx_epc_cgroup *reclaim_iter;
+	struct work_struct reclaim_work;
+	unsigned int epoch;
+};
+
+struct sgx_epc_cgroup *sgx_epc_cgroup_try_charge(struct mm_struct *mm,
+						 bool reclaim);
+void sgx_epc_cgroup_uncharge(struct sgx_epc_cgroup *epc_cg);
+bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root);
+void sgx_epc_cgroup_isolate_pages(struct sgx_epc_cgroup *root,
+				  int *nr_to_scan, struct list_head *dst);
+static inline struct sgx_epc_lru_lists *epc_cg_lru(struct sgx_epc_cgroup *epc_cg)
+{
+	if (epc_cg)
+		return &epc_cg->lru;
+	return NULL;
+}
+#endif
+
+#endif /* _INTEL_SGX_EPC_CGROUP_H_ */
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 70046c4e332a..a9d5cfd4e024 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@
-17,6 +18,7 @@ #include "driver.h" #include "encl.h" #include "encls.h" +#include "epc_cgroup.h" #define SGX_MAX_NR_TO_RECLAIM 32 @@ -33,9 +35,20 @@ static DEFINE_XARRAY(sgx_epc_address_space); static struct sgx_epc_lru_lists sgx_global_lru; static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page) { + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) + return epc_cg_lru(epc_page->epc_cg); + return &sgx_global_lru; } +static inline bool sgx_can_reclaim(void) +{ + if (!IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) + return !list_empty(&sgx_global_lru.reclaimable); + + return !sgx_epc_cgroup_lru_empty(NULL); +} + static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); /* Nodes with one or more EPC sections. */ @@ -320,9 +333,10 @@ void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, int *nr_to_scan, } /** - * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers + * __sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers * @nr_to_scan: Number of EPC pages to scan for reclaim * @ignore_age: Reclaim a page even if it is young + * @epc_cg: EPC cgroup from which to reclaim * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have @@ -336,7 +350,8 @@ void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, int *nr_to_scan, * problematic as it would increase the lock contention too much, which would * halt forward progress. 
 */
-static int __sgx_reclaim_pages(int nr_to_scan, bool ignore_age)
+static int __sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age,
+				   struct sgx_epc_cgroup *epc_cg)
 {
 	struct sgx_backing backing[SGX_MAX_NR_TO_RECLAIM];
 	struct sgx_epc_page *epc_page, *tmp;
@@ -347,7 +362,15 @@ static int __sgx_reclaim_pages(int nr_to_scan, bool ignore_age)
 	int i = 0;
 	int ret;

-	sgx_isolate_epc_pages(&sgx_global_lru, &nr_to_scan, &iso);
+	/*
+	 * If a specific cgroup is not being targeted, take from the global
+	 * list first, even when cgroups are enabled.  If there are
+	 * pages on the global LRU then they should get reclaimed asap.
+	 */
+	if (!IS_ENABLED(CONFIG_CGROUP_SGX_EPC) || !epc_cg)
+		sgx_isolate_epc_pages(&sgx_global_lru, &nr_to_scan, &iso);
+
+	sgx_epc_cgroup_isolate_pages(epc_cg, &nr_to_scan, &iso);

 	if (list_empty(&iso))
 		return 0;
@@ -397,25 +420,33 @@ static int __sgx_reclaim_pages(int nr_to_scan, bool ignore_age)
 			       SGX_EPC_PAGE_ENCLAVE | SGX_EPC_PAGE_VERSION_ARRAY);

+		if (epc_page->epc_cg) {
+			sgx_epc_cgroup_uncharge(epc_page->epc_cg);
+			epc_page->epc_cg = NULL;
+		}
+
 		sgx_free_epc_page(epc_page);
 	}

 	return i;
 }

-int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age)
+/**
+ * sgx_reclaim_epc_pages() - wrapper for __sgx_reclaim_epc_pages() which
+ * calls cond_resched() upon completion.
+ * @nr_to_scan: Number of EPC pages to scan for reclaim + * @ignore_age: Reclaim a page even if it is young + * @epc_cg: EPC cgroup from which to reclaim + */ +int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age, + struct sgx_epc_cgroup *epc_cg) { int ret; - ret = __sgx_reclaim_pages(nr_to_scan, ignore_age); + ret = __sgx_reclaim_epc_pages(nr_to_scan, ignore_age, epc_cg); cond_resched(); return ret; } -static bool sgx_can_reclaim(void) -{ - return !list_empty(&sgx_global_lru.reclaimable); -} - static bool sgx_should_reclaim(unsigned long watermark) { return atomic_long_read(&sgx_nr_free_pages) < watermark && @@ -432,7 +463,7 @@ static bool sgx_should_reclaim(unsigned long watermark) void sgx_reclaim_direct(void) { if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) - __sgx_reclaim_pages(SGX_NR_TO_SCAN, false); + __sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); } static int ksgxd(void *p) @@ -458,7 +489,7 @@ static int ksgxd(void *p) sgx_should_reclaim(SGX_NR_HIGH_PAGES)); if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); } return 0; @@ -620,6 +651,11 @@ int sgx_drop_epc_page(struct sgx_epc_page *page) struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) { struct sgx_epc_page *page; + struct sgx_epc_cgroup *epc_cg; + + epc_cg = sgx_epc_cgroup_try_charge(current->mm, reclaim); + if (IS_ERR(epc_cg)) + return ERR_CAST(epc_cg); for ( ; ; ) { page = __sgx_alloc_epc_page(); @@ -628,8 +664,10 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - if (!sgx_can_reclaim()) - return ERR_PTR(-ENOMEM); + if (!sgx_can_reclaim()) { + page = ERR_PTR(-ENOMEM); + break; + } if (!reclaim) { page = ERR_PTR(-EBUSY); @@ -641,7 +679,14 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) break; } - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); + } + + if (!IS_ERR(page)) { + 
WARN_ON(page->epc_cg); + page->epc_cg = epc_cg; + } else { + sgx_epc_cgroup_uncharge(epc_cg); } if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) @@ -674,6 +719,12 @@ void sgx_free_epc_page(struct sgx_epc_page *page) page->flags = SGX_EPC_PAGE_IS_FREE; spin_unlock(&node->lock); + + if (page->epc_cg) { + sgx_epc_cgroup_uncharge(page->epc_cg); + page->epc_cg = NULL; + } + atomic_long_inc(&sgx_nr_free_pages); } @@ -838,6 +889,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].flags = 0; section->pages[i].encl_owner = NULL; section->pages[i].poison = 0; + section->pages[i].epc_cg = NULL; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -1002,6 +1054,7 @@ static void __init arch_update_sysfs_visibility(int nid) {} static bool __init sgx_page_cache_init(void) { u32 eax, ebx, ecx, edx, type; + u64 capacity = 0; u64 pa, size; int nid; int i; @@ -1052,6 +1105,7 @@ static bool __init sgx_page_cache_init(void) sgx_epc_sections[i].node = &sgx_numa_nodes[nid]; sgx_numa_nodes[nid].size += size; + capacity += size; sgx_nr_epc_sections++; } @@ -1061,6 +1115,8 @@ static bool __init sgx_page_cache_init(void) return false; } + misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, capacity); + return true; } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 1c666b25294b..defb48f51145 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -35,6 +35,8 @@ #define SGX_EPC_PAGE_ENCLAVE BIT(4) #define SGX_EPC_PAGE_VERSION_ARRAY BIT(5) +struct sgx_epc_cgroup; + struct sgx_epc_page { unsigned int section; u16 flags; @@ -46,6 +48,7 @@ struct sgx_epc_page { struct sgx_encl *encl; }; struct list_head list; + struct sgx_epc_cgroup *epc_cg; }; /* @@ -206,7 +209,8 @@ void sgx_reclaim_direct(void); void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags); int sgx_drop_epc_page(struct sgx_epc_page *page); struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); -int 
sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age);
+int sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age,
+			  struct sgx_epc_cgroup *epc_cg);
 void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lrus, int *nr_to_scan,
			    struct list_head *dst);
 bool sgx_epc_oom(struct sgx_epc_lru_lists *lrus);

From patchwork Fri Dec 2 18:36:54 2022
X-Patchwork-Submitter: Kristen Carlson Accardi
X-Patchwork-Id: 29077
From: Kristen Carlson Accardi
To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, cgroups@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin" , Jonathan Corbet Cc: zhiquan1.li@intel.com, Kristen Carlson Accardi , Sean Christopherson , Bagas Sanjaya , linux-doc@vger.kernel.org Subject: [PATCH v2 18/18] Docs/x86/sgx: Add description for cgroup support Date: Fri, 2 Dec 2022 10:36:54 -0800 Message-Id: <20221202183655.3767674-19-kristen@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221202183655.3767674-1-kristen@linux.intel.com> References: <20221202183655.3767674-1-kristen@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1751129046827811408?= X-GMAIL-MSGID: =?utf-8?q?1751129046827811408?= Add initial documentation of how to regulate the distribution of SGX Enclave Page Cache (EPC) memory via the Miscellaneous cgroup controller. Signed-off-by: Sean Christopherson Signed-off-by: Kristen Carlson Accardi Cc: Sean Christopherson Reviewed-by: Bagas Sanjaya --- Documentation/x86/sgx.rst | 77 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst index 2bcbffacbed5..f6ca5594dcf2 100644 --- a/Documentation/x86/sgx.rst +++ b/Documentation/x86/sgx.rst @@ -300,3 +300,80 @@ to expected failures and handle them as follows: first call. It indicates a bug in the kernel or the userspace client if any of the second round of ``SGX_IOC_VEPC_REMOVE_ALL`` calls has a return code other than 0. 
+
+
+Cgroup Support
+==============
+
+The "sgx_epc" resource within the Miscellaneous cgroup controller regulates
+distribution of SGX EPC memory, which is a subset of system RAM that
+is used to provide SGX-enabled applications with protected memory,
+and is otherwise inaccessible, i.e. shows up as reserved in
+/proc/iomem and cannot be read/written outside of an SGX enclave.
+
+Although current systems implement EPC by stealing memory from RAM,
+for all intents and purposes the EPC is independent from normal system
+memory, e.g. must be reserved at boot from RAM and cannot be converted
+between EPC and normal memory while the system is running.  The EPC is
+managed by the SGX subsystem and is not accounted by the memory
+controller.  Note that this is true only for EPC memory itself, i.e.
+normal memory allocations related to SGX and EPC memory, e.g. the
+backing memory for evicted EPC pages, are accounted, limited and
+protected by the memory controller.
+
+Much like normal system memory, EPC memory can be overcommitted via
+virtual memory techniques and pages can be swapped out of the EPC
+to their backing store (normal system memory allocated via shmem).
+The SGX EPC subsystem is analogous to the memory subsystem, and
+it implements limit and protection models for EPC memory.
+
+SGX EPC Interface Files
+-----------------------
+
+For a generic description of the Miscellaneous controller interface
+files, please see Documentation/admin-guide/cgroup-v2.rst
+
+All SGX EPC memory amounts are in bytes unless explicitly stated
+otherwise.  If a value which is not PAGE_SIZE aligned is written,
+the actual value used by the controller will be rounded down to
+the closest PAGE_SIZE multiple.
+
+    misc.capacity
+        A read-only flat-keyed file shown only in the root cgroup.
+        The sgx_epc resource will show the total amount of EPC
+        memory available on the platform.
+
+    misc.current
+        A read-only flat-keyed file shown in the non-root cgroups.
+        The sgx_epc resource will show the current active EPC memory
+        usage of the cgroup and its descendants.  EPC pages that are
+        swapped out to backing RAM are not included in the current count.
+
+    misc.max
+        A read-write single value file which exists on non-root
+        cgroups.  The sgx_epc resource will show the EPC usage
+        hard limit.  The default is "max".
+
+        If a cgroup's EPC usage reaches this limit, EPC allocations,
+        e.g. for page fault handling, will be blocked until EPC can
+        be reclaimed from the cgroup.  If EPC cannot be reclaimed in
+        a timely manner, reclaim will be forced, e.g. by ignoring LRU.
+
+    misc.events
+        A read-write flat-keyed file which exists on non-root cgroups.
+        Writes to the file reset the event counters to zero.  A value
+        change in this file generates a file modified event.
+
+        max
+            The number of times the cgroup has triggered a reclaim
+            due to its EPC usage approaching (or exceeding) its max
+            EPC boundary.
+
+Migration
+---------
+
+Once an EPC page is charged to a cgroup (during allocation), it
+remains charged to the original cgroup until the page is released
+or reclaimed.  Migrating a process to a different cgroup doesn't
+move the EPC charges that it incurred while in the previous cgroup
+to its new cgroup.
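
For illustration only (not part of the patch), the interface files described
above might be exercised from a shell roughly as follows.  The mount point
/sys/fs/cgroup, the cgroup name "sgx-workload", and the 4 KiB page size are
assumptions; the commands require a cgroup v2 hierarchy with the misc
controller enabled and SGX EPC support:

```shell
# Hypothetical admin sketch -- paths and the "sgx-workload" name are
# examples, not mandated by the patch.

# Enable the misc controller for children of the root cgroup.
echo "+misc" > /sys/fs/cgroup/cgroup.subtree_control

# Create a cgroup and cap its EPC usage at 128 MiB.  misc.max is
# flat-keyed: "<resource> <value>".
mkdir /sys/fs/cgroup/sgx-workload
echo "sgx_epc 134217728" > /sys/fs/cgroup/sgx-workload/misc.max

# Total EPC available on the platform (root cgroup only).
grep sgx_epc /sys/fs/cgroup/misc.capacity

# Current active EPC usage of the cgroup and its descendants.
grep sgx_epc /sys/fs/cgroup/sgx-workload/misc.current

# Written values that are not PAGE_SIZE aligned are rounded down; with
# 4 KiB pages, a write of 134217730 behaves as:
echo $(( (134217730 / 4096) * 4096 ))   # 134217728
```

The flat-keyed "sgx_epc <bytes>" format follows the generic misc controller
convention described in Documentation/admin-guide/cgroup-v2.rst.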