[v5,11/18] x86/sgx: store unreclaimable pages in LRU lists
Commit Message
From: Sean Christopherson <sean.j.christopherson@intel.com>
When an OOM event occurs, all pages associated with an enclave will need
to be freed, including pages that are not currently tracked by the
cgroup LRU lists.
Add a new "unreclaimable" list to the sgx_epc_lru_lists struct and
update the "sgx_record/drop_epc_pages()" functions for adding/removing
VA and SECS pages to/from this "unreclaimable" list.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com>
Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
---
V4:
- Updates for patch reordering.
- Revised commit messages.
- Revised comments for the list.
V3:
- Removed tracking virtual EPC pages in unreclaimable list as host
kernel does not reclaim them. The EPC cgroups implemented later only
blocks allocating for a guest if the limit is reached by returning
-ENOMEM from sgx_alloc_epc_page() called by virt_epc, and does nothing
else. Therefore, no need to track those in LRU lists.
---
arch/x86/kernel/cpu/sgx/encl.c | 2 ++
arch/x86/kernel/cpu/sgx/ioctl.c | 1 +
arch/x86/kernel/cpu/sgx/main.c | 3 +++
arch/x86/kernel/cpu/sgx/sgx.h | 8 +++++++-
4 files changed, 13 insertions(+), 1 deletion(-)
Comments
On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
>
> When an OOM event occurs, all pages associated with an enclave will need
> to be freed, including pages that are not currently tracked by the
> cgroup LRU lists.
What are "cgroup LRU lists"?
>
> Add a new "unreclaimable" list to the sgx_epc_lru_lists struct and
> update the "sgx_record/drop_epc_pages()" functions for adding/removing
> VA and SECS pages to/from this "unreclaimable" list.
Sorry I don't follow the logic between the two paragraphs.
What is the exact problem? How does the new "unreclaimable" list solve the
problem?
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -746,6 +746,7 @@ void sgx_encl_release(struct kref *ref)
> xa_destroy(&encl->page_array);
>
> if (!encl->secs_child_cnt && encl->secs.epc_page) {
> + sgx_drop_epc_page(encl->secs.epc_page);
> sgx_encl_free_epc_page(encl->secs.epc_page);
> encl->secs.epc_page = NULL;
> }
The "record" of SECS/VA pages should be done together with this. I see no
reason why the "record" and "drop" are separated into different patches.
On Thu, 28 Sep 2023 04:41:33 -0500, Huang, Kai <kai.huang@intel.com> wrote:
>
>> --- a/arch/x86/kernel/cpu/sgx/encl.c
>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
>> @@ -746,6 +746,7 @@ void sgx_encl_release(struct kref *ref)
>> xa_destroy(&encl->page_array);
>>
>> if (!encl->secs_child_cnt && encl->secs.epc_page) {
>> + sgx_drop_epc_page(encl->secs.epc_page);
>> sgx_encl_free_epc_page(encl->secs.epc_page);
>> encl->secs.epc_page = NULL;
>> }
>
> The "record" of SECS/VA pages should be done together with this. I see
> no
> reason why the "record" and "drop" are separated into different patches.
"record" of SECS/VA pages are done in this patch. Before nothing done in
"record" for them because no tracking LRU lists for them. Now they are
tracked.
Thanks
Haitao
On Wed, 27 Sep 2023 06:57:18 -0500, Huang, Kai <kai.huang@intel.com> wrote:
> On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote:
>> From: Sean Christopherson <sean.j.christopherson@intel.com>
>>
>> When an OOM event occurs, all pages associated with an enclave will need
>> to be freed, including pages that are not currently tracked by the
>> cgroup LRU lists.
>
> What are "cgroup LRU lists"?
>
Will reword it. At them moment there is only one global sgx_epc_lru_lists.
>>
>> Add a new "unreclaimable" list to the sgx_epc_lru_lists struct and
>> update the "sgx_record/drop_epc_pages()" functions for adding/removing
>> VA and SECS pages to/from this "unreclaimable" list.
>
> Sorry I don't follow the logic between the two paragraphs.
>
> What is the exact problem? How does the new "unreclaimable" list solve
> the
> problem?
>
>
Currently they are not tracked in a list managed by the ksgxd (future
cgroup worker). So add a list to track them.
Thanks
Haitao
On Tue, 2023-10-03 at 00:15 -0500, Haitao Huang wrote:
> On Thu, 28 Sep 2023 04:41:33 -0500, Huang, Kai <kai.huang@intel.com> wrote:
>
> >
> > > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > > @@ -746,6 +746,7 @@ void sgx_encl_release(struct kref *ref)
> > > xa_destroy(&encl->page_array);
> > >
> > > if (!encl->secs_child_cnt && encl->secs.epc_page) {
> > > + sgx_drop_epc_page(encl->secs.epc_page);
> > > sgx_encl_free_epc_page(encl->secs.epc_page);
> > > encl->secs.epc_page = NULL;
> > > }
> >
> > The "record" of SECS/VA pages should be done together with this. I see
> > no
> > reason why the "record" and "drop" are separated into different patches.
>
> "record" of SECS/VA pages are done in this patch. Before nothing done in
> "record" for them because no tracking LRU lists for them. Now they are
> tracked.
>
>
I was talking about calling sgx_record_epc_page() for SECS/VA:
@@ -113,6 +113,9 @@ static int sgx_encl_create(struct sgx_encl *encl, struct
sgx_secs *secs)
encl->attributes = secs->attributes;
encl->attributes_mask = SGX_ATTR_UNPRIV_MASK;
+ sgx_record_epc_page(encl->secs.epc_page,
+ SGX_EPC_PAGE_UNRECLAIMABLE);
This piece of code *literally* does the record.
@@ -746,6 +746,7 @@ void sgx_encl_release(struct kref *ref)
xa_destroy(&encl->page_array);
if (!encl->secs_child_cnt && encl->secs.epc_page) {
+ sgx_drop_epc_page(encl->secs.epc_page);
sgx_encl_free_epc_page(encl->secs.epc_page);
encl->secs.epc_page = NULL;
}
@@ -754,6 +755,7 @@ void sgx_encl_release(struct kref *ref)
va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
list);
list_del(&va_page->list);
+ sgx_drop_epc_page(va_page->epc_page);
sgx_encl_free_epc_page(va_page->epc_page);
kfree(va_page);
}
@@ -48,6 +48,7 @@ void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
encl->page_cnt--;
if (va_page) {
+ sgx_drop_epc_page(va_page->epc_page);
sgx_encl_free_epc_page(va_page->epc_page);
list_del(&va_page->list);
kfree(va_page);
@@ -268,6 +268,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
goto out;
sgx_encl_ewb(encl->secs.epc_page, &secs_backing);
+ sgx_drop_epc_page(encl->secs.epc_page);
sgx_encl_free_epc_page(encl->secs.epc_page);
encl->secs.epc_page = NULL;
@@ -510,6 +511,8 @@ void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags)
page->flags |= flags;
if (sgx_epc_page_reclaimable(flags))
list_add_tail(&page->list, &sgx_global_lru.reclaimable);
+ else
+ list_add_tail(&page->list, &sgx_global_lru.unreclaimable);
spin_unlock(&sgx_global_lru.lock);
}
@@ -152,17 +152,23 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page)
}
/*
- * Tracks EPC pages reclaimable by the reclaimer (ksgxd).
+ * Contains EPC pages tracked by the reclaimer (ksgxd).
*/
struct sgx_epc_lru_lists {
spinlock_t lock;
struct list_head reclaimable;
+ /*
+ * Tracks SECS, VA pages,etc., pages only freeable after all its
+ * dependent reclaimables are freed.
+ */
+ struct list_head unreclaimable;
};
static inline void sgx_lru_init(struct sgx_epc_lru_lists *lrus)
{
spin_lock_init(&lrus->lock);
INIT_LIST_HEAD(&lrus->reclaimable);
+ INIT_LIST_HEAD(&lrus->unreclaimable);
}
struct sgx_epc_page *__sgx_alloc_epc_page(void);