[v2,01/18] x86/sgx: Call cond_resched() at the end of sgx_reclaim_pages()

Message ID: 20221202183655.3767674-2-kristen@linux.intel.com
State: New
Series: Add Cgroup support for SGX EPC memory

Commit Message

Kristen Carlson Accardi Dec. 2, 2022, 6:36 p.m. UTC
  From: Sean Christopherson <sean.j.christopherson@intel.com>

In order to avoid repetition of cond_resched() in ksgxd() and
sgx_alloc_epc_page(), move the invocation of post-reclaim cond_resched()
inside sgx_reclaim_pages(). Except in the case of sgx_reclaim_direct(),
sgx_reclaim_pages() is always called in a loop and is always followed
by a call to cond_resched().  This will hold true for the EPC cgroup
as well, which adds even more calls to sgx_reclaim_pages() and thus
cond_resched(). Calls to sgx_reclaim_direct() may be performance
sensitive. Allow sgx_reclaim_direct() to avoid the cond_resched()
call by moving the original sgx_reclaim_pages() call to
__sgx_reclaim_pages() and then have sgx_reclaim_pages() become a
wrapper around that call with a cond_resched().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
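
For readers following along outside the kernel tree, the split described in the commit message can be modeled in plain userspace C. This is only a sketch under stated assumptions: cond_resched() here is a counting stub rather than the kernel's scheduling hook, the names reclaim_body(), reclaim_pages() and reclaim_direct() are hypothetical stand-ins for __sgx_reclaim_pages(), sgx_reclaim_pages() and sgx_reclaim_direct(), and the should_reclaim parameter replaces the real watermark check.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters standing in for real work; purely illustrative. */
static int nr_reclaims;	/* how often the reclaim body ran */
static int nr_resched;	/* how often the stub cond_resched() ran */

/* Stub: the kernel's cond_resched() may yield the CPU; this only counts. */
static void cond_resched(void)
{
	nr_resched++;
}

/* Stand-in for __sgx_reclaim_pages(): reclaim work, no yield. */
static void reclaim_body(void)
{
	nr_reclaims++;
}

/* Stand-in for the new sgx_reclaim_pages() wrapper: reclaim, then yield. */
static void reclaim_pages(void)
{
	reclaim_body();
	cond_resched();
}

/* Stand-in for sgx_reclaim_direct(): calls the body, skipping the yield. */
static void reclaim_direct(bool should_reclaim)
{
	if (should_reclaim)
		reclaim_body();
}

/* Usage: a looping caller yields once per pass, the direct path never does. */
static void example_usage(void)
{
	nr_reclaims = 0;
	nr_resched = 0;

	for (int i = 0; i < 3; i++)
		reclaim_pages();
	reclaim_direct(true);

	assert(nr_reclaims == 4);
	assert(nr_resched == 3);
}
```

In this model, callers that loop get the yield without repeating it at each call site, while the direct path pays only for the reclaim itself.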
  

Comments

Dave Hansen Dec. 2, 2022, 9:33 p.m. UTC | #1
On 12/2/22 10:36, Kristen Carlson Accardi wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> In order to avoid repetition of cond_resched() in ksgxd() and
> sgx_alloc_epc_page(), move the invocation of post-reclaim cond_resched()
> inside sgx_reclaim_pages(). Except in the case of sgx_reclaim_direct(),
> sgx_reclaim_pages() is always called in a loop and is always followed
> by a call to cond_resched().  This will hold true for the EPC cgroup
> as well, which adds even more calls to sgx_reclaim_pages() and thus
> cond_resched(). Calls to sgx_reclaim_direct() may be performance
> sensitive. Allow sgx_reclaim_direct() to avoid the cond_resched()
> call by moving the original sgx_reclaim_pages() call to
> __sgx_reclaim_pages() and then have sgx_reclaim_pages() become a
> wrapper around that call with a cond_resched().
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Cc: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 160c8dbee0ab..ffce6fc70a1f 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -287,7 +287,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>   * problematic as it would increase the lock contention too much, which would
>   * halt forward progress.
>   */
> -static void sgx_reclaim_pages(void)
> +static void __sgx_reclaim_pages(void)
>  {
>  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>  	struct sgx_backing backing[SGX_NR_TO_SCAN];
> @@ -369,6 +369,12 @@ static void sgx_reclaim_pages(void)
>  	}
>  }
>  
> +static void sgx_reclaim_pages(void)
> +{
> +	__sgx_reclaim_pages();
> +	cond_resched();
> +}

Why bother with the wrapper?  Can't we just put cond_resched() in the
existing sgx_reclaim_pages()?
  
Kristen Carlson Accardi Dec. 2, 2022, 9:37 p.m. UTC | #2
On Fri, 2022-12-02 at 13:33 -0800, Dave Hansen wrote:
> On 12/2/22 10:36, Kristen Carlson Accardi wrote:
> > [...]
> > +static void sgx_reclaim_pages(void)
> > +{
> > +       __sgx_reclaim_pages();
> > +       cond_resched();
> > +}
> 
> Why bother with the wrapper?  Can't we just put cond_resched() in the
> existing sgx_reclaim_pages()?

Because sgx_reclaim_direct() needs to call sgx_reclaim_pages() but not
do the cond_resched(). It was this or add a boolean or something to let
callers opt out of the resched.
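
The boolean alternative mentioned above might look roughly like the following userspace sketch. It is an illustration only, not code from the series: the resched parameter is hypothetical, cond_resched() is a counting stub, and the reclaim work is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

static int nr_reclaims;	/* how often reclaim work ran */
static int nr_resched;	/* how often the stub cond_resched() ran */

/* Stub: the kernel's cond_resched() may yield the CPU; this only counts. */
static void cond_resched(void)
{
	nr_resched++;
}

/* Hypothetical single entry point: the caller decides whether to yield. */
static void reclaim_pages(bool resched)
{
	nr_reclaims++;	/* stands in for the reclaim work */
	if (resched)
		cond_resched();
}

/* Usage: looping callers pass true, the direct path passes false. */
static void example_usage(void)
{
	nr_reclaims = 0;
	nr_resched = 0;

	reclaim_pages(true);	/* e.g. a caller like ksgxd() */
	reclaim_pages(false);	/* e.g. a caller like sgx_reclaim_direct() */

	assert(nr_reclaims == 2);
	assert(nr_resched == 1);
}
```

The trade-off in this sketch is that a bare true or false at the call site says little about what it controls, whereas the wrapper keeps the call sites free of flags.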
  
Dave Hansen Dec. 2, 2022, 9:45 p.m. UTC | #3
On 12/2/22 13:37, Kristen Carlson Accardi wrote:
>>> +static void sgx_reclaim_pages(void)
>>> +{
>>> +       __sgx_reclaim_pages();
>>> +       cond_resched();
>>> +}
>> Why bother with the wrapper?  Can't we just put cond_resched() in the
>> existing sgx_reclaim_pages()?
> Because sgx_reclaim_direct() needs to call sgx_reclaim_pages() but not
> do the cond_resched(). It was this or add a boolean or something to let
> callers opt out of the resched.

Is there a reason sgx_reclaim_direct() *can't* or shouldn't call
cond_resched()?
  
Kristen Carlson Accardi Dec. 2, 2022, 10:17 p.m. UTC | #4
On Fri, 2022-12-02 at 13:45 -0800, Dave Hansen wrote:
> On 12/2/22 13:37, Kristen Carlson Accardi wrote:
> > > > +static void sgx_reclaim_pages(void)
> > > > +{
> > > > +       __sgx_reclaim_pages();
> > > > +       cond_resched();
> > > > +}
> > > Why bother with the wrapper?  Can't we just put cond_resched() in
> > > the
> > > existing sgx_reclaim_pages()?
> > Because sgx_reclaim_direct() needs to call sgx_reclaim_pages() but
> > not
> > do the cond_resched(). It was this or add a boolean or something to
> > let
> > > callers opt out of the resched.
> 
> Is there a reason sgx_reclaim_direct() *can't* or shouldn't call
> cond_resched()?

Yes, it is due to performance concerns. It is explained most succinctly
by Reinette here:

https://lore.kernel.org/linux-sgx/a4eb5ab0-bf83-17a4-8bc0-a90aaf438a8e@intel.com/
  
Dave Hansen Dec. 2, 2022, 10:37 p.m. UTC | #5
On 12/2/22 14:17, Kristen Carlson Accardi wrote:
> On Fri, 2022-12-02 at 13:45 -0800, Dave Hansen wrote:
>> On 12/2/22 13:37, Kristen Carlson Accardi wrote:
>>>>> +static void sgx_reclaim_pages(void)
>>>>> +{
>>>>> +       __sgx_reclaim_pages();
>>>>> +       cond_resched();
>>>>> +}
>>>> Why bother with the wrapper?  Can't we just put cond_resched() in
>>>> the
>>>> existing sgx_reclaim_pages()?
>>> Because sgx_reclaim_direct() needs to call sgx_reclaim_pages()
>>> but not do the cond_resched(). It was this or add a boolean or
>>> something to let callers opt out of the resched.
>>
>> Is there a reason sgx_reclaim_direct() *can't* or shouldn't call
>> cond_resched()?
> 
> Yes, it is due to performance concerns. It is explained most succinctly
> by Reinette here:
> 
> https://lore.kernel.org/linux-sgx/a4eb5ab0-bf83-17a4-8bc0-a90aaf438a8e@intel.com/

I think I'd much rather have 3 cond_resched()s in the code that
effectively self-document than one __something() in there that's a bit
of a mystery.

Everyone knows what cond_resched() means.
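
The shape being argued for here, with cond_resched() left visible at each looping call site and no wrapper, can be sketched in userspace C as follows. This is a model under assumptions, not kernel code: cond_resched() is a counting stub, and reclaimer_loop(), alloc_retry_loop() and reclaim_direct() are hypothetical stand-ins loosely modeled on ksgxd(), sgx_alloc_epc_page() and sgx_reclaim_direct().

```c
#include <assert.h>
#include <stdbool.h>

static int nr_reclaims;	/* how often reclaim work ran */
static int nr_resched;	/* how often the stub cond_resched() ran */

/* Stub: the kernel's cond_resched() may yield the CPU; this only counts. */
static void cond_resched(void)
{
	nr_resched++;
}

/* Stand-in for an unchanged sgx_reclaim_pages(): reclaim work only. */
static void reclaim_pages(void)
{
	nr_reclaims++;
}

/* Modeled on the ksgxd() loop: reclaim only when needed, yield every pass. */
static void reclaimer_loop(const bool *should_reclaim, int passes)
{
	for (int i = 0; i < passes; i++) {
		if (should_reclaim[i])
			reclaim_pages();
		cond_resched();
	}
}

/* Modeled on the sgx_alloc_epc_page() retry loop: reclaim, then yield. */
static void alloc_retry_loop(int retries)
{
	for (int i = 0; i < retries; i++) {
		reclaim_pages();
		cond_resched();
	}
}

/* Modeled on sgx_reclaim_direct(): no yield, and no special helper needed. */
static void reclaim_direct(bool should_reclaim)
{
	if (should_reclaim)
		reclaim_pages();
}

/* Usage: each yield is visible at the call site that performs it. */
static void example_usage(void)
{
	const bool need[3] = { true, false, true };

	nr_reclaims = 0;
	nr_resched = 0;

	reclaimer_loop(need, 3);	/* 2 reclaims, 3 yields */
	alloc_retry_loop(2);		/* 2 reclaims, 2 yields */
	reclaim_direct(true);		/* 1 reclaim, 0 yields */

	assert(nr_reclaims == 5);
	assert(nr_resched == 5);
}
```

The model keeps one detail visible in the diff: before the patch, the ksgxd() loop calls cond_resched() on every pass, whether or not a reclaim happened, whereas the wrapper ties the yield to the reclaim.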
  

Patch

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 160c8dbee0ab..ffce6fc70a1f 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -287,7 +287,7 @@  static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
  * problematic as it would increase the lock contention too much, which would
  * halt forward progress.
  */
-static void sgx_reclaim_pages(void)
+static void __sgx_reclaim_pages(void)
 {
 	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
 	struct sgx_backing backing[SGX_NR_TO_SCAN];
@@ -369,6 +369,12 @@  static void sgx_reclaim_pages(void)
 	}
 }
 
+static void sgx_reclaim_pages(void)
+{
+	__sgx_reclaim_pages();
+	cond_resched();
+}
+
 static bool sgx_should_reclaim(unsigned long watermark)
 {
 	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
@@ -378,12 +384,14 @@  static bool sgx_should_reclaim(unsigned long watermark)
 /*
  * sgx_reclaim_direct() should be called (without enclave's mutex held)
  * in locations where SGX memory resources might be low and might be
- * needed in order to make forward progress.
+ * needed in order to make forward progress. This call to
+ * __sgx_reclaim_pages() avoids the cond_resched() in sgx_reclaim_pages()
+ * to improve performance.
  */
 void sgx_reclaim_direct(void)
 {
 	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
-		sgx_reclaim_pages();
+		__sgx_reclaim_pages();
 }
 
 static int ksgxd(void *p)
@@ -410,8 +418,6 @@  static int ksgxd(void *p)
 
 		if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
 			sgx_reclaim_pages();
-
-		cond_resched();
 	}
 
 	return 0;
@@ -582,7 +588,6 @@  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 		}
 
 		sgx_reclaim_pages();
-		cond_resched();
 	}
 
 	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))