[01/26] x86/sgx: Call cond_resched() at the end of sgx_reclaim_pages()

Message ID 20221111183532.3676646-2-kristen@linux.intel.com
State New
Series Add Cgroup support for SGX EPC memory

Commit Message

Kristen Carlson Accardi Nov. 11, 2022, 6:35 p.m. UTC
  From: Sean Christopherson <sean.j.christopherson@intel.com>

In order to avoid repetition of cond_resched() in ksgxd() and
sgx_alloc_epc_page(), move the invocation of post-reclaim cond_resched()
inside sgx_reclaim_pages(). Except in the case of sgx_reclaim_direct(),
sgx_reclaim_pages() is always called in a loop and is always followed
by a call to cond_resched().  This will hold true for the EPC cgroup
as well, which adds even more calls to sgx_reclaim_pages() and thus
cond_resched(). Calls to sgx_reclaim_direct() may be performance
sensitive. Allow sgx_reclaim_direct() to avoid the cond_resched()
call by moving the original sgx_reclaim_pages() implementation to
__sgx_reclaim_pages() and making sgx_reclaim_pages() a thin
wrapper that calls __sgx_reclaim_pages() followed by cond_resched().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
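
To illustrate the point about the EPC cgroup: once cond_resched() lives
inside sgx_reclaim_pages(), any new reclaim loop can simply call it
without adding its own resched. A minimal sketch of such a caller is
below; sgx_epc_cgroup_reclaim() and sgx_epc_cgroup_over_limit() are
hypothetical names used only for illustration, not taken from this
series:

	/*
	 * Hypothetical EPC cgroup reclaim loop (illustration only).
	 * Because sgx_reclaim_pages() now ends with cond_resched(), a
	 * caller that loops until usage drops below its limit needs no
	 * explicit resched of its own.
	 */
	static void sgx_epc_cgroup_reclaim(struct sgx_epc_cgroup *epc_cg)
	{
		while (sgx_epc_cgroup_over_limit(epc_cg))
			sgx_reclaim_pages();	/* reschedules internally */
	}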
  

Comments

Jarkko Sakkinen Nov. 15, 2022, 11:27 p.m. UTC | #1
On Fri, Nov 11, 2022 at 10:35:06AM -0800, Kristen Carlson Accardi wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> In order to avoid repetition of cond_resched() in ksgxd() and
> sgx_alloc_epc_page(), move the invocation of post-reclaim cond_resched()
> inside sgx_reclaim_pages(). Except in the case of sgx_reclaim_direct(),
> sgx_reclaim_pages() is always called in a loop and is always followed
> by a call to cond_resched().  This will hold true for the EPC cgroup
> as well, which adds even more calls to sgx_reclaim_pages() and thus
> cond_resched(). Calls to sgx_reclaim_direct() may be performance
> sensitive. Allow sgx_reclaim_direct() to avoid the cond_resched()
> call by moving the original sgx_reclaim_pages() call to
> __sgx_reclaim_pages() and then have sgx_reclaim_pages() become a
> wrapper around that call with a cond_resched().
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Cc: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 160c8dbee0ab..ffce6fc70a1f 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -287,7 +287,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>   * problematic as it would increase the lock contention too much, which would
>   * halt forward progress.
>   */
> -static void sgx_reclaim_pages(void)
> +static void __sgx_reclaim_pages(void)
>  {
>  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>  	struct sgx_backing backing[SGX_NR_TO_SCAN];
> @@ -369,6 +369,12 @@ static void sgx_reclaim_pages(void)
>  	}
>  }
>  
> +static void sgx_reclaim_pages(void)
> +{
> +	__sgx_reclaim_pages();
> +	cond_resched();
> +}
> +
>  static bool sgx_should_reclaim(unsigned long watermark)
>  {
>  	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
> @@ -378,12 +384,14 @@ static bool sgx_should_reclaim(unsigned long watermark)
>  /*
>   * sgx_reclaim_direct() should be called (without enclave's mutex held)
>   * in locations where SGX memory resources might be low and might be
> - * needed in order to make forward progress.
> + * needed in order to make forward progress. This call to
> + * __sgx_reclaim_pages() avoids the cond_resched() in sgx_reclaim_pages()
> + * to improve performance.
>   */
>  void sgx_reclaim_direct(void)
>  {
>  	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> -		sgx_reclaim_pages();
> +		__sgx_reclaim_pages();

Is it a big deal to have "extra" cond_resched?

>  }
>  
>  static int ksgxd(void *p)
> @@ -410,8 +418,6 @@ static int ksgxd(void *p)
>  
>  		if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
>  			sgx_reclaim_pages();
> -
> -		cond_resched();
>  	}
>  
>  	return 0;
> @@ -582,7 +588,6 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
>  		}
>  
>  		sgx_reclaim_pages();
> -		cond_resched();
>  	}
>  
>  	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> -- 
> 2.37.3
> 

BR, Jarkko
  
Reinette Chatre Nov. 16, 2022, 1 a.m. UTC | #2
Hi Jarkko,

On 11/15/2022 3:27 PM, Jarkko Sakkinen wrote:
> On Fri, Nov 11, 2022 at 10:35:06AM -0800, Kristen Carlson Accardi wrote:
>> From: Sean Christopherson <sean.j.christopherson@intel.com>
>>
>> In order to avoid repetition of cond_resched() in ksgxd() and
>> sgx_alloc_epc_page(), move the invocation of post-reclaim cond_resched()
>> inside sgx_reclaim_pages(). Except in the case of sgx_reclaim_direct(),
>> sgx_reclaim_pages() is always called in a loop and is always followed
>> by a call to cond_resched().  This will hold true for the EPC cgroup
>> as well, which adds even more calls to sgx_reclaim_pages() and thus
>> cond_resched(). Calls to sgx_reclaim_direct() may be performance
>> sensitive. Allow sgx_reclaim_direct() to avoid the cond_resched()
>> call by moving the original sgx_reclaim_pages() call to
>> __sgx_reclaim_pages() and then have sgx_reclaim_pages() become a
>> wrapper around that call with a cond_resched().
>>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
>> Cc: Sean Christopherson <seanjc@google.com>
>> ---
>>  arch/x86/kernel/cpu/sgx/main.c | 17 +++++++++++------
>>  1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
>> index 160c8dbee0ab..ffce6fc70a1f 100644
>> --- a/arch/x86/kernel/cpu/sgx/main.c
>> +++ b/arch/x86/kernel/cpu/sgx/main.c
>> @@ -287,7 +287,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>>   * problematic as it would increase the lock contention too much, which would
>>   * halt forward progress.
>>   */
>> -static void sgx_reclaim_pages(void)
>> +static void __sgx_reclaim_pages(void)
>>  {
>>  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>>  	struct sgx_backing backing[SGX_NR_TO_SCAN];
>> @@ -369,6 +369,12 @@ static void sgx_reclaim_pages(void)
>>  	}
>>  }
>>  
>> +static void sgx_reclaim_pages(void)
>> +{
>> +	__sgx_reclaim_pages();
>> +	cond_resched();
>> +}
>> +
>>  static bool sgx_should_reclaim(unsigned long watermark)
>>  {
>>  	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
>> @@ -378,12 +384,14 @@ static bool sgx_should_reclaim(unsigned long watermark)
>>  /*
>>   * sgx_reclaim_direct() should be called (without enclave's mutex held)
>>   * in locations where SGX memory resources might be low and might be
>> - * needed in order to make forward progress.
>> + * needed in order to make forward progress. This call to
>> + * __sgx_reclaim_pages() avoids the cond_resched() in sgx_reclaim_pages()
>> + * to improve performance.
>>   */
>>  void sgx_reclaim_direct(void)
>>  {
>>  	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>> -		sgx_reclaim_pages();
>> +		__sgx_reclaim_pages();
> 
> Is it a big deal to have "extra" cond_resched?
> 

sgx_reclaim_direct() is used to ensure that there is enough
SGX memory available to make forward progress within a loop that
may span a large range of pages. sgx_reclaim_direct() frees up
memory right before the caller depends on that memory being
available. I think that giving other tasks an opportunity to run
in between is risky since those tasks may end up consuming the SGX
memory that was just freed and thus increase the likelihood of the
operation failing, with the user getting an EAGAIN error.
Additionally, in a constrained environment where sgx_reclaim_direct()
is needed to reclaim pages, an additional cond_resched() could cause
a user-visible slowdown when operating on a large memory range.

Reinette
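
For context, the call sites Reinette describes are the SGX2 ioctl paths
that walk a user-supplied page range and call sgx_reclaim_direct() once
per page before taking the enclave lock. A simplified sketch of that
pattern (in the style of arch/x86/kernel/cpu/sgx/ioctl.c, not a verbatim
copy) looks roughly like:

	/* Simplified per-page loop in the style of the SGX2 ioctls. */
	for (c = 0; c < modp->length; c += PAGE_SIZE) {
		addr = encl->base + modp->offset + c;

		/*
		 * Make sure EPC is available before taking encl->lock so
		 * the ENCLS operation below is unlikely to fail for lack
		 * of free pages.  With the __sgx_reclaim_pages() split,
		 * this path skips the cond_resched() that would otherwise
		 * run once for every page in the range.
		 */
		sgx_reclaim_direct();

		mutex_lock(&encl->lock);
		/* ... EMODPR/EMODT on the page at addr ... */
		mutex_unlock(&encl->lock);
	}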
  

Patch

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 160c8dbee0ab..ffce6fc70a1f 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -287,7 +287,7 @@  static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
  * problematic as it would increase the lock contention too much, which would
  * halt forward progress.
  */
-static void sgx_reclaim_pages(void)
+static void __sgx_reclaim_pages(void)
 {
 	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
 	struct sgx_backing backing[SGX_NR_TO_SCAN];
@@ -369,6 +369,12 @@  static void sgx_reclaim_pages(void)
 	}
 }
 
+static void sgx_reclaim_pages(void)
+{
+	__sgx_reclaim_pages();
+	cond_resched();
+}
+
 static bool sgx_should_reclaim(unsigned long watermark)
 {
 	return atomic_long_read(&sgx_nr_free_pages) < watermark &&
@@ -378,12 +384,14 @@  static bool sgx_should_reclaim(unsigned long watermark)
 /*
  * sgx_reclaim_direct() should be called (without enclave's mutex held)
  * in locations where SGX memory resources might be low and might be
- * needed in order to make forward progress.
+ * needed in order to make forward progress. This call to
+ * __sgx_reclaim_pages() avoids the cond_resched() in sgx_reclaim_pages()
+ * to improve performance.
  */
 void sgx_reclaim_direct(void)
 {
 	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
-		sgx_reclaim_pages();
+		__sgx_reclaim_pages();
 }
 
 static int ksgxd(void *p)
@@ -410,8 +418,6 @@  static int ksgxd(void *p)
 
 		if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
 			sgx_reclaim_pages();
-
-		cond_resched();
 	}
 
 	return 0;
@@ -582,7 +588,6 @@  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
 		}
 
 		sgx_reclaim_pages();
-		cond_resched();
 	}
 
 	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))