[v3,17/28] x86/sgx: fix a NULL pointer dereference

Message ID 20230712230202.47929-18-haitao.huang@linux.intel.com
State New
Series Add Cgroup support for SGX EPC memory

Commit Message

Haitao Huang July 12, 2023, 11:01 p.m. UTC
Under heavy load, the SGX EPC reclaimers (ksgxd or the future EPC cgroup
worker) may reclaim the SECS EPC page of an enclave and set
encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG
in the #PF handler, where it is used without checking for NULL and
reloading.

Fix this by checking whether the SECS is loaded before EAUG and reloading
it if it was reclaimed.

Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 30 +++++++++++++++++++++++-------
 arch/x86/kernel/cpu/sgx/main.c |  4 ++++
 2 files changed, 27 insertions(+), 7 deletions(-)
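
The check-and-reload pattern described in the commit message can be
sketched in user space as below. This is only an illustrative model, not
the kernel code: struct enclave, struct epc_page, backing_store,
reclaim_secs(), reload_secs(), load_secs() and augment_page() are all
hypothetical stand-ins, and the reload is reduced to handing back a
static object instead of a real ELDU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct epc_page {
	int id;
};

struct enclave {
	struct epc_page *secs_page;	/* NULL once the page was "reclaimed" */
	int reload_count;		/* number of reloads performed */
};

/* Pretend backing storage that a reclaimed SECS page is restored from. */
static struct epc_page backing_store = { .id = 1 };

/* Models a reclaimer evicting the SECS page under memory pressure. */
static void reclaim_secs(struct enclave *encl)
{
	encl->secs_page = NULL;
}

/* Models the reload step (ELDU in the patch); always succeeds here. */
static struct epc_page *reload_secs(struct enclave *encl)
{
	encl->reload_count++;
	return &backing_store;
}

/* Return the resident SECS page, reloading it first if it was reclaimed. */
static struct epc_page *load_secs(struct enclave *encl)
{
	if (!encl->secs_page)
		encl->secs_page = reload_secs(encl);
	return encl->secs_page;
}

/*
 * Models the fault-time step that needs the SECS page. Going through
 * load_secs() means the pointer is never used while it is NULL.
 */
static int augment_page(struct enclave *encl)
{
	struct epc_page *secs = load_secs(encl);

	if (!secs)
		return -1;
	return secs->id;
}
```

In this model, calling augment_page() after reclaim_secs() triggers
exactly one reload and then uses the restored page. Without the
load_secs() call, the same path would dereference a NULL secs_page.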
  

Comments

Jarkko Sakkinen July 17, 2023, 12:48 p.m. UTC | #1
On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
> Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
> worker) may reclaim SECS EPC page for an enclave and set
> encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG
> in #PF handler and is used without checking for NULL and reloading.
>
> Fix this by checking if SECS is loaded before EAUG and load it if it was
> reclaimed.
>
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>

A bug fix should be 1/*.

BR, Jarkko
  
Jarkko Sakkinen July 17, 2023, 12:49 p.m. UTC | #2
On Mon Jul 17, 2023 at 12:48 PM UTC, Jarkko Sakkinen wrote:
> On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
> > Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
> > worker) may reclaim SECS EPC page for an enclave and set
> > encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG
> > in #PF handler and is used without checking for NULL and reloading.
> >
> > Fix this by checking if SECS is loaded before EAUG and load it if it was
> > reclaimed.
> >
> > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>
> A bug fix should be 1/*.

And a fixes tag.

Or is there a bug that is momentized by the earlier patches? This patch
feels confusing to say the least.

BR, Jarkko
  
Haitao Huang July 17, 2023, 1:14 p.m. UTC | #3
On Mon, 17 Jul 2023 07:49:27 -0500, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Mon Jul 17, 2023 at 12:48 PM UTC, Jarkko Sakkinen wrote:
>> On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
>> > Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
>> > worker) may reclaim SECS EPC page for an enclave and set
>> > encl->secs.epc_page to NULL. But the SECS EPC page is required for  
>> EAUG
>> > in #PF handler and is used without checking for NULL and reloading.
>> >
>> > Fix this by checking if SECS is loaded before EAUG and load it if it  
>> was
>> > reclaimed.
>> >
>> > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>>
>> A bug fix should be 1/*.
>
> And a fixes tag.
>
> Or is there a bug that is momentized by the earlier patches? This patch
> feels confusing to say the least.
>

It happens in heavy reclaiming cases; it is just extremely rare when EPC
accounting is not partitioned into cgroups. I will add a Fixes tag
referencing the related EDMM patch and move this to be the first patch.

Thanks
Haitao
  
Jarkko Sakkinen July 17, 2023, 2:33 p.m. UTC | #4
On Mon Jul 17, 2023 at 1:14 PM UTC, Haitao Huang wrote:
> On Mon, 17 Jul 2023 07:49:27 -0500, Jarkko Sakkinen <jarkko@kernel.org>  
> wrote:
>
> > On Mon Jul 17, 2023 at 12:48 PM UTC, Jarkko Sakkinen wrote:
> >> On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
> >> > Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
> >> > worker) may reclaim SECS EPC page for an enclave and set
> >> > encl->secs.epc_page to NULL. But the SECS EPC page is required for  
> >> EAUG
> >> > in #PF handler and is used without checking for NULL and reloading.
> >> >
> >> > Fix this by checking if SECS is loaded before EAUG and load it if it  
> >> was
> >> > reclaimed.
> >> >
> >> > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> >>
> >> A bug fix should be 1/*.
> >
> > And a fixes tag.
> >
> > Or is there a bug that is momentized by the earlier patches? This patch
> > feels confusing to say the least.
> >
>
> It happens in heavy reclaiming cases, just extremely rare when EPC  
> accounting is not partitioned into cgroups. Will add fix tag with the  
> related EDMM patch. And move this as the first patch.

I understand, it is just a good practice to follow, i.e. have prelude
and then the "real" changes :-)

BR, Jarkko
  
Dave Hansen July 17, 2023, 3:49 p.m. UTC | #5
On 7/17/23 05:48, Jarkko Sakkinen wrote:
> On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
>> Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
>> worker) may reclaim SECS EPC page for an enclave and set
>> encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG
>> in #PF handler and is used without checking for NULL and reloading.
>>
>> Fix this by checking if SECS is loaded before EAUG and load it if it was
>> reclaimed.
>>
>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> A bug fix should be 1/*.

No, bug fixes should not even be _part_ of another series.  Send bug
fixes separately, please.
  
Haitao Huang July 17, 2023, 6:49 p.m. UTC | #6
On Mon, 17 Jul 2023 10:49:03 -0500, Dave Hansen <dave.hansen@intel.com>  
wrote:

> On 7/17/23 05:48, Jarkko Sakkinen wrote:
>> On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
>>> Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
>>> worker) may reclaim SECS EPC page for an enclave and set
>>> encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG
>>> in #PF handler and is used without checking for NULL and reloading.
>>>
>>> Fix this by checking if SECS is loaded before EAUG and load it if it  
>>> was
>>> reclaimed.
>>>
>>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>> A bug fix should be 1/*.
>
> No, bug fixes should not even be _part_ of another series.  Send bug
> fixes separately, please.


I sent the two bug fixes separately now. Do you want me to resend this
series without those?
Thanks
Haitao
  
Jarkko Sakkinen July 17, 2023, 6:52 p.m. UTC | #7
On Mon Jul 17, 2023 at 3:49 PM UTC, Dave Hansen wrote:
> On 7/17/23 05:48, Jarkko Sakkinen wrote:
> > On Wed Jul 12, 2023 at 11:01 PM UTC, Haitao Huang wrote:
> >> Under heavy load, the SGX EPC reclaimers (ksgxd or future EPC cgroup
> >> worker) may reclaim SECS EPC page for an enclave and set
> >> encl->secs.epc_page to NULL. But the SECS EPC page is required for EAUG
> >> in #PF handler and is used without checking for NULL and reloading.
> >>
> >> Fix this by checking if SECS is loaded before EAUG and load it if it was
> >> reclaimed.
> >>
> >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> > A bug fix should be 1/*.
>
> No, bug fixes should not even be _part_ of another series.  Send bug
> fixes separately, please.

Yes, that would be of course a better option.

BR, Jarkko
  

Patch

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c321c848baa9..028d1b9d6572 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -235,6 +235,19 @@  static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 	return epc_page;
 }
 
+static struct sgx_epc_page *sgx_encl_load_secs(struct sgx_encl *encl)
+{
+	struct sgx_epc_page *epc_page = encl->secs.epc_page;
+
+	if (!epc_page) {
+		epc_page = sgx_encl_eldu(&encl->secs, NULL);
+		if (!IS_ERR(epc_page))
+			sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE |
+					    SGX_EPC_PAGE_UNRECLAIMABLE);
+	}
+	return epc_page;
+}
+
 static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
 						  struct sgx_encl_page *entry)
 {
@@ -248,13 +261,9 @@  static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
 		return entry;
 	}
 
-	if (!(encl->secs.epc_page)) {
-		epc_page = sgx_encl_eldu(&encl->secs, NULL);
-		if (IS_ERR(epc_page))
-			return ERR_CAST(epc_page);
-		sgx_record_epc_page(epc_page, SGX_EPC_OWNER_ENCL_PAGE |
-				    SGX_EPC_PAGE_UNRECLAIMABLE);
-	}
+	epc_page = sgx_encl_load_secs(encl);
+	if (IS_ERR(epc_page))
+		return ERR_CAST(epc_page);
 
 	epc_page = sgx_encl_eldu(entry, encl->secs.epc_page);
 	if (IS_ERR(epc_page))
@@ -342,6 +351,13 @@  static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 
 	mutex_lock(&encl->lock);
 
+	epc_page = sgx_encl_load_secs(encl);
+	if (IS_ERR(epc_page)) {
+		if (PTR_ERR(epc_page) == -EBUSY)
+			vmret = VM_FAULT_NOPAGE;
+		goto err_out_unlock;
+	}
+
 	epc_page = sgx_alloc_epc_page(encl_page, false);
 	if (IS_ERR(epc_page)) {
 		if (PTR_ERR(epc_page) == -EBUSY)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 9ea487469e4c..68c89d575abc 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -265,6 +265,10 @@  static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
 
 	mutex_lock(&encl->lock);
 
+	/* Should not be possible */
+	if (WARN_ON(!(encl->secs.epc_page)))
+		goto out;
+
 	sgx_encl_ewb(epc_page, backing);
 	encl_page->epc_page = NULL;
 	encl->secs_child_cnt--;
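
The EAUG hunk above appears to treat -EBUSY from the SECS load as a
transient condition: it returns VM_FAULT_NOPAGE so the faulting access
can be retried, and otherwise leaves vmret at its initial value. A
minimal user-space sketch of that mapping follows. The enum
fault_result and map_load_error() are hypothetical names, and the
sketch assumes the initial value is a hard-failure code (the hunk does
not show how vmret is initialized).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical result codes standing in for vm_fault_t values. */
enum fault_result {
	FAULT_OK,	/* load succeeded, the handler may continue */
	FAULT_NOPAGE,	/* transient failure, let the access fault again */
	FAULT_SIGBUS	/* assumed default: hard failure */
};

/*
 * Map the error code of a SECS load attempt to a fault result.
 * Only -EBUSY is treated as retryable; any other error is fatal.
 */
static enum fault_result map_load_error(int err)
{
	if (err == 0)
		return FAULT_OK;
	if (err == -EBUSY)
		return FAULT_NOPAGE;
	return FAULT_SIGBUS;
}
```

For example, map_load_error(-EBUSY) yields FAULT_NOPAGE, while
map_load_error(-ENOMEM) yields FAULT_SIGBUS.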