diff mbox series

dmaengine: idxd: Ensure safe user copy of completion record

Message ID	20240209191412.1050270-1-fenghua.yu@intel.com
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel+bounces-59862-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; From: Fenghua Yu <fenghua.yu@intel.com> To: "Vinod Koul" <vkoul@kernel.org>, "Dave Jiang" <dave.jiang@intel.com> Cc: dmaengine@vger.kernel.org, "linux-kernel" <linux-kernel@vger.kernel.org>, Fenghua Yu <fenghua.yu@intel.com>, Tony Zhu <tony.zhu@intel.com> Subject: [PATCH] dmaengine: idxd: Ensure safe user copy of completion record Date: Fri, 9 Feb 2024 11:14:12 -0800 Message-Id: <20240209191412.1050270-1-fenghua.yu@intel.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX
Series	dmaengine: idxd: Ensure safe user copy of completion record \| dmaengine: idxd: Ensure safe user copy of completion record

Commit Message

Fenghua Yu Feb. 9, 2024, 7:14 p.m. UTC

  If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
event log cache to user triggers a kernel bug.

[ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
[ 1987.170845] ------------[ cut here ]------------
[ 1987.176086] kernel BUG at mm/usercopy.c:102!
[ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
[ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
[ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
[ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
[ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
[ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
[ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
[ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
[ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
[ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
[ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
[ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
[ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
[ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1987.324527] PKRU: 55555554
[ 1987.327622] Call Trace:
[ 1987.330424]  <TASK>
[ 1987.332826]  ? show_regs+0x6e/0x80
[ 1987.336703]  ? die+0x3c/0xa0
[ 1987.339988]  ? do_trap+0xd4/0xf0
[ 1987.343662]  ? do_error_trap+0x75/0xa0
[ 1987.347922]  ? usercopy_abort+0x72/0x90
[ 1987.352277]  ? exc_invalid_op+0x57/0x80
[ 1987.356634]  ? usercopy_abort+0x72/0x90
[ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
[ 1987.365734]  ? usercopy_abort+0x72/0x90
[ 1987.370088]  __check_heap_object+0xb7/0xd0
[ 1987.374739]  __check_object_size+0x175/0x2d0
[ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
[ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
[ 1987.389878]  process_one_work+0x13e/0x300
[ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
[ 1987.399284]  worker_thread+0x2f7/0x420
[ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
[ 1987.414019]  kthread+0x107/0x140
[ 1987.417693]  ? __pfx_kthread+0x10/0x10
[ 1987.421954]  ret_from_fork+0x3d/0x60
[ 1987.426019]  ? __pfx_kthread+0x10/0x10
[ 1987.430281]  ret_from_fork_asm+0x1b/0x30
[ 1987.434744]  </TASK>

The issue arises because event log cache is created using
kmem_cache_create() which is not suitable for user copy.

Fix the issue by creating event log cache with
kmem_cache_create_usercopy(), ensuring safe user copy.

Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items")
Reported-by: Tony Zhu <tony.zhu@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 drivers/dma/idxd/init.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

Comments

Dave Jiang Feb. 9, 2024, 8:15 p.m. UTC | #1

On 2/9/24 12:14 PM, Fenghua Yu wrote:
> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
> event log cache to user triggers a kernel bug.
> 
> [ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
> [ 1987.170845] ------------[ cut here ]------------
> [ 1987.176086] kernel BUG at mm/usercopy.c:102!
> [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
> [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
> [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
> [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
> [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
> [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
> [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
> [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
> [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
> [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
> [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
> [ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
> [ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
> [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> [ 1987.324527] PKRU: 55555554
> [ 1987.327622] Call Trace:
> [ 1987.330424]  <TASK>
> [ 1987.332826]  ? show_regs+0x6e/0x80
> [ 1987.336703]  ? die+0x3c/0xa0
> [ 1987.339988]  ? do_trap+0xd4/0xf0
> [ 1987.343662]  ? do_error_trap+0x75/0xa0
> [ 1987.347922]  ? usercopy_abort+0x72/0x90
> [ 1987.352277]  ? exc_invalid_op+0x57/0x80
> [ 1987.356634]  ? usercopy_abort+0x72/0x90
> [ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
> [ 1987.365734]  ? usercopy_abort+0x72/0x90
> [ 1987.370088]  __check_heap_object+0xb7/0xd0
> [ 1987.374739]  __check_object_size+0x175/0x2d0
> [ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
> [ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
> [ 1987.389878]  process_one_work+0x13e/0x300
> [ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
> [ 1987.399284]  worker_thread+0x2f7/0x420
> [ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
> [ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
> [ 1987.414019]  kthread+0x107/0x140
> [ 1987.417693]  ? __pfx_kthread+0x10/0x10
> [ 1987.421954]  ret_from_fork+0x3d/0x60
> [ 1987.426019]  ? __pfx_kthread+0x10/0x10
> [ 1987.430281]  ret_from_fork_asm+0x1b/0x30
> [ 1987.434744]  </TASK>
> 
> The issue arises because event log cache is created using
> kmem_cache_create() which is not suitable for user copy.
> 
> Fix the issue by creating event log cache with
> kmem_cache_create_usercopy(), ensuring safe user copy.
> 
> Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items")
> Reported-by: Tony Zhu <tony.zhu@intel.com>
> Tested-by: Tony Zhu <tony.zhu@intel.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/dma/idxd/init.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 14df1f1347a8..4954adc6bb60 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -343,7 +343,9 @@ static void idxd_cleanup_internals(struct idxd_device *idxd)
>  static int idxd_init_evl(struct idxd_device *idxd)
>  {
>  	struct device *dev = &idxd->pdev->dev;
> +	unsigned int evl_cache_size;
>  	struct idxd_evl *evl;
> +	const char *idxd_name;
>  
>  	if (idxd->hw.gen_cap.evl_support == 0)
>  		return 0;
> @@ -355,9 +357,16 @@ static int idxd_init_evl(struct idxd_device *idxd)
>  	spin_lock_init(&evl->lock);
>  	evl->size = IDXD_EVL_SIZE_MIN;
>  
> -	idxd->evl_cache = kmem_cache_create(dev_name(idxd_confdev(idxd)),
> -					    sizeof(struct idxd_evl_fault) + evl_ent_size(idxd),
> -					    0, 0, NULL);
> +	idxd_name = dev_name(idxd_confdev(idxd));
> +	evl_cache_size = sizeof(struct idxd_evl_fault) + evl_ent_size(idxd);
> +	/*
> +	 * Since completion record in evl_cache will be copied to user
> +	 * when handling completion record page fault, need to create
> +	 * the cache suitable for user copy.
> +	 */
> +	idxd->evl_cache = kmem_cache_create_usercopy(idxd_name, evl_cache_size,
> +						     0, 0, 0, evl_cache_size,
> +						     NULL);
>  	if (!idxd->evl_cache) {
>  		kfree(evl);
>  		return -ENOMEM;

Lijun Pan Feb. 9, 2024, 9:53 p.m. UTC | #2

On 2/10/2024 3:14 AM, Fenghua Yu wrote:
> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
> event log cache to user triggers a kernel bug.
> 
> [ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
> [ 1987.170845] ------------[ cut here ]------------
> [ 1987.176086] kernel BUG at mm/usercopy.c:102!
> [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
> [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
> [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
> [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
> [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
> [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
> [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
> [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
> [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
> [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
> [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
> [ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
> [ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
> [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> [ 1987.324527] PKRU: 55555554
> [ 1987.327622] Call Trace:
> [ 1987.330424]  <TASK>
> [ 1987.332826]  ? show_regs+0x6e/0x80
> [ 1987.336703]  ? die+0x3c/0xa0
> [ 1987.339988]  ? do_trap+0xd4/0xf0
> [ 1987.343662]  ? do_error_trap+0x75/0xa0
> [ 1987.347922]  ? usercopy_abort+0x72/0x90
> [ 1987.352277]  ? exc_invalid_op+0x57/0x80
> [ 1987.356634]  ? usercopy_abort+0x72/0x90
> [ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
> [ 1987.365734]  ? usercopy_abort+0x72/0x90
> [ 1987.370088]  __check_heap_object+0xb7/0xd0
> [ 1987.374739]  __check_object_size+0x175/0x2d0
> [ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
> [ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
> [ 1987.389878]  process_one_work+0x13e/0x300
> [ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
> [ 1987.399284]  worker_thread+0x2f7/0x420
> [ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
> [ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
> [ 1987.414019]  kthread+0x107/0x140
> [ 1987.417693]  ? __pfx_kthread+0x10/0x10
> [ 1987.421954]  ret_from_fork+0x3d/0x60
> [ 1987.426019]  ? __pfx_kthread+0x10/0x10
> [ 1987.430281]  ret_from_fork_asm+0x1b/0x30
> [ 1987.434744]  </TASK>
> 
> The issue arises because event log cache is created using
> kmem_cache_create() which is not suitable for user copy.
> 
> Fix the issue by creating event log cache with
> kmem_cache_create_usercopy(), ensuring safe user copy.
s/, ensuring/ to ensure



> 
> Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items")
> Reported-by: Tony Zhu <tony.zhu@intel.com>
> Tested-by: Tony Zhu <tony.zhu@intel.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>

Reviewed-by: Lijun Pan <lijun.pan@intel.com>

> ---
>   drivers/dma/idxd/init.c | 15 ++++++++++++---
>   1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 14df1f1347a8..4954adc6bb60 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -343,7 +343,9 @@ static void idxd_cleanup_internals(struct idxd_device *idxd)
>   static int idxd_init_evl(struct idxd_device *idxd)
>   {
>   	struct device *dev = &idxd->pdev->dev;
> +	unsigned int evl_cache_size;
>   	struct idxd_evl *evl;
> +	const char *idxd_name;
>   
>   	if (idxd->hw.gen_cap.evl_support == 0)
>   		return 0;
> @@ -355,9 +357,16 @@ static int idxd_init_evl(struct idxd_device *idxd)
>   	spin_lock_init(&evl->lock);
>   	evl->size = IDXD_EVL_SIZE_MIN;
>   
> -	idxd->evl_cache = kmem_cache_create(dev_name(idxd_confdev(idxd)),
> -					    sizeof(struct idxd_evl_fault) + evl_ent_size(idxd),
> -					    0, 0, NULL);
> +	idxd_name = dev_name(idxd_confdev(idxd));
> +	evl_cache_size = sizeof(struct idxd_evl_fault) + evl_ent_size(idxd);
> +	/*
> +	 * Since completion record in evl_cache will be copied to user
> +	 * when handling completion record page fault, need to create
> +	 * the cache suitable for user copy.
> +	 */

Maybe briefly compare kmem_cache_create() with 
kmem_cache_create_usercopy() and add up to the above comments. If you 
think it too verbose, then forget about it.

> +	idxd->evl_cache = kmem_cache_create_usercopy(idxd_name, evl_cache_size,
> +						     0, 0, 0, evl_cache_size,
> +						     NULL);
>   	if (!idxd->evl_cache) {
>   		kfree(evl);
>   		return -ENOMEM;

Fenghua Yu Feb. 10, 2024, 12:20 a.m. UTC | #3

Hi, Lijun,

On 2/9/24 13:53, Lijun Pan wrote:
> 
> 
> On 2/10/2024 3:14 AM, Fenghua Yu wrote:
>> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
>> event log cache to user triggers a kernel bug.

..

>> Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log 
>> fault items")
>> Reported-by: Tony Zhu <tony.zhu@intel.com>
>> Tested-by: Tony Zhu <tony.zhu@intel.com>
>> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> 
> Reviewed-by: Lijun Pan <lijun.pan@intel.com>
> 
>> ---
>>   drivers/dma/idxd/init.c | 15 ++++++++++++---
>>   1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>> index 14df1f1347a8..4954adc6bb60 100644
>> --- a/drivers/dma/idxd/init.c
>> +++ b/drivers/dma/idxd/init.c
>> @@ -343,7 +343,9 @@ static void idxd_cleanup_internals(struct 
>> idxd_device *idxd)
>>   static int idxd_init_evl(struct idxd_device *idxd)
>>   {
>>       struct device *dev = &idxd->pdev->dev;
>> +    unsigned int evl_cache_size;
>>       struct idxd_evl *evl;
>> +    const char *idxd_name;
>>       if (idxd->hw.gen_cap.evl_support == 0)
>>           return 0;
>> @@ -355,9 +357,16 @@ static int idxd_init_evl(struct idxd_device *idxd)
>>       spin_lock_init(&evl->lock);
>>       evl->size = IDXD_EVL_SIZE_MIN;
>> -    idxd->evl_cache = kmem_cache_create(dev_name(idxd_confdev(idxd)),
>> -                        sizeof(struct idxd_evl_fault) + 
>> evl_ent_size(idxd),
>> -                        0, 0, NULL);
>> +    idxd_name = dev_name(idxd_confdev(idxd));
>> +    evl_cache_size = sizeof(struct idxd_evl_fault) + evl_ent_size(idxd);
>> +    /*
>> +     * Since completion record in evl_cache will be copied to user
>> +     * when handling completion record page fault, need to create
>> +     * the cache suitable for user copy.
>> +     */
> 
> Maybe briefly compare kmem_cache_create() with 
> kmem_cache_create_usercopy() and add up to the above comments. If you 
> think it too verbose, then forget about it.

It's improper to add comment to compare the two functions here because:
1. When people look into this code in init.c, they have no idea why 
compare the functions here when only kmem_cache_create_usercopy() is 
used. The comparison is only meaningful in this patch's context and has 
been explained in the patch commit message.

2. Comparison or any details of the function can be found easily in the 
function implementation. No need to add more details on top of the 
current comment which covers enough info (i.e. why call this function) 
already.

Given the above reasons, I will keep the current comment and patch 
without change.

Thanks.

-Fenghua

Fenghua Yu Feb. 19, 2024, 5:20 p.m. UTC | #4

Hi, Vinod,

On 2/9/24 16:20, Fenghua Yu wrote:
> Hi, Lijun,
> 
> On 2/9/24 13:53, Lijun Pan wrote:
>>
>>
>> On 2/10/2024 3:14 AM, Fenghua Yu wrote:
>>> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
>>> event log cache to user triggers a kernel bug.
> 
> ...
> 
>>> Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event 
>>> log fault items")
>>> Reported-by: Tony Zhu <tony.zhu@intel.com>
>>> Tested-by: Tony Zhu <tony.zhu@intel.com>
>>> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
>>
>> Reviewed-by: Lijun Pan <lijun.pan@intel.com>
>>
>>> ---
>>>   drivers/dma/idxd/init.c | 15 ++++++++++++---
>>>   1 file changed, 12 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>> index 14df1f1347a8..4954adc6bb60 100644
>>> --- a/drivers/dma/idxd/init.c
>>> +++ b/drivers/dma/idxd/init.c
>>> @@ -343,7 +343,9 @@ static void idxd_cleanup_internals(struct 
>>> idxd_device *idxd)
>>>   static int idxd_init_evl(struct idxd_device *idxd)
>>>   {
>>>       struct device *dev = &idxd->pdev->dev;
>>> +    unsigned int evl_cache_size;
>>>       struct idxd_evl *evl;
>>> +    const char *idxd_name;
>>>       if (idxd->hw.gen_cap.evl_support == 0)
>>>           return 0;
>>> @@ -355,9 +357,16 @@ static int idxd_init_evl(struct idxd_device *idxd)
>>>       spin_lock_init(&evl->lock);
>>>       evl->size = IDXD_EVL_SIZE_MIN;
>>> -    idxd->evl_cache = kmem_cache_create(dev_name(idxd_confdev(idxd)),
>>> -                        sizeof(struct idxd_evl_fault) + 
>>> evl_ent_size(idxd),
>>> -                        0, 0, NULL);
>>> +    idxd_name = dev_name(idxd_confdev(idxd));
>>> +    evl_cache_size = sizeof(struct idxd_evl_fault) + 
>>> evl_ent_size(idxd);
>>> +    /*
>>> +     * Since completion record in evl_cache will be copied to user
>>> +     * when handling completion record page fault, need to create
>>> +     * the cache suitable for user copy.
>>> +     */
>>
>> Maybe briefly compare kmem_cache_create() with 
>> kmem_cache_create_usercopy() and add up to the above comments. If you 
>> think it too verbose, then forget about it.
> 
> It's improper to add comment to compare the two functions here because:
> 1. When people look into this code in init.c, they have no idea why 
> compare the functions here when only kmem_cache_create_usercopy() is 
> used. The comparison is only meaningful in this patch's context and has 
> been explained in the patch commit message.
> 
> 2. Comparison or any details of the function can be found easily in the 
> function implementation. No need to add more details on top of the 
> current comment which covers enough info (i.e. why call this function) 
> already.
> 
> Given the above reasons, I will keep the current comment and patch 
> without change.

Since Dave gave Reviewed-by already and the comment from Lijun is 
invalid and won't be addressed, could you please apply this patch to 
dmaengine tree for upstream inclusion?

Thank you very much!

-Fenghua

Lijun Pan Feb. 20, 2024, 1:42 a.m. UTC | #5

On 2/10/2024 8:20 AM, Fenghua Yu wrote:
> Hi, Lijun,
> 
> On 2/9/24 13:53, Lijun Pan wrote:
>>
>>
>> On 2/10/2024 3:14 AM, Fenghua Yu wrote:
>>> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
>>> event log cache to user triggers a kernel bug.
> 
> ...
> 
>>> Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event 
>>> log fault items")
>>> Reported-by: Tony Zhu <tony.zhu@intel.com>
>>> Tested-by: Tony Zhu <tony.zhu@intel.com>
>>> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
>>
>> Reviewed-by: Lijun Pan <lijun.pan@intel.com>
>>
>>> ---
>>>   drivers/dma/idxd/init.c | 15 ++++++++++++---
>>>   1 file changed, 12 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>> index 14df1f1347a8..4954adc6bb60 100644
>>> --- a/drivers/dma/idxd/init.c
>>> +++ b/drivers/dma/idxd/init.c
>>> @@ -343,7 +343,9 @@ static void idxd_cleanup_internals(struct 
>>> idxd_device *idxd)
>>>   static int idxd_init_evl(struct idxd_device *idxd)
>>>   {
>>>       struct device *dev = &idxd->pdev->dev;
>>> +    unsigned int evl_cache_size;
>>>       struct idxd_evl *evl;
>>> +    const char *idxd_name;
>>>       if (idxd->hw.gen_cap.evl_support == 0)
>>>           return 0;
>>> @@ -355,9 +357,16 @@ static int idxd_init_evl(struct idxd_device *idxd)
>>>       spin_lock_init(&evl->lock);
>>>       evl->size = IDXD_EVL_SIZE_MIN;
>>> -    idxd->evl_cache = kmem_cache_create(dev_name(idxd_confdev(idxd)),
>>> -                        sizeof(struct idxd_evl_fault) + 
>>> evl_ent_size(idxd),
>>> -                        0, 0, NULL);
>>> +    idxd_name = dev_name(idxd_confdev(idxd));
>>> +    evl_cache_size = sizeof(struct idxd_evl_fault) + 
>>> evl_ent_size(idxd);
>>> +    /*
>>> +     * Since completion record in evl_cache will be copied to user
>>> +     * when handling completion record page fault, need to create
>>> +     * the cache suitable for user copy.
>>> +     */
>>
>> Maybe briefly compare kmem_cache_create() with 
>> kmem_cache_create_usercopy() and add up to the above comments. If you 
>> think it too verbose, then forget about it.
> 
> It's improper to add comment to compare the two functions here because:
> 1. When people look into this code in init.c, they have no idea why 
> compare the functions here when only kmem_cache_create_usercopy() is 
> used. The comparison is only meaningful in this patch's context and has 
> been explained in the patch commit message.
> 
> 2. Comparison or any details of the function can be found easily in the 
> function implementation. No need to add more details on top of the 
> current comment which covers enough info (i.e. why call this function) 
> already.
> 
> Given the above reasons, I will keep the current comment and patch 
> without change.
> 


That's fine.

Lijun

Lijun Pan Feb. 20, 2024, 1:43 a.m. UTC | #6

On 2/10/2024 5:53 AM, Lijun Pan wrote:
> 
> 
> On 2/10/2024 3:14 AM, Fenghua Yu wrote:
>> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
>> event log cache to user triggers a kernel bug.
>>
>> [ 1987.159822] usercopy: Kernel memory exposure attempt detected from 
>> SLUB object 'dsa0' (offset 74, size 31)!
>> [ 1987.170845] ------------[ cut here ]------------
>> [ 1987.176086] kernel BUG at mm/usercopy.c:102!
>> [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 
>> 6.8.0-rc2+ #5
>> [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, 
>> BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
>> [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
>> [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
>> [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 
>> fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 
>> 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 
>> 0f 1f
>> [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
>> [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 
>> 0000000000000000
>> [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 
>> 00000000ffffffff
>> [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 
>> 00000000fffeffff
>> [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: 
>> ff2a66934849c89a
>> [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: 
>> ff2a66934849c899
>> [ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) 
>> knlGS:0000000000000000
>> [ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 
>> 0000000000f71ef0
>> [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 
>> 0000000000000400
>> [ 1987.324527] PKRU: 55555554
>> [ 1987.327622] Call Trace:
>> [ 1987.330424]  <TASK>
>> [ 1987.332826]  ? show_regs+0x6e/0x80
>> [ 1987.336703]  ? die+0x3c/0xa0
>> [ 1987.339988]  ? do_trap+0xd4/0xf0
>> [ 1987.343662]  ? do_error_trap+0x75/0xa0
>> [ 1987.347922]  ? usercopy_abort+0x72/0x90
>> [ 1987.352277]  ? exc_invalid_op+0x57/0x80
>> [ 1987.356634]  ? usercopy_abort+0x72/0x90
>> [ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
>> [ 1987.365734]  ? usercopy_abort+0x72/0x90
>> [ 1987.370088]  __check_heap_object+0xb7/0xd0
>> [ 1987.374739]  __check_object_size+0x175/0x2d0
>> [ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
>> [ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
>> [ 1987.389878]  process_one_work+0x13e/0x300
>> [ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
>> [ 1987.399284]  worker_thread+0x2f7/0x420
>> [ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
>> [ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
>> [ 1987.414019]  kthread+0x107/0x140
>> [ 1987.417693]  ? __pfx_kthread+0x10/0x10
>> [ 1987.421954]  ret_from_fork+0x3d/0x60
>> [ 1987.426019]  ? __pfx_kthread+0x10/0x10
>> [ 1987.430281]  ret_from_fork_asm+0x1b/0x30
>> [ 1987.434744]  </TASK>
>>
>> The issue arises because event log cache is created using
>> kmem_cache_create() which is not suitable for user copy.
>>
>> Fix the issue by creating event log cache with
>> kmem_cache_create_usercopy(), ensuring safe user copy.
> s/, ensuring/ to ensure

It is not a big deal if you really want keep original wording.

Lijun

> 
> 
> 
>>
>> Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log 
>> fault items")
>> Reported-by: Tony Zhu <tony.zhu@intel.com>
>> Tested-by: Tony Zhu <tony.zhu@intel.com>
>> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> 
> Reviewed-by: Lijun Pan <lijun.pan@intel.com>
>

Vinod Koul Feb. 23, 2024, 12:08 p.m. UTC | #7

On Fri, 09 Feb 2024 11:14:12 -0800, Fenghua Yu wrote:
> If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
> event log cache to user triggers a kernel bug.
> 
> [ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
> [ 1987.170845] ------------[ cut here ]------------
> [ 1987.176086] kernel BUG at mm/usercopy.c:102!
> [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
> [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
> [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
> [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
> [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
> [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
> [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
> [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
> [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
> [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
> [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
> [ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
> [ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
> [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> [ 1987.324527] PKRU: 55555554
> [ 1987.327622] Call Trace:
> [ 1987.330424]  <TASK>
> [ 1987.332826]  ? show_regs+0x6e/0x80
> [ 1987.336703]  ? die+0x3c/0xa0
> [ 1987.339988]  ? do_trap+0xd4/0xf0
> [ 1987.343662]  ? do_error_trap+0x75/0xa0
> [ 1987.347922]  ? usercopy_abort+0x72/0x90
> [ 1987.352277]  ? exc_invalid_op+0x57/0x80
> [ 1987.356634]  ? usercopy_abort+0x72/0x90
> [ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
> [ 1987.365734]  ? usercopy_abort+0x72/0x90
> [ 1987.370088]  __check_heap_object+0xb7/0xd0
> [ 1987.374739]  __check_object_size+0x175/0x2d0
> [ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
> [ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
> [ 1987.389878]  process_one_work+0x13e/0x300
> [ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
> [ 1987.399284]  worker_thread+0x2f7/0x420
> [ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
> [ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
> [ 1987.414019]  kthread+0x107/0x140
> [ 1987.417693]  ? __pfx_kthread+0x10/0x10
> [ 1987.421954]  ret_from_fork+0x3d/0x60
> [ 1987.426019]  ? __pfx_kthread+0x10/0x10
> [ 1987.430281]  ret_from_fork_asm+0x1b/0x30
> [ 1987.434744]  </TASK>
> 
> [...]

Applied, thanks!

[1/1] dmaengine: idxd: Ensure safe user copy of completion record
      commit: d3ea125df37dc37972d581b74a5d3785c3f283ab

Best regards,

diff mbox series

Patch

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 14df1f1347a8..4954adc6bb60 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -343,7 +343,9 @@  static void idxd_cleanup_internals(struct idxd_device *idxd)
 static int idxd_init_evl(struct idxd_device *idxd)
 {
 	struct device *dev = &idxd->pdev->dev;
+	unsigned int evl_cache_size;
 	struct idxd_evl *evl;
+	const char *idxd_name;
 
 	if (idxd->hw.gen_cap.evl_support == 0)
 		return 0;
@@ -355,9 +357,16 @@  static int idxd_init_evl(struct idxd_device *idxd)
 	spin_lock_init(&evl->lock);
 	evl->size = IDXD_EVL_SIZE_MIN;
 
-	idxd->evl_cache = kmem_cache_create(dev_name(idxd_confdev(idxd)),
-					    sizeof(struct idxd_evl_fault) + evl_ent_size(idxd),
-					    0, 0, NULL);
+	idxd_name = dev_name(idxd_confdev(idxd));
+	evl_cache_size = sizeof(struct idxd_evl_fault) + evl_ent_size(idxd);
+	/*
+	 * Since completion record in evl_cache will be copied to user
+	 * when handling completion record page fault, need to create
+	 * the cache suitable for user copy.
+	 */
+	idxd->evl_cache = kmem_cache_create_usercopy(idxd_name, evl_cache_size,
+						     0, 0, 0, evl_cache_size,
+						     NULL);
 	if (!idxd->evl_cache) {
 		kfree(evl);
 		return -ENOMEM;