[v8,1/4] fs/proc/kcore: avoid bounce buffer for ktext data

Message ID fd39b0bfa7edc76d360def7d034baaee71d90158.1679566220.git.lstoakes@gmail.com
State New
Headers
Series [v8,1/4] fs/proc/kcore: avoid bounce buffer for ktext data |

Commit Message

Lorenzo Stoakes March 23, 2023, 10:15 a.m. UTC
  Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
introduced the use of a bounce buffer to retrieve kernel text data for
/proc/kcore in order to avoid failures arising from hardened user copies
enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().

We can avoid doing this if instead of copy_to_user() we use _copy_to_user()
which bypasses the hardening check. This is more efficient than using a
bounce buffer and simplifies the code.

We do so as part an overall effort to eliminate bounce buffer usage in the
function with an eye to converting it an iterator read.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 fs/proc/kcore.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)
  

Comments

Jiri Olsa May 31, 2023, 11:58 a.m. UTC | #1
On Thu, Mar 23, 2023 at 10:15:16AM +0000, Lorenzo Stoakes wrote:
> Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
> introduced the use of a bounce buffer to retrieve kernel text data for
> /proc/kcore in order to avoid failures arising from hardened user copies
> enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().
> 
> We can avoid doing this if instead of copy_to_user() we use _copy_to_user()
> which bypasses the hardening check. This is more efficient than using a
> bounce buffer and simplifies the code.
> 
> We do so as part an overall effort to eliminate bounce buffer usage in the
> function with an eye to converting it an iterator read.
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>

hi,
sorry for late feedback, but looks like this one breaks reading
/proc/kcore with objdump for me:

  # cat /proc/kallsyms | grep ksys_read
  ffffffff8150ebc0 T ksys_read
  # objdump -d  --start-address=0xffffffff8150ebc0 --stop-address=0xffffffff8150ebd0 /proc/kcore 

  /proc/kcore:     file format elf64-x86-64

  objdump: Reading section load1 failed because: Bad address

reverting this makes it work again

thanks,
jirka

> ---
>  fs/proc/kcore.c | 17 +++++------------
>  1 file changed, 5 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 71157ee35c1a..556f310d6aa4 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -541,19 +541,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
>  		case KCORE_VMEMMAP:
>  		case KCORE_TEXT:
>  			/*
> -			 * Using bounce buffer to bypass the
> -			 * hardened user copy kernel text checks.
> +			 * We use _copy_to_user() to bypass usermode hardening
> +			 * which would otherwise prevent this operation.
>  			 */
> -			if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
> -				if (clear_user(buffer, tsz)) {
> -					ret = -EFAULT;
> -					goto out;
> -				}
> -			} else {
> -				if (copy_to_user(buffer, buf, tsz)) {
> -					ret = -EFAULT;
> -					goto out;
> -				}
> +			if (_copy_to_user(buffer, (char *)start, tsz)) {
> +				ret = -EFAULT;
> +				goto out;
>  			}
>  			break;
>  		default:
> -- 
> 2.39.2
>
  
Baoquan He July 21, 2023, 1:48 p.m. UTC | #2
Hi Jiri,

On 05/31/23 at 01:58pm, Jiri Olsa wrote:
> On Thu, Mar 23, 2023 at 10:15:16AM +0000, Lorenzo Stoakes wrote:
> > Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
> > introduced the use of a bounce buffer to retrieve kernel text data for
> > /proc/kcore in order to avoid failures arising from hardened user copies
> > enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().
> > 
> > We can avoid doing this if instead of copy_to_user() we use _copy_to_user()
> > which bypasses the hardening check. This is more efficient than using a
> > bounce buffer and simplifies the code.
> > 
> > We do so as part an overall effort to eliminate bounce buffer usage in the
> > function with an eye to converting it an iterator read.
> > 
> > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> hi,
> sorry for late feedback, but looks like this one breaks reading
> /proc/kcore with objdump for me:
> 
>   # cat /proc/kallsyms | grep ksys_read
>   ffffffff8150ebc0 T ksys_read
>   # objdump -d  --start-address=0xffffffff8150ebc0 --stop-address=0xffffffff8150ebd0 /proc/kcore 
> 
>   /proc/kcore:     file format elf64-x86-64
> 
>   objdump: Reading section load1 failed because: Bad address
> 
> reverting this makes it work again

I met this too when I executed below command to trigger a kcore reading.
I wanted to do a simple testing during system running and got this.

  makedumpfile --mem-usage /proc/kcore

Later I tried your above objdump testing, it corrupted system too.

Is there any conclusion about this issue you reported? I could miss
things in the discussion or patch posting to fix this.

Thanks
Baoquan

> 
> 
> > ---
> >  fs/proc/kcore.c | 17 +++++------------
> >  1 file changed, 5 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> > index 71157ee35c1a..556f310d6aa4 100644
> > --- a/fs/proc/kcore.c
> > +++ b/fs/proc/kcore.c
> > @@ -541,19 +541,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> >  		case KCORE_VMEMMAP:
> >  		case KCORE_TEXT:
> >  			/*
> > -			 * Using bounce buffer to bypass the
> > -			 * hardened user copy kernel text checks.
> > +			 * We use _copy_to_user() to bypass usermode hardening
> > +			 * which would otherwise prevent this operation.
> >  			 */
> > -			if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
> > -				if (clear_user(buffer, tsz)) {
> > -					ret = -EFAULT;
> > -					goto out;
> > -				}
> > -			} else {
> > -				if (copy_to_user(buffer, buf, tsz)) {
> > -					ret = -EFAULT;
> > -					goto out;
> > -				}
> > +			if (_copy_to_user(buffer, (char *)start, tsz)) {
> > +				ret = -EFAULT;
> > +				goto out;
> >  			}
> >  			break;
> >  		default:
> > -- 
> > 2.39.2
> > 
>
  
Jiri Olsa July 21, 2023, 2:13 p.m. UTC | #3
On Fri, Jul 21, 2023 at 09:48:37PM +0800, Baoquan He wrote:
> Hi Jiri,
> 
> On 05/31/23 at 01:58pm, Jiri Olsa wrote:
> > On Thu, Mar 23, 2023 at 10:15:16AM +0000, Lorenzo Stoakes wrote:
> > > Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
> > > introduced the use of a bounce buffer to retrieve kernel text data for
> > > /proc/kcore in order to avoid failures arising from hardened user copies
> > > enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().
> > > 
> > > We can avoid doing this if instead of copy_to_user() we use _copy_to_user()
> > > which bypasses the hardening check. This is more efficient than using a
> > > bounce buffer and simplifies the code.
> > > 
> > > We do so as part an overall effort to eliminate bounce buffer usage in the
> > > function with an eye to converting it an iterator read.
> > > 
> > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > > Reviewed-by: David Hildenbrand <david@redhat.com>
> > 
> > hi,
> > sorry for late feedback, but looks like this one breaks reading
> > /proc/kcore with objdump for me:
> > 
> >   # cat /proc/kallsyms | grep ksys_read
> >   ffffffff8150ebc0 T ksys_read
> >   # objdump -d  --start-address=0xffffffff8150ebc0 --stop-address=0xffffffff8150ebd0 /proc/kcore 
> > 
> >   /proc/kcore:     file format elf64-x86-64
> > 
> >   objdump: Reading section load1 failed because: Bad address
> > 
> > reverting this makes it work again
> 
> I met this too when I executed below command to trigger a kcore reading.
> I wanted to do a simple testing during system running and got this.
> 
>   makedumpfile --mem-usage /proc/kcore
> 
> Later I tried your above objdump testing, it corrupted system too.
> 
> Is there any conclusion about this issue you reported? I could miss
> things in the discussion or patch posting to fix this.

hi,
thanks for your reply, I meant to ping on this again

AFAIK there was no answer yet.. I managed to cleanly revert the patch when
I needed the functionality, then got sidetracked and forgot about this

I just re-tested and it's still failing for me, would be great to get it fixed

Lorenzo, any idea?

thanks,
jirka


> 
> Thanks
> Baoquan
> 
> > 
> > 
> > > ---
> > >  fs/proc/kcore.c | 17 +++++------------
> > >  1 file changed, 5 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> > > index 71157ee35c1a..556f310d6aa4 100644
> > > --- a/fs/proc/kcore.c
> > > +++ b/fs/proc/kcore.c
> > > @@ -541,19 +541,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> > >  		case KCORE_VMEMMAP:
> > >  		case KCORE_TEXT:
> > >  			/*
> > > -			 * Using bounce buffer to bypass the
> > > -			 * hardened user copy kernel text checks.
> > > +			 * We use _copy_to_user() to bypass usermode hardening
> > > +			 * which would otherwise prevent this operation.
> > >  			 */
> > > -			if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
> > > -				if (clear_user(buffer, tsz)) {
> > > -					ret = -EFAULT;
> > > -					goto out;
> > > -				}
> > > -			} else {
> > > -				if (copy_to_user(buffer, buf, tsz)) {
> > > -					ret = -EFAULT;
> > > -					goto out;
> > > -				}
> > > +			if (_copy_to_user(buffer, (char *)start, tsz)) {
> > > +				ret = -EFAULT;
> > > +				goto out;
> > >  			}
> > >  			break;
> > >  		default:
> > > -- 
> > > 2.39.2
> > > 
> > 
>
  
David Hildenbrand July 24, 2023, 6:23 a.m. UTC | #4
Hi,

> 
> I met this too when I executed below command to trigger a kcore reading.
> I wanted to do a simple testing during system running and got this.
> 
>    makedumpfile --mem-usage /proc/kcore
> 
> Later I tried your above objdump testing, it corrupted system too.
> 

What do you mean with "corrupted system too" --  did it not only fail to 
dump the system, but also actually harmed the system?

@Lorenzo do you plan on reproduce + fix, or should we consider reverting 
that change?
  
Baoquan He July 24, 2023, 8:08 a.m. UTC | #5
On 07/24/23 at 08:23am, David Hildenbrand wrote:
> Hi,
> 
> > 
> > I met this too when I executed below command to trigger a kcore reading.
> > I wanted to do a simple testing during system running and got this.
> > 
> >    makedumpfile --mem-usage /proc/kcore
> > 
> > Later I tried your above objdump testing, it corrupted system too.
> > 
> 
> What do you mean with "corrupted system too" --  did it not only fail to
> dump the system, but also actually harmed the system?

From my testing, reading kcore will cause system panic, then reboot. Not
sure if Jiri saw the same phenomenon.

> 
> @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> that change?

When tested on a arm64 system, the reproducution is stable. I will have
a look too to see if I have some finding this week.
  
Jiri Olsa July 24, 2023, 8:18 a.m. UTC | #6
On Mon, Jul 24, 2023 at 04:08:41PM +0800, Baoquan He wrote:
> On 07/24/23 at 08:23am, David Hildenbrand wrote:
> > Hi,
> > 
> > > 
> > > I met this too when I executed below command to trigger a kcore reading.
> > > I wanted to do a simple testing during system running and got this.
> > > 
> > >    makedumpfile --mem-usage /proc/kcore
> > > 
> > > Later I tried your above objdump testing, it corrupted system too.
> > > 
> > 
> > What do you mean with "corrupted system too" --  did it not only fail to
> > dump the system, but also actually harmed the system?
> 
> From my testing, reading kcore will cause system panic, then reboot. Not
> sure if Jiri saw the same phenomenon.

it did not crash for me, just the read error
could you get console output from that?

jirka

> 
> > 
> > @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> > that change?
> 
> When tested on a arm64 system, the reproducution is stable. I will have
> a look too to see if I have some finding this week.
>
  
Linux regression tracking (Thorsten Leemhuis) July 24, 2023, 9:38 a.m. UTC | #7
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 31.05.23 13:58, Jiri Olsa wrote:
> On Thu, Mar 23, 2023 at 10:15:16AM +0000, Lorenzo Stoakes wrote:
>> Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
>> introduced the use of a bounce buffer to retrieve kernel text data for
>> /proc/kcore in order to avoid failures arising from hardened user copies
>> enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().
>> [...]
>> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
>> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> sorry for late feedback, but looks like this one breaks reading
> /proc/kcore with objdump for me:
> 
>   # cat /proc/kallsyms | grep ksys_read
>   ffffffff8150ebc0 T ksys_read
>   # objdump -d  --start-address=0xffffffff8150ebc0 --stop-address=0xffffffff8150ebd0 /proc/kcore 
> 
>   /proc/kcore:     file format elf64-x86-64
> 
>   objdump: Reading section load1 failed because: Bad address
> 
> reverting this makes it work again

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 2e1c017077
#regzbot title mm / fs/proc/kcore: reading /proc/kcore with objdump broke
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
  
Baoquan He July 24, 2023, 2:33 p.m. UTC | #8
On 07/24/23 at 10:18am, Jiri Olsa wrote:
> On Mon, Jul 24, 2023 at 04:08:41PM +0800, Baoquan He wrote:
> > On 07/24/23 at 08:23am, David Hildenbrand wrote:
> > > Hi,
> > > 
> > > > 
> > > > I met this too when I executed below command to trigger a kcore reading.
> > > > I wanted to do a simple testing during system running and got this.
> > > > 
> > > >    makedumpfile --mem-usage /proc/kcore
> > > > 
> > > > Later I tried your above objdump testing, it corrupted system too.
> > > > 
> > > 
> > > What do you mean with "corrupted system too" --  did it not only fail to
> > > dump the system, but also actually harmed the system?
> > 
> > From my testing, reading kcore will cause system panic, then reboot. Not
> > sure if Jiri saw the same phenomenon.
> 
> it did not crash for me, just the read error
> could you get console output from that?

I got a new arm64 machine, then executing "makedumpfile --mem-usage
/proc/kcore" won't trigger panic, your objdump command can trigger
panic. The call trace is pasted at below. It's the same as the panic and
call trace I met on my last arm64 machine.

[13270.314323] Mem abort info:
[13270.317162]   ESR = 0x0000000096000007
[13270.320901]   EC = 0x25: DABT (current EL), IL = 32 bits
[13270.326217]   SET = 0, FnV = 0
[13270.329261]   EA = 0, S1PTW = 0
[13270.332390]   FSC = 0x07: level 3 translation fault
[13270.337270] Data abort info:
[13270.340139]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[13270.345626]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[13270.350666]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[13270.355981] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000400651d64000
[13270.362672] [ffffdc9cf3ea0000] pgd=1000401ffffff003, p4d=1000401ffffff003, pud=1000401fffffe003, pmd=1000401fffffd003, pte=0000000000000000
[13270.375367] Internal error: Oops: 0000000096000007 [#4] SMP
[13270.380934] Modules linked in: mlx5_ib ib_uverbs ib_core rfkill vfat fat joydev cdc_ether usbnet mii mlx5_core acpi_ipmi mlxfw ipmi_ssif psample tls ipmi_devintf pci_hyperv_intf arm_spe_pmu ipmi_msghandler arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram xfs crct10dif_ce polyval_ce polyval_generic ghash_ce uas sbsa_gwdt nvme nvme_core ast usb_storage nvme_common i2c_algo_bit xgene_hwmon
[13270.416751] CPU: 15 PID: 8803 Comm: objdump Tainted: G      D            6.5.0-rc3 #1
[13270.424570] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 2.10.20220531 (SCP: 2.10.20220531) 2022/05/31
[13270.437337] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[13270.444289] pc : __arch_copy_to_user+0x180/0x240
[13270.448910] lr : _copy_to_iter+0x11c/0x5d0
[13270.453002] sp : ffff8000b15a37c0
[13270.456306] x29: ffff8000b15a37c0 x28: ffffdc9cf3ea0000 x27: ffffdc9cf6938158
[13270.463431] x26: ffff8000b15a3ba8 x25: 0000000000000690 x24: ffff8000b15a3b80
[13270.470556] x23: 00000000000038ac x22: ffffdc9cf3ea0000 x21: ffff8000b15a3b80
[13270.477682] x20: ffffdc9cf64fdf00 x19: 0000000000000400 x18: 0000000000000000
[13270.484806] x17: 0000000000000000 x16: 0000000000000000 x15: ffffdc9cf3ea0000
[13270.491931] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[13270.499056] x11: 0001000000000000 x10: ffffdc9cf64fdf00 x9 : 0000000000000690
[13270.506182] x8 : 000000007c000000 x7 : 0000fd007e000000 x6 : 000000000eee0b60
[13270.513306] x5 : 000000000eee0f60 x4 : 0000000000000000 x3 : 0000000000000400
[13270.520431] x2 : 0000000000000380 x1 : ffffdc9cf3ea0000 x0 : 000000000eee0b60
[13270.527556] Call trace:
[13270.529992]  __arch_copy_to_user+0x180/0x240
[13270.534250]  read_kcore_iter+0x718/0x878
[13270.538167]  proc_reg_read_iter+0x8c/0xe8
[13270.542168]  vfs_read+0x214/0x2c0
[13270.545478]  ksys_read+0x78/0x118
[13270.548782]  __arm64_sys_read+0x24/0x38
[13270.552608]  invoke_syscall+0x78/0x108
[13270.556351]  el0_svc_common.constprop.0+0x4c/0xf8
[13270.561044]  do_el0_svc+0x34/0x50
[13270.564347]  el0_svc+0x34/0x108
[13270.567482]  el0t_64_sync_handler+0x100/0x130
[13270.571829]  el0t_64_sync+0x194/0x198
[13270.575483] Code: d503201f d503201f d503201f d503201f (a8c12027) 
[13270.581567] ---[ end trace 0000000000000000 ]---
  
Lorenzo Stoakes July 31, 2023, 7:21 p.m. UTC | #9
On Mon, Jul 24, 2023 at 08:23:55AM +0200, David Hildenbrand wrote:
> Hi,
>
> >
> > I met this too when I executed below command to trigger a kcore reading.
> > I wanted to do a simple testing during system running and got this.
> >
> >    makedumpfile --mem-usage /proc/kcore
> >
> > Later I tried your above objdump testing, it corrupted system too.
> >
>
> What do you mean with "corrupted system too" --  did it not only fail to
> dump the system, but also actually harmed the system?
>
> @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> that change?
>
> --
> Cheers,
>
> David / dhildenb
>

Apologies I mised this, I have been very busy lately not least with book :)

Concerning, I will take a look as I get a chance. I think the whole series
would have to be reverted which would be... depressing... as other patches
in series eliminates the bounce buffer altogether.
  
David Hildenbrand July 31, 2023, 7:24 p.m. UTC | #10
On 31.07.23 21:21, Lorenzo Stoakes wrote:
> On Mon, Jul 24, 2023 at 08:23:55AM +0200, David Hildenbrand wrote:
>> Hi,
>>
>>>
>>> I met this too when I executed below command to trigger a kcore reading.
>>> I wanted to do a simple testing during system running and got this.
>>>
>>>     makedumpfile --mem-usage /proc/kcore
>>>
>>> Later I tried your above objdump testing, it corrupted system too.
>>>
>>
>> What do you mean with "corrupted system too" --  did it not only fail to
>> dump the system, but also actually harmed the system?
>>
>> @Lorenzo do you plan on reproduce + fix, or should we consider reverting
>> that change?
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
> 
> Apologies I mised this, I have been very busy lately not least with book :)
> 
> Concerning, I will take a look as I get a chance. I think the whole series
> would have to be reverted which would be... depressing... as other patches
> in series eliminates the bounce buffer altogether.
> 

I spotted

https://lkml.kernel.org/r/069dd40aa71e634b414d07039d72467d051fb486.camel@gmx.de

today, maybe that's related.
  
Lorenzo Stoakes July 31, 2023, 7:40 p.m. UTC | #11
On Mon, Jul 31, 2023 at 09:24:50PM +0200, David Hildenbrand wrote:
> On 31.07.23 21:21, Lorenzo Stoakes wrote:
> > On Mon, Jul 24, 2023 at 08:23:55AM +0200, David Hildenbrand wrote:
> > > Hi,
> > >
> > > >
> > > > I met this too when I executed below command to trigger a kcore reading.
> > > > I wanted to do a simple testing during system running and got this.
> > > >
> > > >     makedumpfile --mem-usage /proc/kcore
> > > >
> > > > Later I tried your above objdump testing, it corrupted system too.
> > > >
> > >
> > > What do you mean with "corrupted system too" --  did it not only fail to
> > > dump the system, but also actually harmed the system?
> > >
> > > @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> > > that change?
> > >
> > > --
> > > Cheers,
> > >
> > > David / dhildenb
> > >
> >
> > Apologies I mised this, I have been very busy lately not least with book :)
> >
> > Concerning, I will take a look as I get a chance. I think the whole series
> > would have to be reverted which would be... depressing... as other patches
> > in series eliminates the bounce buffer altogether.
> >
>
> I spotted
>
> https://lkml.kernel.org/r/069dd40aa71e634b414d07039d72467d051fb486.camel@gmx.de
>

Find that slightly confusing, they talk about just reveritng the patch but then
also add a kern_addr_valid()?

I'm also confused about people talking about just reverting the patch, as
4c91c07c93bb drops the bounce buffer altogether... presumably they mean
reverting both?

Clearly this is an arm64 thing (obviously), I have some arm64 hardware let me
see if I can repro...

Baoquan, Jiri - are you reverting more than just the one commit? And does doing
this go from not working -> working? Or from not working (worst case oops) ->
error?

> today, maybe that's related.
>
> --
> Cheers,
>
> David / dhildenb
>
  
Jiri Olsa July 31, 2023, 8:34 p.m. UTC | #12
On Mon, Jul 31, 2023 at 08:40:24PM +0100, Lorenzo Stoakes wrote:
> On Mon, Jul 31, 2023 at 09:24:50PM +0200, David Hildenbrand wrote:
> > On 31.07.23 21:21, Lorenzo Stoakes wrote:
> > > On Mon, Jul 24, 2023 at 08:23:55AM +0200, David Hildenbrand wrote:
> > > > Hi,
> > > >
> > > > >
> > > > > I met this too when I executed below command to trigger a kcore reading.
> > > > > I wanted to do a simple testing during system running and got this.
> > > > >
> > > > >     makedumpfile --mem-usage /proc/kcore
> > > > >
> > > > > Later I tried your above objdump testing, it corrupted system too.
> > > > >
> > > >
> > > > What do you mean with "corrupted system too" --  did it not only fail to
> > > > dump the system, but also actually harmed the system?
> > > >
> > > > @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> > > > that change?
> > > >
> > > > --
> > > > Cheers,
> > > >
> > > > David / dhildenb
> > > >
> > >
> > > Apologies I mised this, I have been very busy lately not least with book :)
> > >
> > > Concerning, I will take a look as I get a chance. I think the whole series
> > > would have to be reverted which would be... depressing... as other patches
> > > in series eliminates the bounce buffer altogether.
> > >
> >
> > I spotted
> >
> > https://lkml.kernel.org/r/069dd40aa71e634b414d07039d72467d051fb486.camel@gmx.de
> >
> 
> Find that slightly confusing, they talk about just reveritng the patch but then
> also add a kern_addr_valid()?
> 
> I'm also confused about people talking about just reverting the patch, as
> 4c91c07c93bb drops the bounce buffer altogether... presumably they mean
> reverting both?
> 
> Clearly this is an arm64 thing (obviously), I have some arm64 hardware let me
> see if I can repro...

I see the issue on x86

> 
> Baoquan, Jiri - are you reverting more than just the one commit? And does doing
> this go from not working -> working? Or from not working (worst case oops) ->
> error?

yes, I used to revert all 4 patches

I did quick check and had to revert 2 more patches to get clean revert

38b138abc355 Revert "fs/proc/kcore: avoid bounce buffer for ktext data"
e2c3b418d365 Revert "fs/proc/kcore: convert read_kcore() to read_kcore_iter()"
d8bc432cb314 Revert "iov_iter: add copy_page_to_iter_nofault()"
bf2c6799f68c Revert "iov_iter: Kill ITER_PIPE"
ccf4b2c5c5ce Revert "mm: vmalloc: convert vread() to vread_iter()"
de400d383a7e Revert "mm/vmalloc: replace the ternary conditional operator with min()"

jirka
  
Lorenzo Stoakes July 31, 2023, 9:12 p.m. UTC | #13
On Mon, Jul 31, 2023 at 10:34:21PM +0200, Jiri Olsa wrote:
> On Mon, Jul 31, 2023 at 08:40:24PM +0100, Lorenzo Stoakes wrote:
> > On Mon, Jul 31, 2023 at 09:24:50PM +0200, David Hildenbrand wrote:
> > > On 31.07.23 21:21, Lorenzo Stoakes wrote:
> > > > On Mon, Jul 24, 2023 at 08:23:55AM +0200, David Hildenbrand wrote:
> > > > > Hi,
> > > > >
> > > > > >
> > > > > > I met this too when I executed below command to trigger a kcore reading.
> > > > > > I wanted to do a simple testing during system running and got this.
> > > > > >
> > > > > >     makedumpfile --mem-usage /proc/kcore
> > > > > >
> > > > > > Later I tried your above objdump testing, it corrupted system too.
> > > > > >
> > > > >
> > > > > What do you mean with "corrupted system too" --  did it not only fail to
> > > > > dump the system, but also actually harmed the system?
> > > > >
> > > > > @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> > > > > that change?
> > > > >
> > > > > --
> > > > > Cheers,
> > > > >
> > > > > David / dhildenb
> > > > >
> > > >
> > > > Apologies I mised this, I have been very busy lately not least with book :)
> > > >
> > > > Concerning, I will take a look as I get a chance. I think the whole series
> > > > would have to be reverted which would be... depressing... as other patches
> > > > in series eliminates the bounce buffer altogether.
> > > >
> > >
> > > I spotted
> > >
> > > https://lkml.kernel.org/r/069dd40aa71e634b414d07039d72467d051fb486.camel@gmx.de
> > >
> >
> > Find that slightly confusing, they talk about just reveritng the patch but then
> > also add a kern_addr_valid()?
> >
> > I'm also confused about people talking about just reverting the patch, as
> > 4c91c07c93bb drops the bounce buffer altogether... presumably they mean
> > reverting both?
> >
> > Clearly this is an arm64 thing (obviously), I have some arm64 hardware let me
> > see if I can repro...
>
> I see the issue on x86

Ummmm what? I can't! What repro are you seeing on x86, exactly?

>
> >
> > Baoquan, Jiri - are you reverting more than just the one commit? And does doing
> > this go from not working -> working? Or from not working (worst case oops) ->
> > error?
>
> yes, I used to revert all 4 patches
>
> I did quick check and had to revert 2 more patches to get clean revert
>
> 38b138abc355 Revert "fs/proc/kcore: avoid bounce buffer for ktext data"
> e2c3b418d365 Revert "fs/proc/kcore: convert read_kcore() to read_kcore_iter()"
> d8bc432cb314 Revert "iov_iter: add copy_page_to_iter_nofault()"
> bf2c6799f68c Revert "iov_iter: Kill ITER_PIPE"
> ccf4b2c5c5ce Revert "mm: vmalloc: convert vread() to vread_iter()"
> de400d383a7e Revert "mm/vmalloc: replace the ternary conditional operator with min()"
>
> jirka

That's quite a few more reverts and obviously not an acceptable solution here.

Looking at
https://lore.kernel.org/all/CAA5enKaUYehLZGL3abv4rsS7caoUG-pN9wF3R+qek-DGNZufbA@mail.gmail.com
a parallel thread on this, it looks like the issue is that we are no longer
using a no-fault kernel copy in KCORE_TEXT path and arm64 doesn't map everything
in the text range.

Solution would be to reinstate the bounce buffer in this case (ugh). Longer term
solution I think would be to create some iterator helper that does no fault
copies from the kernel.

I will try to come up with a semi-revert that keeps the iterator stuff but
keeps a hideous bounce buffer for the KCORE_TEXT bit with a comment
explaining why...
  
Jiri Olsa July 31, 2023, 9:50 p.m. UTC | #14
On Mon, Jul 31, 2023 at 10:12:08PM +0100, Lorenzo Stoakes wrote:
> On Mon, Jul 31, 2023 at 10:34:21PM +0200, Jiri Olsa wrote:
> > On Mon, Jul 31, 2023 at 08:40:24PM +0100, Lorenzo Stoakes wrote:
> > > On Mon, Jul 31, 2023 at 09:24:50PM +0200, David Hildenbrand wrote:
> > > > On 31.07.23 21:21, Lorenzo Stoakes wrote:
> > > > > On Mon, Jul 24, 2023 at 08:23:55AM +0200, David Hildenbrand wrote:
> > > > > > Hi,
> > > > > >
> > > > > > >
> > > > > > > I met this too when I executed below command to trigger a kcore reading.
> > > > > > > I wanted to do a simple testing during system running and got this.
> > > > > > >
> > > > > > >     makedumpfile --mem-usage /proc/kcore
> > > > > > >
> > > > > > > Later I tried your above objdump testing, it corrupted system too.
> > > > > > >
> > > > > >
> > > > > > What do you mean with "corrupted system too" --  did it not only fail to
> > > > > > dump the system, but also actually harmed the system?
> > > > > >
> > > > > > @Lorenzo do you plan on reproduce + fix, or should we consider reverting
> > > > > > that change?
> > > > > >
> > > > > > --
> > > > > > Cheers,
> > > > > >
> > > > > > David / dhildenb
> > > > > >
> > > > >
> > > > > Apologies I mised this, I have been very busy lately not least with book :)
> > > > >
> > > > > Concerning, I will take a look as I get a chance. I think the whole series
> > > > > would have to be reverted which would be... depressing... as other patches
> > > > > in series eliminates the bounce buffer altogether.
> > > > >
> > > >
> > > > I spotted
> > > >
> > > > https://lkml.kernel.org/r/069dd40aa71e634b414d07039d72467d051fb486.camel@gmx.de
> > > >
> > >
> > > Find that slightly confusing, they talk about just reveritng the patch but then
> > > also add a kern_addr_valid()?
> > >
> > > I'm also confused about people talking about just reverting the patch, as
> > > 4c91c07c93bb drops the bounce buffer altogether... presumably they mean
> > > reverting both?
> > >
> > > Clearly this is an arm64 thing (obviously), I have some arm64 hardware let me
> > > see if I can repro...
> >
> > I see the issue on x86
> 
> Ummmm what? I can't! What repro are you seeing on x86, exactly?

# cat /proc/kallsyms | grep ksys_read
ffffffff8151e450 T ksys_read

# objdump -d  --start-address=0xffffffff8151e450 --stop-address=0xffffffff8151e460 /proc/kcore 

/proc/kcore:     file format elf64-x86-64

objdump: Reading section load1 failed because: Bad address


jirka

> 
> >
> > >
> > > Baoquan, Jiri - are you reverting more than just the one commit? And does doing
> > > this go from not working -> working? Or from not working (worst case oops) ->
> > > error?
> >
> > yes, I used to revert all 4 patches
> >
> > I did quick check and had to revert 2 more patches to get clean revert
> >
> > 38b138abc355 Revert "fs/proc/kcore: avoid bounce buffer for ktext data"
> > e2c3b418d365 Revert "fs/proc/kcore: convert read_kcore() to read_kcore_iter()"
> > d8bc432cb314 Revert "iov_iter: add copy_page_to_iter_nofault()"
> > bf2c6799f68c Revert "iov_iter: Kill ITER_PIPE"
> > ccf4b2c5c5ce Revert "mm: vmalloc: convert vread() to vread_iter()"
> > de400d383a7e Revert "mm/vmalloc: replace the ternary conditional operator with min()"
> >
> > jirka
> 
> That's quite a few more reverts and obviously not an acceptable solution here.
> 
> Looking at
> https://lore.kernel.org/all/CAA5enKaUYehLZGL3abv4rsS7caoUG-pN9wF3R+qek-DGNZufbA@mail.gmail.com
> a parallel thread on this, it looks like the issue is that we are no longer
> using a no-fault kernel copy in KCORE_TEXT path and arm64 doesn't map everything
> in the text range.
> 
> Solution would be to reinstate the bounce buffer in this case (ugh). Longer term
> solution I think would be to create some iterator helper that does no fault
> copies from the kernel.
> 
> I will try to come up with a semi-revert that keeps the iterator stuff but
> keeps a hideous bounce buffer for the KCORE_TEXT bit with a comment
> explaining why...
  
Lorenzo Stoakes July 31, 2023, 9:58 p.m. UTC | #15
On Mon, Jul 31, 2023 at 11:50:22PM +0200, Jiri Olsa wrote:
> > Ummmm what? I can't! What repro are you seeing on x86, exactly?
>
> # cat /proc/kallsyms | grep ksys_read
> ffffffff8151e450 T ksys_read
>
> # objdump -d  --start-address=0xffffffff8151e450 --stop-address=0xffffffff8151e460 /proc/kcore
>
> /proc/kcore:     file format elf64-x86-64
>
> objdump: Reading section load1 failed because: Bad address
>
>
> jirka

Locally I don't see this issue. How odd. The bug doesn't manifest as a 'bad
address' in the arm64 repros either. I wonder if this is something unrelated...

In any case I have a candidate fix for the bug at
https://lore.kernel.org/all/20230731215021.70911-1-lstoakes@gmail.com/ which
should hopefully address the underlying issue with minimum change.
  

Patch

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 71157ee35c1a..556f310d6aa4 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -541,19 +541,12 @@  read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 		case KCORE_VMEMMAP:
 		case KCORE_TEXT:
 			/*
-			 * Using bounce buffer to bypass the
-			 * hardened user copy kernel text checks.
+			 * We use _copy_to_user() to bypass usermode hardening
+			 * which would otherwise prevent this operation.
 			 */
-			if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
-				if (clear_user(buffer, tsz)) {
-					ret = -EFAULT;
-					goto out;
-				}
-			} else {
-				if (copy_to_user(buffer, buf, tsz)) {
-					ret = -EFAULT;
-					goto out;
-				}
+			if (_copy_to_user(buffer, (char *)start, tsz)) {
+				ret = -EFAULT;
+				goto out;
 			}
 			break;
 		default: