tracing: fix memcpy size when copying stack entries

Message ID 20230612160748.4082850-1-svens@linux.ibm.com
State New
Series tracing: fix memcpy size when copying stack entries

Commit Message

Sven Schnelle June 12, 2023, 4:07 p.m. UTC
  Noticed the following warning during boot:

[    2.316341] Testing tracer wakeup:
[    2.383512] ------------[ cut here ]------------
[    2.383517] memcpy: detected field-spanning write (size 104) of single field "&entry->caller" at kernel/trace/trace.c:3167 (size 64)

The reason seems to be that the maximum number of entries is calculated
from the size of the fstack->calls array which is 128. But later the same
size is used to memcpy() the entries to entry->caller, which has only
room for eight elements. Therefore use the minimum of both arrays as limit.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
---
 kernel/trace/trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Steven Rostedt June 12, 2023, 4:34 p.m. UTC | #1
On Mon, 12 Jun 2023 18:07:48 +0200
Sven Schnelle <svens@linux.ibm.com> wrote:

> Noticed the following warning during boot:
> 
> [    2.316341] Testing tracer wakeup:
> [    2.383512] ------------[ cut here ]------------
> [    2.383517] memcpy: detected field-spanning write (size 104) of single field "&entry->caller" at kernel/trace/trace.c:3167 (size 64)
> 
> The reason seems to be that the maximum number of entries is calculated
> from the size of the fstack->calls array which is 128. But later the same
> size is used to memcpy() the entries to entry->caller, which has only
> room for eight elements. Therefore use the minimum of both arrays as limit.
> 
> Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
> ---
>  kernel/trace/trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 64a4dde073ef..988d664c13ec 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -3146,7 +3146,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
>  	barrier();
>  
>  	fstack = this_cpu_ptr(ftrace_stacks.stacks) + stackidx;
> -	size = ARRAY_SIZE(fstack->calls);
> +	size = min(ARRAY_SIZE(entry->caller), ARRAY_SIZE(fstack->calls));

No, this is not how it works, and this breaks the stack tracing code.

>  
>  	if (regs) {
>  		nr_entries = stack_trace_save_regs(regs, fstack->calls,

I guess we need to add some type of annotation to make the memcpy()
checking happy.

Let me explain what is happening. By default the stack trace has a minimum
of 8 entries (defined by struct stack_entry, which is used to show to user
space the default size - for backward compatibility).

Let's take a look at the code in more detail:

/* What is the size of the temp buffer to use to find the stack? */
	size = ARRAY_SIZE(fstack->calls);

	if (regs) {
/* Fills in the stack into the temp buffer */
		nr_entries = stack_trace_save_regs(regs, fstack->calls,
						   size, skip);
	} else {
/* Also fills in the stack into the temp buffer */
		nr_entries = stack_trace_save(fstack->calls, size, skip);
	}

/* Calculate the size from the number of entries stored in the temp buffer */

	size = nr_entries * sizeof(unsigned long);

/* Now reserve space on the ring buffer */
	event = __trace_buffer_lock_reserve(buffer, TRACE_STACK,

/*
 * Notice how it calculates the size! It subtracts the sizeof
 *  entry->caller and then adds size again!
 */
				    (sizeof(*entry) - sizeof(entry->caller)) + size,
				    trace_ctx);
	if (!event)
		goto out;

/* Point entry to the ring buffer data */
	entry = ring_buffer_event_data(event);

/* Now copy the stack to the location for the data on the ftrace ring buffer */
	memcpy(&entry->caller, fstack->calls, size);
	entry->size = nr_entries;

The old way used to just record the 8 entries, but that was not very useful
in real world analysis. Your patch takes that away. Might as well just
record directly into the ring buffer again like it used to.

Yes the above may be special, but your patch breaks it.

NAK on the patch, but I'm willing to update this to make your tooling
handle this special case.

-- Steve
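
For readers following along, the reservation arithmetic described above can be sketched in userspace C. This is only an illustration under assumptions: the struct is a simplified stand-in for the kernel's struct stack_entry, and reserve_size() and capped_entries() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct stack_entry (assumption). */
struct stack_entry {
	int size;
	unsigned long caller[8];	/* default slots exported to user space */
};

/*
 * Hypothetical helper: bytes to reserve for an event holding nr_entries
 * return addresses. Mirrors the expression from the walk-through:
 *   (sizeof(*entry) - sizeof(entry->caller)) + size
 */
static size_t reserve_size(size_t nr_entries)
{
	size_t size = nr_entries * sizeof(unsigned long);

	return (sizeof(struct stack_entry) -
		sizeof(((struct stack_entry *)0)->caller)) + size;
}

/* Hypothetical helper: entries kept if the copy were capped at caller[]. */
static size_t capped_entries(size_t nr_entries)
{
	size_t max = sizeof(((struct stack_entry *)0)->caller) /
		     sizeof(unsigned long);

	return nr_entries < max ? nr_entries : max;
}
```

With 13 saved entries and 8-byte longs, size comes to 104 bytes, the figure in the warning, while the declared array covers only 64. Capping at the array size, as the original patch did, would keep 8 of those 13 entries.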
  
Sven Schnelle June 13, 2023, 5:19 a.m. UTC | #2
Steven Rostedt <rostedt@goodmis.org> writes:

> On Mon, 12 Jun 2023 18:07:48 +0200
> Sven Schnelle <svens@linux.ibm.com> wrote:
>
>> Noticed the following warning during boot:
>> 
>> [    2.316341] Testing tracer wakeup:
>> [    2.383512] ------------[ cut here ]------------
>> [    2.383517] memcpy: detected field-spanning write (size 104) of single field "&entry->caller" at kernel/trace/trace.c:3167 (size 64)
>> 
>> The reason seems to be that the maximum number of entries is calculated
>> from the size of the fstack->calls array which is 128. But later the same
>> size is used to memcpy() the entries to entry->caller, which has only
>> room for eight elements. Therefore use the minimum of both arrays as limit.
>> 
>> Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
>> ---
>>  kernel/trace/trace.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index 64a4dde073ef..988d664c13ec 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -3146,7 +3146,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
>>  	barrier();
>>  
>>  	fstack = this_cpu_ptr(ftrace_stacks.stacks) + stackidx;
>> -	size = ARRAY_SIZE(fstack->calls);
>> +	size = min(ARRAY_SIZE(entry->caller), ARRAY_SIZE(fstack->calls));
>
> No, this is not how it works, and this breaks the stack tracing code.
> [..]
> The old way used to just record the 8 entries, but that was not very useful
> in real world analysis. Your patch takes that away. Might as well just
> record directly into the ring buffer again like it used to.
>
> Yes the above may be special, but your patch breaks it.

Indeed, I'm feeling a bit stupid for sending that patch; I should have
used my brain while reading the source. Thanks for the explanation.
  
Steven Rostedt June 13, 2023, 3:37 p.m. UTC | #3
On Tue, 13 Jun 2023 07:19:14 +0200
Sven Schnelle <svens@linux.ibm.com> wrote:

> > Yes the above may be special, but your patch breaks it.  
> 
> Indeed, i'm feeling a bit stupid for sending that patch, should have
> used my brain during reading the source. Thanks for the explanation.

Does this quiet the fortifier?

-- Steve

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 64a4dde073ef..1bac7df1f4b6 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3118,6 +3118,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
 	struct ftrace_stack *fstack;
 	struct stack_entry *entry;
 	int stackidx;
+	void *stack;
 
 	/*
 	 * Add one, for this function and the call to save_stack_trace()
@@ -3163,7 +3164,18 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
 		goto out;
 	entry = ring_buffer_event_data(event);
 
-	memcpy(&entry->caller, fstack->calls, size);
+	/*
+	 * For backward compatibility reasons, the entry->caller is an
+	 * array of 8 slots to store the stack. This is also exported
+	 * to user space. The amount allocated on the ring buffer actually
+	 * holds enough for the stack specified by nr_entries. This will
+	 * go into the location of entry->caller. Due to string fortifiers
+	 * checking the size of the destination of memcpy() it triggers
+	 * when it detects that size is greater than 8. To hide this from
+	 * the fortifiers, use a different pointer "stack".
+	 */
+	stack = (void *)&entry->caller;
+	memcpy(stack, fstack->calls, size);
 	entry->size = nr_entries;
 
 	if (!call_filter_check_discard(call, entry, buffer, event))
  
Sven Schnelle June 14, 2023, 10:41 a.m. UTC | #4
Steven Rostedt <rostedt@goodmis.org> writes:

> On Tue, 13 Jun 2023 07:19:14 +0200
> Sven Schnelle <svens@linux.ibm.com> wrote:
>
>> > Yes the above may be special, but your patch breaks it.  
>> 
>> Indeed, i'm feeling a bit stupid for sending that patch, should have
>> used my brain during reading the source. Thanks for the explanation.
>
> Does this quiet the fortifier?
> [..]

No, still getting the same warning:

[    2.302776] memcpy: detected field-spanning write (size 104) of single field "stack" at kernel/trace/trace.c:3178 (size 64)
  
David Laight June 14, 2023, 11:30 a.m. UTC | #5
From: Sven Schnelle
> Sent: 14 June 2023 11:41
> 
> Steven Rostedt <rostedt@goodmis.org> writes:
> 
> > On Tue, 13 Jun 2023 07:19:14 +0200
> > Sven Schnelle <svens@linux.ibm.com> wrote:
> >
> >> > Yes the above may be special, but your patch breaks it.
> >>
> >> Indeed, i'm feeling a bit stupid for sending that patch, should have
> >> used my brain during reading the source. Thanks for the explanation.
> >
> > Does this quiet the fortifier?
> > [..]
> 
> No, still getting the same warning:
> 
> [    2.302776] memcpy: detected field-spanning write (size 104) of single field "stack" at kernel/trace/trace.c:3178 (size 64)

What about:
	(memcpy)(......)

Or maybe:
	(__builtin_memcpy)(....)


	David
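
The suggestion above relies on a standard C preprocessor rule: a function-like macro is expanded only when its name is directly followed by an opening parenthesis, so wrapping the name in parentheses calls the underlying function instead. A minimal sketch, using a hypothetical copy_bytes() function and a same-named checking macro rather than the kernel's real fortified memcpy():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int checks_run;	/* counts how often the wrapper macro was used */

/* Hypothetical plain function standing in for the real copy routine. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Hypothetical checking wrapper with the same name as the function. */
#define copy_bytes(dst, src, n) (checks_run++, (copy_bytes)(dst, src, n))

/* Copies twice: once through the macro, once bypassing it. */
static int demo_bypass(void)
{
	char src[4] = "abc", a[4], b[4];

	copy_bytes(a, src, sizeof(src));	/* macro: counter goes up */
	(copy_bytes)(b, src, sizeof(src));	/* function: wrapper skipped */

	return memcmp(a, b, sizeof(a)) == 0;
}
```

Both calls copy the same bytes, but only the first one passes through the wrapper. Whether this silences the kernel's check depends on how its fortified memcpy() is defined, which the thread does not settle.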

  
Sven Schnelle July 12, 2023, 2:06 p.m. UTC | #6
Hi Steven,

Sven Schnelle <svens@linux.ibm.com> writes:

> Steven Rostedt <rostedt@goodmis.org> writes:
>
>> On Tue, 13 Jun 2023 07:19:14 +0200
>> Sven Schnelle <svens@linux.ibm.com> wrote:
>>
>>> > Yes the above may be special, but your patch breaks it.  
>>> 
>>> Indeed, i'm feeling a bit stupid for sending that patch, should have
>>> used my brain during reading the source. Thanks for the explanation.
>>
>> Does this quiet the fortifier?
>> [..]
>
> No, still getting the same warning:
>
> [    2.302776] memcpy: detected field-spanning write (size 104) of single field "stack" at kernel/trace/trace.c:3178 (size 64)

BTW, i'm seeing the same error on x86 with current master when
CONFIG_FORTIFY_SOURCE=y and CONFIG_SCHED_TRACER=y:

[    3.089395] Testing tracer wakeup: 
[    3.205602] ------------[ cut here ]------------
[    3.205958] memcpy: detected field-spanning write (size 112) of single field "&entry->caller" at kernel/trace/trace.c:3173 (size 64)
[    3.205958] WARNING: CPU: 1 PID: 0 at kernel/trace/trace.c:3173 __ftrace_trace_stack+0x1d1/0x1e0
[    3.205958] Modules linked in:
[    3.205958] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.5.0-rc1-00012-g77341f6d2110-dirty #50
[    3.205958] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
[    3.205958] RIP: 0010:__ftrace_trace_stack+0x1d1/0x1e0
[    3.205958] Code: ff ff ff b9 40 00 00 00 4c 89 f6 48 c7 c2 d8 d3 9a 82 48 c7 c7 e8 82 99 82 48 89 44 24 08 c6 05 9d 8c 30 02 01 e8 0f 88 ed ff <0f> 0b 48 8b 44 24 08 e9 f4 fe ff ff 0f 1f 00 90 90 90 90 90 90 90
[    3.205958] RSP: 0000:ffffc90000100ee0 EFLAGS: 00010086
[    3.205958] RAX: 0000000000000000 RBX: ffff8881003db034 RCX: c0000000ffffdfff
[    3.205958] RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: 0000000000000001
[    3.205958] RBP: ffff8881003db03c R08: 0000000000000000 R09: ffffc90000100d88
[    3.205958] R10: 0000000000000003 R11: ffffffff83343008 R12: ffff88810007a100
[    3.205958] R13: 000000000000000e R14: 0000000000000070 R15: 0000000000000070
[    3.205958] FS:  0000000000000000(0000) GS:ffff88817bc40000(0000) knlGS:0000000000000000
[    3.205958] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.205958] CR2: 0000000000000000 CR3: 000000000322e000 CR4: 00000000000006e0
[    3.205958] Call Trace:
[    3.205958]  <IRQ>
[    3.205958]  ? __ftrace_trace_stack+0x1d1/0x1e0
[    3.205958]  ? __warn+0x81/0x130
[    3.205958]  ? __ftrace_trace_stack+0x1d1/0x1e0
[    3.205958]  ? report_bug+0x171/0x1a0
[    3.205958]  ? handle_bug+0x3a/0x70
[    3.205958]  ? exc_invalid_op+0x17/0x70
[    3.205958]  ? asm_exc_invalid_op+0x1a/0x20
[    3.205958]  ? __ftrace_trace_stack+0x1d1/0x1e0
[    3.205958]  probe_wakeup+0x28e/0x340
[    3.205958]  ttwu_do_activate.isra.0+0x132/0x190
[    3.205958]  sched_ttwu_pending+0x97/0x110
[    3.205958]  __flush_smp_call_function_queue+0x131/0x400
[    3.205958]  __sysvec_call_function_single+0x2d/0xd0
[    3.205958]  sysvec_call_function_single+0x65/0x80
[    3.205958]  </IRQ>
[    3.205958]  <TASK>
[    3.205958]  asm_sysvec_call_function_single+0x1a/0x20
[    3.205958] RIP: 0010:default_idle+0xf/0x20
[    3.205958] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 43 5f 31 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
  
Steven Rostedt July 12, 2023, 2:14 p.m. UTC | #7
On Wed, 12 Jul 2023 16:06:27 +0200
Sven Schnelle <svens@linux.ibm.com> wrote:

> > No, still getting the same warning:
> >
> > [    2.302776] memcpy: detected field-spanning write (size 104) of single field "stack" at kernel/trace/trace.c:3178 (size 64)  
> 
> BTW, i'm seeing the same error on x86 with current master when
> CONFIG_FORTIFY_SOURCE=y and CONFIG_SCHED_TRACER=y:

As I don't know how the fortifier works, nor what exactly it is checking,
do you have any idea on how to quiet it?

This is a false positive, as I described before.

-- Steve
  
Steven Rostedt July 12, 2023, 2:26 p.m. UTC | #8
On Wed, 12 Jul 2023 10:14:34 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 12 Jul 2023 16:06:27 +0200
> Sven Schnelle <svens@linux.ibm.com> wrote:
> 
> > > No, still getting the same warning:
> > >
> > > [    2.302776] memcpy: detected field-spanning write (size 104) of single field "stack" at kernel/trace/trace.c:3178 (size 64)    
> > 
> > BTW, i'm seeing the same error on x86 with current master when
> > CONFIG_FORTIFY_SOURCE=y and CONFIG_SCHED_TRACER=y:  
> 
> As I don't know how the fortifier works, nor what exactly it is checking,
> do you have any idea on how to quiet it?
> 
> This is a false positive, as I described before.


Hmm, maybe this would work?

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4529e264cb86..20122eeccf97 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3118,6 +3118,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
 	struct ftrace_stack *fstack;
 	struct stack_entry *entry;
 	int stackidx;
+	void *ptr;
 
 	/*
 	 * Add one, for this function and the call to save_stack_trace()
@@ -3161,9 +3162,25 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
 				    trace_ctx);
 	if (!event)
 		goto out;
-	entry = ring_buffer_event_data(event);
+	ptr = ring_buffer_event_data(event);
+	entry = ptr;
+
+	/*
+	 * For backward compatibility reasons, the entry->caller is an
+	 * array of 8 slots to store the stack. This is also exported
+	 * to user space. The amount allocated on the ring buffer actually
+	 * holds enough for the stack specified by nr_entries. This will
+	 * go into the location of entry->caller. Due to string fortifiers
+	 * checking the size of the destination of memcpy() it triggers
+	 * when it detects that size is greater than 8. To hide this from
+	 * the fortifiers, we use "ptr" and pointer arithmetic to assign caller.
+	 *
+	 * The below is really just:
+	 *   memcpy(&entry->caller, fstack->calls, size);
+	 */
+	ptr += offsetof(typeof(*entry), caller);
+	memcpy(ptr, fstack->calls, size);
 
-	memcpy(&entry->caller, fstack->calls, size);
 	entry->size = nr_entries;
 
 	if (!call_filter_check_discard(call, entry, buffer, event))


-- Steve
  
Sven Schnelle July 12, 2023, 2:31 p.m. UTC | #9
Hi Steven,

Steven Rostedt <rostedt@goodmis.org> writes:

> On Wed, 12 Jul 2023 16:06:27 +0200
> Sven Schnelle <svens@linux.ibm.com> wrote:
>
>> > No, still getting the same warning:
>> >
>> > [    2.302776] memcpy: detected field-spanning write (size 104) of single field "stack" at kernel/trace/trace.c:3178 (size 64)  
>> 
>> BTW, i'm seeing the same error on x86 with current master when
>> CONFIG_FORTIFY_SOURCE=y and CONFIG_SCHED_TRACER=y:
>
> As I don't know how the fortifier works, nor what exactly it is checking,
> do you have any idea on how to quiet it?
>
> This is a false positive, as I described before.

The "problem" is that struct stack_entry is

struct stack_entry {
       int size;
       unsigned long caller[8];
};

So, as you explained, the ringbuffer code allocates some space after the
struct for additional entries:

struct stack_entry 1;
<additional space for 1>
struct stack_entry 2;
<additional space for 2>
...

But the struct member that is passed to memcpy still has the type
information 'caller is an array with 8 members of 8 bytes', so memcpy
fortify complains. I'm not sure whether we can blame the compiler or
the fortify code here.

One (ugly and whitespace damaged) workaround is:

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 35b11f5a9519..31acd8a6b97e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3170,7 +3170,8 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
                goto out;
        entry = ring_buffer_event_data(event);
 
-       memcpy(&entry->caller, fstack->calls, size);
+       void *p = (void *)entry + offsetof(struct stack_entry, caller);
+       memcpy(p, fstack->calls, size);
        entry->size = nr_entries;
 
        if (!call_filter_check_discard(call, entry, buffer, event))


So with that offsetof calculation the compiler doesn't know about the 8
entries * 8 bytes limitation. Adding Kees to the thread, maybe he knows
some way.
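
An offsetof-based workaround depends on the addition being done in bytes. In C, adding an integer to a typed pointer advances by that many elements, so a struct pointer has to be converted to a byte pointer before the offset is added (the kernel allows arithmetic on void * as a GNU extension; the sketch uses char * to stay portable). A minimal userspace illustration, with a simplified struct and hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct stack_entry (assumption). */
struct stack_entry {
	int size;
	unsigned long caller[8];
};

/* Hypothetical helper: address of caller[] computed via a byte offset. */
static void *caller_by_offset(struct stack_entry *entry)
{
	return (char *)entry + offsetof(struct stack_entry, caller);
}

/* Hypothetical helper: bytes skipped by typed arithmetic, entry + n. */
static size_t typed_step(size_t n)
{
	struct stack_entry buf[2];

	assert(n <= 2);	/* stay within one-past-the-end of buf */
	return (size_t)((char *)(buf + n) - (char *)buf);
}
```

Here caller_by_offset(&e) yields the same address as &e.caller, whereas typed arithmetic moves in multiples of sizeof(struct stack_entry).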
  
Sven Schnelle July 12, 2023, 2:32 p.m. UTC | #10
Hi Steven,

Steven Rostedt <rostedt@goodmis.org> writes:

>> As I don't know how the fortifier works, nor what exactly it is checking,
>> do you have any idea on how to quiet it?
>> 
>> This is a false positive, as I described before.
>
>
> Hmm, maybe this would work?
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 4529e264cb86..20122eeccf97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -3118,6 +3118,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
>  	struct ftrace_stack *fstack;
>  	struct stack_entry *entry;
>  	int stackidx;
> +	void *ptr;
>  
>  	/*
>  	 * Add one, for this function and the call to save_stack_trace()
> @@ -3161,9 +3162,25 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
>  				    trace_ctx);
>  	if (!event)
>  		goto out;
> -	entry = ring_buffer_event_data(event);
> +	ptr = ring_buffer_event_data(event);
> +	entry = ptr;
> +
> +	/*
> +	 * For backward compatibility reasons, the entry->caller is an
> +	 * array of 8 slots to store the stack. This is also exported
> +	 * to user space. The amount allocated on the ring buffer actually
> +	 * holds enough for the stack specified by nr_entries. This will
> +	 * go into the location of entry->caller. Due to string fortifiers
> +	 * checking the size of the destination of memcpy() it triggers
> +	 * when it detects that size is greater than 8. To hide this from
> +	 * the fortifiers, we use "ptr" and pointer arithmetic to assign caller.
> +	 *
> +	 * The below is really just:
> +	 *   memcpy(&entry->caller, fstack->calls, size);
> +	 */
> +	ptr += offsetof(typeof(*entry), caller);
> +	memcpy(ptr, fstack->calls, size);
>  
> -	memcpy(&entry->caller, fstack->calls, size);
>  	entry->size = nr_entries;
>  
>  	if (!call_filter_check_discard(call, entry, buffer, event))
>
>

I just sent about the same thing without the nice comment. So yes, this
works. :-)
  

Patch

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 64a4dde073ef..988d664c13ec 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3146,7 +3146,7 @@  static void __ftrace_trace_stack(struct trace_buffer *buffer,
 	barrier();
 
 	fstack = this_cpu_ptr(ftrace_stacks.stacks) + stackidx;
-	size = ARRAY_SIZE(fstack->calls);
+	size = min(ARRAY_SIZE(entry->caller), ARRAY_SIZE(fstack->calls));
 
 	if (regs) {
 		nr_entries = stack_trace_save_regs(regs, fstack->calls,