perf tools: Fixup module symbol end address properly

Message ID 20240212233322.1855161-1-namhyung@kernel.org
State New
Series perf tools: Fixup module symbol end address properly

Commit Message

Namhyung Kim Feb. 12, 2024, 11:33 p.m. UTC
  I got a strange error on ARM: perf failed while processing a
FINISHED_ROUND record.  It turned out to be failing in
symbol__alloc_hist() because the symbol size was too big.

The failure happened when a sample was captured in a specific BPF
program.  I added some debug code and found that the end address of the
symbol came from the next module, which is placed far away.

  ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock    [bpf]
  ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt      [bpf]
  ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress     [bpf]   <---------- here
  ffffd69b7af74048-ffffd69b7af74048: $x.5    [sha3_generic]
  ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init        [sha3_generic]
  ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update      [sha3_generic]

The logic in symbols__fixup_end() simply uses curr->start to update
prev->end.  That does not work in this case because the two addresses
are too far apart.

I think ARM has a different kernel memory layout for modules and BPF
than x86.  There is already logic to handle the kernel/module
boundary.  Let's do the same for symbols between different modules.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/symbol.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)
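
To put a number on "too big", the arithmetic for the flagged entry in the dump above can be written out. This is only a sketch: sym_size() and the three constants are hypothetical names, and the "page end" value assumes the 4096-byte rounding used by the patch.

```c
#include <stdint.h>

/* Hypothetical helper: size of a symbol as the distance from start to end. */
static uint64_t sym_size(uint64_t start, uint64_t end)
{
	return end - start;
}

/* Start of bpf_prog_1d50286d2eb1be85_hn_egress, taken from the dump. */
static const uint64_t hn_egress_start = 0xffff80000ad656b4ULL;

/* End inherited from the first [sha3_generic] symbol (the broken case). */
static const uint64_t hn_egress_bad_end = 0xffffd69b7af74048ULL;

/* End after rounding (start + 4096) up to a 4096-byte boundary. */
static const uint64_t hn_egress_page_end = 0xffff80000ad67000ULL;
```

With the inherited end, sym_size() is 0x569b7020e994 bytes (tens of TiB); with the page-rounded end it is 0x194c bytes.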
  

Comments

Leo Yan Feb. 13, 2024, 3:39 a.m. UTC | #1
On Mon, Feb 12, 2024 at 03:33:22PM -0800, Namhyung Kim wrote:
> I got a strange error on ARM to fail on processing FINISHED_ROUND
> record.  It turned out that it was failing in symbol__alloc_hist()
> because the symbol size is too big.
> 
> When a sample is captured on a specific BPF program, it failed.  I've
> added a debug code and found the end address of the symbol is from
> the next module which is placed far away.
> 
>   ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock    [bpf]
>   ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt      [bpf]
>   ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress     [bpf]   <---------- here
>   ffffd69b7af74048-ffffd69b7af74048: $x.5    [sha3_generic]
>   ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init        [sha3_generic]
>   ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update      [sha3_generic]
> 
> The logic in symbols__fixup_end() just uses curr->start to update the
> prev->end.  But in this case, it won't work as it's too different.
> 
> I think ARM has a different kernel memory layout for modules and BPF
> than on x86.  Actually there's a logic to handle kernel and module
> boundary.  Let's do the same for symbols between different modules.

Even the Arm32 and Arm64 kernels have different memory layouts for
modules and the kernel image.

JITed eBPF programs should be allocated from the vmalloc region; for
Arm64, see bpf_jit_alloc_exec() in arch/arm64/net/bpf_jit_comp.c.

> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/symbol.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 35975189999b..9ebdb8e13c0b 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -248,14 +248,31 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
>  		 * segment is very big.  Therefore do not fill this gap and do
>  		 * not assign it to the kernel dso map (kallsyms).
>  		 *
> +		 * Also BPF code can be allocated separately from text segments
> +		 * and modules.  So the last entry in a module should not fill
> +		 * the gap too.
> +		 *
>  		 * In kallsyms, it determines module symbols using '[' character
>  		 * like in:
>  		 *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
>  		 */
>  		if (prev->end == prev->start) {
> +			const char *prev_mod;
> +			const char *curr_mod;
> +
> +			if (!is_kallsyms) {
> +				prev->end = curr->start;
> +				continue;
> +			}
> +
> +			prev_mod = strchr(prev->name, '[');
> +			curr_mod = strchr(curr->name, '[');
> +
>  			/* Last kernel/module symbol mapped to end of page */
> -			if (is_kallsyms && (!strchr(prev->name, '[') !=
> -					    !strchr(curr->name, '[')))
> +			if (!prev_mod != !curr_mod)
> +				prev->end = roundup(prev->end + 4096, 4096);
> +			/* Last symbol in the previous module */
> +			else if (prev_mod && strcmp(prev_mod, curr_mod))

Should two consecutive modules fall into this case? I think we need to assign
'prev->end = curr->start' for two consecutive modules.

If so, we should use a specific check for eBPF programs, e.g.:

                        else if (prev_mod && strcmp(prev_mod, curr_mod) &&
                                 (!strcmp(prev->name, "bpf") ||
                                  !strcmp(curr->name, "bpf")))

Thanks,
Leo

>  				prev->end = roundup(prev->end + 4096, 4096);
>  			else
>  				prev->end = curr->start;
> -- 
> 2.43.0.687.g38aa6559b0-goog
>
  
Namhyung Kim Feb. 13, 2024, 6:48 p.m. UTC | #2
Hi Leo,

Thanks for your review!

On Mon, Feb 12, 2024 at 7:40 PM Leo Yan <leo.yan@linux.dev> wrote:
>
> On Mon, Feb 12, 2024 at 03:33:22PM -0800, Namhyung Kim wrote:
> > I got a strange error on ARM to fail on processing FINISHED_ROUND
> > record.  It turned out that it was failing in symbol__alloc_hist()
> > because the symbol size is too big.
> >
> > When a sample is captured on a specific BPF program, it failed.  I've
> > added a debug code and found the end address of the symbol is from
> > the next module which is placed far away.
> >
> >   ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock    [bpf]
> >   ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt      [bpf]
> >   ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress     [bpf]   <---------- here
> >   ffffd69b7af74048-ffffd69b7af74048: $x.5    [sha3_generic]
> >   ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init        [sha3_generic]
> >   ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update      [sha3_generic]
> >
> > The logic in symbols__fixup_end() just uses curr->start to update the
> > prev->end.  But in this case, it won't work as it's too different.
> >
> > I think ARM has a different kernel memory layout for modules and BPF
> > than on x86.  Actually there's a logic to handle kernel and module
> > boundary.  Let's do the same for symbols between different modules.
>
> Even Arm32 and Arm64 kernel have different memory layout for modules
> and kernel image.
>
> eBPF program (JITed) should be allocated from the vmalloc region, for
> Arm64, see bpf_jit_alloc_exec() in arch/arm64/net/bpf_jit_comp.c.

OK, so chances are they can end up far apart, right?

>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/util/symbol.c | 21 +++++++++++++++++++--
> >  1 file changed, 19 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > index 35975189999b..9ebdb8e13c0b 100644
> > --- a/tools/perf/util/symbol.c
> > +++ b/tools/perf/util/symbol.c
> > @@ -248,14 +248,31 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
> >                * segment is very big.  Therefore do not fill this gap and do
> >                * not assign it to the kernel dso map (kallsyms).
> >                *
> > +              * Also BPF code can be allocated separately from text segments
> > +              * and modules.  So the last entry in a module should not fill
> > +              * the gap too.
> > +              *
> >                * In kallsyms, it determines module symbols using '[' character
> >                * like in:
> >                *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
> >                */
> >               if (prev->end == prev->start) {
> > +                     const char *prev_mod;
> > +                     const char *curr_mod;
> > +
> > +                     if (!is_kallsyms) {
> > +                             prev->end = curr->start;
> > +                             continue;
> > +                     }
> > +
> > +                     prev_mod = strchr(prev->name, '[');
> > +                     curr_mod = strchr(curr->name, '[');
> > +
> >                       /* Last kernel/module symbol mapped to end of page */
> > -                     if (is_kallsyms && (!strchr(prev->name, '[') !=
> > -                                         !strchr(curr->name, '[')))
> > +                     if (!prev_mod != !curr_mod)
> > +                             prev->end = roundup(prev->end + 4096, 4096);
> > +                     /* Last symbol in the previous module */
> > +                     else if (prev_mod && strcmp(prev_mod, curr_mod))
>
> Should two consecutive modules fall into this case? I think we need to assign
> 'prev->end = curr->start' for two consecutive modules.

Yeah I thought about that case but I believe they would be on
separate pages (hopefully there's a page gap between them).
So I think it should not overlap.  But if you really care we can
check it explicitly like this:

    prev->end = min(roundup(...), curr->start);

>
> If so, we should use a specific checking for eBPF program, e.g.:
>
>                         else if (prev_mod && strcmp(prev_mod, curr_mod) &&
>                                  (!strcmp(prev->name, "bpf") ||
>                                   !strcmp(curr->name, "bpf")))

I suspect it can happen on any module boundary so better
to handle it in a more general way.

Thanks,
Namhyung

>
> >                               prev->end = roundup(prev->end + 4096, 4096);
> >                       else
> >                               prev->end = curr->start;
> > --
> > 2.43.0.687.g38aa6559b0-goog
> >
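
The clamping variant mentioned above ("prev->end = min(roundup(...), curr->start)") was only floated as an option and is not part of the applied patch. A minimal sketch of what it would compute, assuming a fixed 4096-byte page and a hypothetical helper name clamped_end():

```c
#include <stdint.h>

#define PAGE_SZ 4096ULL	/* assumption: same hard-coded value as the patch */

/*
 * Hypothetical helper: end address for a zero-length symbol that is the
 * last one in its module.  It rounds (prev_start + PAGE_SZ) up to a page
 * boundary, then clamps the result so it never passes the next symbol.
 */
static uint64_t clamped_end(uint64_t prev_start, uint64_t curr_start)
{
	uint64_t end = ((prev_start + PAGE_SZ + PAGE_SZ - 1) / PAGE_SZ) * PAGE_SZ;

	return end < curr_start ? end : curr_start;
}
```

When the next module is far away the rounded value wins; when the next module starts within that range, its start address wins, so the two symbols cannot overlap.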
  
Leo Yan Feb. 14, 2024, 10:14 a.m. UTC | #3
On Tue, Feb 13, 2024 at 10:48:53AM -0800, Namhyung Kim wrote:
> Hi Leo,
> 
> Thanks for your review!
> 
> On Mon, Feb 12, 2024 at 7:40 PM Leo Yan <leo.yan@linux.dev> wrote:
> >
> > On Mon, Feb 12, 2024 at 03:33:22PM -0800, Namhyung Kim wrote:
> > > I got a strange error on ARM to fail on processing FINISHED_ROUND
> > > record.  It turned out that it was failing in symbol__alloc_hist()
> > > because the symbol size is too big.
> > >
> > > When a sample is captured on a specific BPF program, it failed.  I've
> > > added a debug code and found the end address of the symbol is from
> > > the next module which is placed far away.
> > >
> > >   ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock    [bpf]
> > >   ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt      [bpf]
> > >   ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress     [bpf]   <---------- here
> > >   ffffd69b7af74048-ffffd69b7af74048: $x.5    [sha3_generic]
> > >   ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init        [sha3_generic]
> > >   ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update      [sha3_generic]
> > >
> > > The logic in symbols__fixup_end() just uses curr->start to update the
> > > prev->end.  But in this case, it won't work as it's too different.
> > >
> > > I think ARM has a different kernel memory layout for modules and BPF
> > > than on x86.  Actually there's a logic to handle kernel and module
> > > boundary.  Let's do the same for symbols between different modules.
> >
> > Even Arm32 and Arm64 kernel have different memory layout for modules
> > and kernel image.
> >
> > eBPF program (JITed) should be allocated from the vmalloc region, for
> > Arm64, see bpf_jit_alloc_exec() in arch/arm64/net/bpf_jit_comp.c.
> 
> Ok, so chances are they can fall out far away right?

Yes, this is my understanding.

> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > >  tools/perf/util/symbol.c | 21 +++++++++++++++++++--
> > >  1 file changed, 19 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > > index 35975189999b..9ebdb8e13c0b 100644
> > > --- a/tools/perf/util/symbol.c
> > > +++ b/tools/perf/util/symbol.c
> > > @@ -248,14 +248,31 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
> > >                * segment is very big.  Therefore do not fill this gap and do
> > >                * not assign it to the kernel dso map (kallsyms).
> > >                *
> > > +              * Also BPF code can be allocated separately from text segments
> > > +              * and modules.  So the last entry in a module should not fill
> > > +              * the gap too.
> > > +              *
> > >                * In kallsyms, it determines module symbols using '[' character
> > >                * like in:
> > >                *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
> > >                */
> > >               if (prev->end == prev->start) {
> > > +                     const char *prev_mod;
> > > +                     const char *curr_mod;
> > > +
> > > +                     if (!is_kallsyms) {
> > > +                             prev->end = curr->start;
> > > +                             continue;
> > > +                     }
> > > +
> > > +                     prev_mod = strchr(prev->name, '[');
> > > +                     curr_mod = strchr(curr->name, '[');
> > > +
> > >                       /* Last kernel/module symbol mapped to end of page */
> > > -                     if (is_kallsyms && (!strchr(prev->name, '[') !=
> > > -                                         !strchr(curr->name, '[')))
> > > +                     if (!prev_mod != !curr_mod)
> > > +                             prev->end = roundup(prev->end + 4096, 4096);
> > > +                     /* Last symbol in the previous module */
> > > +                     else if (prev_mod && strcmp(prev_mod, curr_mod))
> >
> > Should two consecutive modules fall into this case? I think we need to assign
> > 'prev->end = curr->start' for two consecutive modules.
> 
> Yeah I thought about that case but I believe they would be on
> separate pages (hopefully there's a page gap between them).
> So I think it should not overlap.  But if you really care we can
> check it explicitly like this:
> 
>     prev->end = min(roundup(...), curr->start);

I am not concerned about assigning a bigger end value to the 'prev'
symbol. An exaggerated end region will not cause any difficulty for
parsing symbols. On the other hand, I am a bit concerned that for a
big function (e.g. its code size > 4KiB), we might fail to find
symbols with the change above.

> > If so, we should use a specific checking for eBPF program, e.g.:
> >
> >                         else if (prev_mod && strcmp(prev_mod, curr_mod) &&
> >                                  (!strcmp(prev->name, "bpf") ||
> >                                   !strcmp(curr->name, "bpf")))
> 
> I suspect it can happen on any module boundary so better
> to handle it in a more general way.

I don't want to introduce excessive complexity here. We can apply
the current patch as it is.

A side topic: the code hard-codes 4096 as the page size, which is not
always true on Arm64 (the page size can be 4KiB, 16KiB or 64KiB). We
should consider extending the environment to record the system's page
size.

Thanks,
Leo

> Thanks,
> Namhyung
> 
> >
> > >                               prev->end = roundup(prev->end + 4096, 4096);
> > >                       else
> > >                               prev->end = curr->start;
> > > --
> > > 2.43.0.687.g38aa6559b0-goog
> > >
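
The page-size point can be illustrated by making the page size a parameter of the rounding. This is a sketch under assumptions, not perf code: host_page_size() and end_of_next_page() are hypothetical names, and sysconf(_SC_PAGESIZE) reports the page size of the machine running the tool, which may differ from the machine where the data was recorded. That mismatch is the reason for the suggestion to record the page size in the environment.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: page size of the host, falling back to 4096. */
static uint64_t host_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	return sz > 0 ? (uint64_t)sz : 4096;
}

/*
 * Same arithmetic as roundup(end + 4096, 4096) in the patch, but with the
 * page size passed in.  page_size must be non-zero.
 */
static uint64_t end_of_next_page(uint64_t addr, uint64_t page_size)
{
	uint64_t x = addr + page_size;

	return ((x + page_size - 1) / page_size) * page_size;
}
```

For the BPF symbol from the dump, a 4KiB page gives an end of 0xffff80000ad67000, while a 64KiB page gives 0xffff80000ad80000.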
  
Namhyung Kim Feb. 16, 2024, 5:19 a.m. UTC | #4
On Wed, Feb 14, 2024 at 2:14 AM Leo Yan <leo.yan@linux.dev> wrote:
>
> On Tue, Feb 13, 2024 at 10:48:53AM -0800, Namhyung Kim wrote:
> > Hi Leo,
> >
> > Thanks for your review!
> >
> > On Mon, Feb 12, 2024 at 7:40 PM Leo Yan <leo.yan@linux.dev> wrote:
> > >
> > > On Mon, Feb 12, 2024 at 03:33:22PM -0800, Namhyung Kim wrote:
> > > > I got a strange error on ARM to fail on processing FINISHED_ROUND
> > > > record.  It turned out that it was failing in symbol__alloc_hist()
> > > > because the symbol size is too big.
> > > >
> > > > When a sample is captured on a specific BPF program, it failed.  I've
> > > > added a debug code and found the end address of the symbol is from
> > > > the next module which is placed far away.
> > > >
> > > >   ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock    [bpf]
> > > >   ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt      [bpf]
> > > >   ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress     [bpf]   <---------- here
> > > >   ffffd69b7af74048-ffffd69b7af74048: $x.5    [sha3_generic]
> > > >   ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init        [sha3_generic]
> > > >   ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update      [sha3_generic]
> > > >
> > > > The logic in symbols__fixup_end() just uses curr->start to update the
> > > > prev->end.  But in this case, it won't work as it's too different.
> > > >
> > > > I think ARM has a different kernel memory layout for modules and BPF
> > > > than on x86.  Actually there's a logic to handle kernel and module
> > > > boundary.  Let's do the same for symbols between different modules.
> > >
> > > Even Arm32 and Arm64 kernel have different memory layout for modules
> > > and kernel image.
> > >
> > > eBPF program (JITed) should be allocated from the vmalloc region, for
> > > Arm64, see bpf_jit_alloc_exec() in arch/arm64/net/bpf_jit_comp.c.
> >
> > Ok, so chances are they can fall out far away right?
>
> Yes, this is my understanding.
>
> > > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > > ---
> > > >  tools/perf/util/symbol.c | 21 +++++++++++++++++++--
> > > >  1 file changed, 19 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > > > index 35975189999b..9ebdb8e13c0b 100644
> > > > --- a/tools/perf/util/symbol.c
> > > > +++ b/tools/perf/util/symbol.c
> > > > @@ -248,14 +248,31 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
> > > >                * segment is very big.  Therefore do not fill this gap and do
> > > >                * not assign it to the kernel dso map (kallsyms).
> > > >                *
> > > > +              * Also BPF code can be allocated separately from text segments
> > > > +              * and modules.  So the last entry in a module should not fill
> > > > +              * the gap too.
> > > > +              *
> > > >                * In kallsyms, it determines module symbols using '[' character
> > > >                * like in:
> > > >                *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
> > > >                */
> > > >               if (prev->end == prev->start) {
> > > > +                     const char *prev_mod;
> > > > +                     const char *curr_mod;
> > > > +
> > > > +                     if (!is_kallsyms) {
> > > > +                             prev->end = curr->start;
> > > > +                             continue;
> > > > +                     }
> > > > +
> > > > +                     prev_mod = strchr(prev->name, '[');
> > > > +                     curr_mod = strchr(curr->name, '[');
> > > > +
> > > >                       /* Last kernel/module symbol mapped to end of page */
> > > > -                     if (is_kallsyms && (!strchr(prev->name, '[') !=
> > > > -                                         !strchr(curr->name, '[')))
> > > > +                     if (!prev_mod != !curr_mod)
> > > > +                             prev->end = roundup(prev->end + 4096, 4096);
> > > > +                     /* Last symbol in the previous module */
> > > > +                     else if (prev_mod && strcmp(prev_mod, curr_mod))
> > >
> > > Should two consecutive modules fall into this case? I think we need to assign
> > > 'prev->end = curr->start' for two consecutive modules.
> >
> > Yeah I thought about that case but I believe they would be on
> > separate pages (hopefully there's a page gap between them).
> > So I think it should not overlap.  But if you really care we can
> > check it explicitly like this:
> >
> >     prev->end = min(roundup(...), curr->start);
>
> I am not concerned that to assign a bigger end value for the 'prev'
> symbol. With an exaggerate end region, it will not cause any
> difficulty for parsing symbols.

Right, but my problem was not in parsing.  It failed to allocate
memory for the symbol because it's too big.

> On the other hand, I am a bit concern
> for a big function (e.g. its code size > 4KiB), we might fail to find
> symbols in this case with the change above.

Yes, that's another problem.  But perf cannot know the exact size,
so it just assumes the symbol fits in a page.

>
> > > If so, we should use a specific checking for eBPF program, e.g.:
> > >
> > >                         else if (prev_mod && strcmp(prev_mod, curr_mod) &&
> > >                                  (!strcmp(prev->name, "bpf") ||
> > >                                   !strcmp(curr->name, "bpf")))
> >
> > I suspect it can happen on any module boundary so better
> > to handle it in a more general way.
>
> I don't want to introduce over complexity at here. We can apply
> current patch as it is.

Good, can I get your Reviewed-by then? :)

>
> A side topic, when I saw the code is hard coded for 4096 as the page
> size, this is not always true on Arm64 (the page size can be 4KiB,
> 16KiB or 64KiB). We need to consider to extend the environment for
> recording the system's page size.

Sounds good.  But until then, 4K would be the reasonable choice.

Thanks,
Namhyung

> >
> > >
> > > >                               prev->end = roundup(prev->end + 4096, 4096);
> > > >                       else
> > > >                               prev->end = curr->start;
> > > > --
> > > > 2.43.0.687.g38aa6559b0-goog
> > > >
  
Leo Yan Feb. 16, 2024, 12:29 p.m. UTC | #5
On Thu, Feb 15, 2024 at 09:19:51PM -0800, Namhyung Kim wrote:

[...]

> > On the other hand, I am a bit concern
> > for a big function (e.g. its code size > 4KiB), we might fail to find
> > symbols in this case with the change above.
> 
> Yes, it's another problem.  But it cannot know the exact size
> so it just assumes it fits in a page.

Agreed.

> > > > If so, we should use a specific checking for eBPF program, e.g.:
> > > >
> > > >                         else if (prev_mod && strcmp(prev_mod, curr_mod) &&
> > > >                                  (!strcmp(prev->name, "bpf") ||
> > > >                                   !strcmp(curr->name, "bpf")))
> > >
> > > I suspect it can happen on any module boundary so better
> > > to handle it in a more general way.
> >
> > I don't want to introduce over complexity at here. We can apply
> > current patch as it is.
> 
> Good, can I get your Reviewed-by then? :)

Yes.

Reviewed-by: Leo Yan <leo.yan@linux.dev>

> > A side topic, when I saw the code is hard coded for 4096 as the page
> > size, this is not always true on Arm64 (the page size can be 4KiB,
> > 16KiB or 64KiB). We need to consider to extend the environment for
> > recording the system's page size.
> 
> Sounds good.  But until then, 4K would be the reasonable choice.

This is fine for me.

Thanks,
Leo
  
Namhyung Kim Feb. 21, 2024, 2 a.m. UTC | #6
On Mon, 12 Feb 2024 15:33:22 -0800, Namhyung Kim wrote:
> I got a strange error on ARM to fail on processing FINISHED_ROUND
> record.  It turned out that it was failing in symbol__alloc_hist()
> because the symbol size is too big.
> 
> When a sample is captured on a specific BPF program, it failed.  I've
> added a debug code and found the end address of the symbol is from
> the next module which is placed far away.
> 
> [...]

Applied to perf-tools-next, thanks!

[1/1] perf tools: Fixup module symbol end address properly
      commit: bacefe0c7b77b7527a613e053b6d378412a8a779

Best regards,
  

Patch

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 35975189999b..9ebdb8e13c0b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -248,14 +248,31 @@  void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
 		 * segment is very big.  Therefore do not fill this gap and do
 		 * not assign it to the kernel dso map (kallsyms).
 		 *
+		 * Also BPF code can be allocated separately from text segments
+		 * and modules.  So the last entry in a module should not fill
+		 * the gap too.
+		 *
 		 * In kallsyms, it determines module symbols using '[' character
 		 * like in:
 		 *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
 		 */
 		if (prev->end == prev->start) {
+			const char *prev_mod;
+			const char *curr_mod;
+
+			if (!is_kallsyms) {
+				prev->end = curr->start;
+				continue;
+			}
+
+			prev_mod = strchr(prev->name, '[');
+			curr_mod = strchr(curr->name, '[');
+
 			/* Last kernel/module symbol mapped to end of page */
-			if (is_kallsyms && (!strchr(prev->name, '[') !=
-					    !strchr(curr->name, '[')))
+			if (!prev_mod != !curr_mod)
+				prev->end = roundup(prev->end + 4096, 4096);
+			/* Last symbol in the previous module */
+			else if (prev_mod && strcmp(prev_mod, curr_mod))
 				prev->end = roundup(prev->end + 4096, 4096);
 			else
 				prev->end = curr->start;
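
The three-way decision in the hunk above can be tried outside perf with a small stand-alone sketch. Everything named here is a stand-in: struct sym replaces perf's struct symbol, fixup_zero_len_end() replaces the loop body of symbols__fixup_end(), and ROUNDUP replaces roundup(). Symbol names follow the kallsyms convention of carrying the module in square brackets.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096ULL	/* assumption: hard-coded, as in the patch */
#define ROUNDUP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/* Hypothetical, reduced stand-in for perf's struct symbol. */
struct sym {
	uint64_t start;
	uint64_t end;
	const char *name;	/* e.g. "crypto_sha3_init\t[sha3_generic]" */
};

/* Fix up prev->end when the symbol has no size yet (end == start). */
static void fixup_zero_len_end(struct sym *prev, const struct sym *curr,
			       bool is_kallsyms)
{
	const char *prev_mod, *curr_mod;

	if (prev->end != prev->start)
		return;

	if (!is_kallsyms) {
		prev->end = curr->start;
		return;
	}

	prev_mod = strchr(prev->name, '[');
	curr_mod = strchr(curr->name, '[');

	if (!prev_mod != !curr_mod)
		/* Kernel/module boundary: stop at the end of the page. */
		prev->end = ROUNDUP(prev->end + PAGE_SZ, PAGE_SZ);
	else if (prev_mod && strcmp(prev_mod, curr_mod))
		/* Last symbol of one module, next symbol in another. */
		prev->end = ROUNDUP(prev->end + PAGE_SZ, PAGE_SZ);
	else
		/* Same module (or both in the kernel): fill the gap. */
		prev->end = curr->start;
}
```

Feeding it the hn_egress [bpf] entry followed by $x.5 [sha3_generic] from the commit message yields an end of 0xffff80000ad67000 instead of 0xffffd69b7af74048, while two symbols inside [sha3_generic] still get prev->end = curr->start.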