perf/x86: Fix lockdep warning in for_each_sibling_event() on SPR

Message ID: 20230704181516.3293665-1-namhyung@kernel.org
State: New
Series: perf/x86: Fix lockdep warning in for_each_sibling_event() on SPR

Commit Message

Namhyung Kim July 4, 2023, 6:15 p.m. UTC
On SPR, the load latency event needs an auxiliary event in the same
group to work properly.  There's a check in intel_pmu_hw_config()
that iterates over the sibling events to find a mem-loads-aux event.

for_each_sibling_event() has a lockdep assert to make sure that either
hardirqs are disabled or leader->ctx->mutex is held.  This works well if
the given event has a separate leader event, since perf_try_init_event()
grabs leader->ctx->mutex to protect the sibling list.  But it can
cause a problem when the event itself is the leader, since the event is
not initialized yet and there's no ctx for it.
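
As a rough illustration of the precondition described above, here is a
userspace sketch.  It is not the kernel's implementation: the types
model_ctx and model_event, the enum walk_check, and the function
sibling_walk_check() are all hypothetical names, and the held mutex is
reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_ctx {
	bool mutex_held;		/* models "ctx->mutex is held" */
};

struct model_event {
	struct model_event *group_leader;
	struct model_ctx *ctx;		/* NULL until the event gets a ctx */
};

enum walk_check {
	WALK_OK,		/* precondition satisfied */
	WALK_WOULD_WARN,	/* ctx exists but its mutex is not held */
	WALK_NO_CTX,		/* no ctx at all: nothing to check the lock on */
};

/* Models the precondition for walking the leader's sibling list. */
static enum walk_check sibling_walk_check(const struct model_event *leader,
					  bool hardirqs_disabled)
{
	assert(leader != NULL);

	if (hardirqs_disabled)
		return WALK_OK;
	if (leader->ctx == NULL)
		return WALK_NO_CTX;	/* the case hit by a new group leader */
	return leader->ctx->mutex_held ? WALK_OK : WALK_WOULD_WARN;
}
```

In this model, an event that is its own leader and has not been given a
ctx yet lands in WALK_NO_CTX unless hardirqs are disabled, which is the
situation the rest of this message is about.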

I got a lockdep warning when I ran the command below on SPR, but I
guess it could also be a NULL pointer dereference.

  $ perf record -d -e cpu/mem-loads/uP true

The code path to the warning is:

  sys_perf_event_open()
    perf_event_alloc()
      perf_init_event()
        perf_try_init_event()
          x86_pmu_event_init()
            hsw_hw_config()
              intel_pmu_hw_config()
                for_each_sibling_event()
                  lockdep_assert_event_ctx()

We don't need for_each_sibling_event() when it's a standalone event.
Let's return the error code directly.
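
The effect of the early return can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the code in
arch/x86/events/intel/core.c: struct grp_event, its next_sibling link
(standing in for the kernel's sibling list) and check_mem_loads_aux()
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: siblings hang off the leader as a singly linked list. */
struct grp_event {
	struct grp_event *group_leader;
	struct grp_event *next_sibling;
	bool is_mem_loads_aux;
};

/* Returns 0 if the group contains an aux event, -ENODATA otherwise. */
static int check_mem_loads_aux(const struct grp_event *event)
{
	const struct grp_event *leader;
	const struct grp_event *sibling;

	assert(event != NULL && event->group_leader != NULL);
	leader = event->group_leader;

	/* A standalone event has no group, so no aux event precedes it. */
	if (leader == event)
		return -ENODATA;

	if (leader->is_mem_loads_aux)
		return 0;

	/* Only reached when a separate leader exists. */
	for (sibling = leader->next_sibling; sibling; sibling = sibling->next_sibling) {
		if (sibling->is_mem_loads_aux)
			return 0;
	}
	return -ENODATA;
}
```

With the check placed first, the sibling walk is only attempted for an
event that joins an existing group, which is the case where the leader
already has a ctx.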

Fixes: f3c0eba28704 ("perf: Add a few assertions")
Reported-by: Greg Thelen <gthelen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 arch/x86/events/intel/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)
  

Comments

Peter Zijlstra July 5, 2023, 8:38 a.m. UTC | #1
On Tue, Jul 04, 2023 at 11:15:15AM -0700, Namhyung Kim wrote:
> On SPR, the load latency event needs an auxiliary event in the same
> group to work properly.  There's a check in intel_pmu_hw_config()
> for this to iterate sibling events and find a mem-loads-aux event.
> 
> The for_each_sibling_event() has a lockdep assert to make sure if it
> disabled hardirq or hold leader->ctx->mutex.  This works well if the
> given event has a separate leader event since perf_try_init_event()
> grabs the leader->ctx->mutex to protect the sibling list.  But it can
> cause a problem when the event itself is a leader since the event is
> not initialized yet and there's no ctx for the event.
> 
> Actually I got a lockdep warning when I run the below command on SPR,
> but I guess it could be a NULL pointer dereference.
> 
>   $ perf record -d -e cpu/mem-loads/uP true
> 
> The code path to the warning is:
> 
>   sys_perf_event_open()
>     perf_event_alloc()
>       perf_init_event()
>         perf_try_init_event()
>           x86_pmu_event_init()
>             hsw_hw_config()
>               intel_pmu_hw_config()
>                 for_each_sibling_event()
>                   lockdep_assert_event_ctx()
> 
> We don't need for_each_sibling_event() when it's a standalone event.
> Let's return the error code directly.
> 
> Fixes: f3c0eba28704 ("perf: Add a few assertions")
> Reported-by: Greg Thelen <gthelen@google.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  arch/x86/events/intel/core.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 0d09245aa8df..933fe4894c32 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3983,6 +3983,14 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		struct perf_event *leader = event->group_leader;
>  		struct perf_event *sibling = NULL;
>  
> +		/*
> +		 * The event is not fully initialized yet and no ctx is set
> +		 * for the event.  Avoid for_each_sibling_event() since it
> +		 * has a lockdep assert with leader->ctx->mutex.
> +		 */

If I understand things correctly, your patch is indeed correct, however
I don't much like this comment, does something like:

		/*
		 * When this memload event is also the first event (no
		 * group exists yet), then there is no aux event before
		 * it.
		 */

work for you?

> +		if (leader == event)
> +			return -ENODATA;
> +
>  		if (!is_mem_loads_aux_event(leader)) {
>  			for_each_sibling_event(sibling, leader) {
>  				if (is_mem_loads_aux_event(sibling))
> -- 
> 2.41.0.255.g8b1d071c50-goog
>
  
Namhyung Kim July 5, 2023, 3:11 p.m. UTC | #2
Hi Peter,

On Wed, Jul 5, 2023 at 1:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jul 04, 2023 at 11:15:15AM -0700, Namhyung Kim wrote:
> > On SPR, the load latency event needs an auxiliary event in the same
> > group to work properly.  There's a check in intel_pmu_hw_config()
> > for this to iterate sibling events and find a mem-loads-aux event.
> >
> > The for_each_sibling_event() has a lockdep assert to make sure if it
> > disabled hardirq or hold leader->ctx->mutex.  This works well if the
> > given event has a separate leader event since perf_try_init_event()
> > grabs the leader->ctx->mutex to protect the sibling list.  But it can
> > cause a problem when the event itself is a leader since the event is
> > not initialized yet and there's no ctx for the event.
> >
> > Actually I got a lockdep warning when I run the below command on SPR,
> > but I guess it could be a NULL pointer dereference.
> >
> >   $ perf record -d -e cpu/mem-loads/uP true
> >
> > The code path to the warning is:
> >
> >   sys_perf_event_open()
> >     perf_event_alloc()
> >       perf_init_event()
> >         perf_try_init_event()
> >           x86_pmu_event_init()
> >             hsw_hw_config()
> >               intel_pmu_hw_config()
> >                 for_each_sibling_event()
> >                   lockdep_assert_event_ctx()
> >
> > We don't need for_each_sibling_event() when it's a standalone event.
> > Let's return the error code directly.
> >
> > Fixes: f3c0eba28704 ("perf: Add a few assertions")
> > Reported-by: Greg Thelen <gthelen@google.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  arch/x86/events/intel/core.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> > index 0d09245aa8df..933fe4894c32 100644
> > --- a/arch/x86/events/intel/core.c
> > +++ b/arch/x86/events/intel/core.c
> > @@ -3983,6 +3983,14 @@ static int intel_pmu_hw_config(struct perf_event *event)
> >               struct perf_event *leader = event->group_leader;
> >               struct perf_event *sibling = NULL;
> >
> > +             /*
> > +              * The event is not fully initialized yet and no ctx is set
> > +              * for the event.  Avoid for_each_sibling_event() since it
> > +              * has a lockdep assert with leader->ctx->mutex.
> > +              */
>
> If I understand things correctly, your patch is indeed correct, however
> I don't much like this comment, does something like:
>
>                 /*
>                  * When this memload event is also the first event (no
>                  * group exists yet), then there is no aux event before
>                  * it.
>                  */
>
> work for you?

Yep, looks good.  Do you want me to resend?

Thanks,
Namhyung


>
> > +             if (leader == event)
> > +                     return -ENODATA;
> > +
> >               if (!is_mem_loads_aux_event(leader)) {
> >                       for_each_sibling_event(sibling, leader) {
> >                               if (is_mem_loads_aux_event(sibling))
> > --
> > 2.41.0.255.g8b1d071c50-goog
> >
  
Peter Zijlstra July 6, 2023, 7:29 a.m. UTC | #3
On Wed, Jul 05, 2023 at 08:11:53AM -0700, Namhyung Kim wrote:
> Yep, looks good.  Do you want me to resend?

Nah, I've got it. Thanks!
  
Namhyung Kim July 7, 2023, 8:35 p.m. UTC | #4
On Thu, Jul 6, 2023 at 12:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jul 05, 2023 at 08:11:53AM -0700, Namhyung Kim wrote:
> > Yep, looks good.  Do you want me to resend?
>
> Nah, I've got it. Thanks!

Thanks Peter!
  

Patch

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 0d09245aa8df..933fe4894c32 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3983,6 +3983,14 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		struct perf_event *leader = event->group_leader;
 		struct perf_event *sibling = NULL;
 
+		/*
+		 * The event is not fully initialized yet and no ctx is set
+		 * for the event.  Avoid for_each_sibling_event() since it
+		 * has a lockdep assert with leader->ctx->mutex.
+		 */
+		if (leader == event)
+			return -ENODATA;
+
 		if (!is_mem_loads_aux_event(leader)) {
 			for_each_sibling_event(sibling, leader) {
 				if (is_mem_loads_aux_event(sibling))