diff mbox series

[v2] srcu: Improve comments about acceleration leak

Message ID	20231211015717.1067822-1-joel@joelfernandes.org
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; From: "Joel Fernandes (Google)" <joel@joelfernandes.org> To: linux-kernel@vger.kernel.org, Lai Jiangshan <jiangshanlai@gmail.com>, "Paul E. McKenney" <paulmck@kernel.org>, Josh Triplett <josh@joshtriplett.org>, Steven Rostedt <rostedt@goodmis.org>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>, Frederic Weisbecker <frederic@kernel.org>, Neeraj Upadhyay <neeraj.iitr10@gmail.com>, rcu@vger.kernel.org Subject: [PATCH v2] srcu: Improve comments about acceleration leak Date: Mon, 11 Dec 2023 01:57:16 +0000 Message-ID: <20231211015717.1067822-1-joel@joelfernandes.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	[v2] srcu: Improve comments about acceleration leak \| [v2] srcu: Improve comments about acceleration leak

Commit Message

Joel Fernandes Dec. 11, 2023, 1:57 a.m. UTC

  The comments added in commit 1ef990c4b36b ("srcu: No need to
advance/accelerate if no callback enqueued") are a bit confusing to me.
The comments are describing a scenario for code that was moved and is
no longer the way it was (snapshot after advancing). Improve the code
comments to reflect this and also document by acceleration can never
fail.

Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
v1->v2: Fix typo in change log.

 kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

Comments

Joel Fernandes Dec. 12, 2023, 8:11 p.m. UTC | #1

> On Dec 10, 2023, at 8:57 PM, Joel Fernandes (Google) <joel@joelfernandes.org> wrote:
> 
> The comments added in commit 1ef990c4b36b ("srcu: No need to
> advance/accelerate if no callback enqueued") are a bit confusing to me.
> The comments are describing a scenario for code that was moved and is
> no longer the way it was (snapshot after advancing). Improve the code
> comments to reflect this and also document by acceleration can never
> fail.
> 
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Do we want to quick review and put it in Neeraj PR?

Or next merge window ok with me. Just that then I have to keep track of it ;-)

Thanks,

- Joel 



> ---
> v1->v2: Fix typo in change log.
> 
> kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0351a4e83529..051e149490d1 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
>    if (rhp)
>        rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
>    /*
> -     * The snapshot for acceleration must be taken _before_ the read of the
> -     * current gp sequence used for advancing, otherwise advancing may fail
> -     * and acceleration may then fail too.
> +     * It's crucial to capture the snapshot 's' for acceleration before
> +     * reading the current gp_seq that is used for advancing. This is
> +     * essential because if the acceleration snapshot is taken after a
> +     * failed advancement attempt, there's a risk that a grace period may
> +     * conclude and a new one may start in the interim. If the snapshot is
> +     * captured after this sequence of events, the acceleration snapshot 's'
> +     * could be excessively advanced, leading to acceleration failure.
> +     * In such a scenario, an 'acceleration leak' can occur, where new
> +     * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> +     * Also note that encountering advancing failures is a normal
> +     * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
>     *
> -     * This could happen if:
> +     * To see this, consider the following events which occur if
> +     * rcu_seq_snap() were to be called after advance:
>     *
>     *  1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
>     *     RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
>    if (rhp) {
>        rcu_segcblist_advance(&sdp->srcu_cblist,
>                      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> +        /*
> +         * Acceleration can never fail because the state of gp_seq used
> +         * for advancing is <= the state of gp_seq used for
> +         * acceleration. This means that RCU_NEXT_TAIL segment will
> +         * always be able to be emptied by the acceleration into the
> +         * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
> +         */
>        WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
>    }
>    if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> -- 
> 2.43.0.472.g3155946c3a-goog
>

Paul E. McKenney Dec. 13, 2023, 6:10 p.m. UTC | #2

On Tue, Dec 12, 2023 at 03:11:30PM -0500, Joel Fernandes wrote:
> 
> 
> > On Dec 10, 2023, at 8:57 PM, Joel Fernandes (Google) <joel@joelfernandes.org> wrote:
> > 
> > The comments added in commit 1ef990c4b36b ("srcu: No need to
> > advance/accelerate if no callback enqueued") are a bit confusing to me.
> > The comments are describing a scenario for code that was moved and is
> > no longer the way it was (snapshot after advancing). Improve the code
> > comments to reflect this and also document by acceleration can never
> > fail.
> > 
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> Do we want to quick review and put it in Neeraj PR?
> 
> Or next merge window ok with me. Just that then I have to keep track of it ;-)

Or it could get an ack and I could pull it into -rcu.  ;-)

							Thanx, Paul

> Thanks,
> 
> - Joel 
> 
> 
> 
> > ---
> > v1->v2: Fix typo in change log.
> > 
> > kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
> > 1 file changed, 20 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 0351a4e83529..051e149490d1 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >    if (rhp)
> >        rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> >    /*
> > -     * The snapshot for acceleration must be taken _before_ the read of the
> > -     * current gp sequence used for advancing, otherwise advancing may fail
> > -     * and acceleration may then fail too.
> > +     * It's crucial to capture the snapshot 's' for acceleration before
> > +     * reading the current gp_seq that is used for advancing. This is
> > +     * essential because if the acceleration snapshot is taken after a
> > +     * failed advancement attempt, there's a risk that a grace period may
> > +     * conclude and a new one may start in the interim. If the snapshot is
> > +     * captured after this sequence of events, the acceleration snapshot 's'
> > +     * could be excessively advanced, leading to acceleration failure.
> > +     * In such a scenario, an 'acceleration leak' can occur, where new
> > +     * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> > +     * Also note that encountering advancing failures is a normal
> > +     * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
> >     *
> > -     * This could happen if:
> > +     * To see this, consider the following events which occur if
> > +     * rcu_seq_snap() were to be called after advance:
> >     *
> >     *  1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
> >     *     RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> > @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >    if (rhp) {
> >        rcu_segcblist_advance(&sdp->srcu_cblist,
> >                      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > +        /*
> > +         * Acceleration can never fail because the state of gp_seq used
> > +         * for advancing is <= the state of gp_seq used for
> > +         * acceleration. This means that RCU_NEXT_TAIL segment will
> > +         * always be able to be emptied by the acceleration into the
> > +         * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
> > +         */
> >        WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
> >    }
> >    if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> > -- 
> > 2.43.0.472.g3155946c3a-goog
> >

Frederic Weisbecker Dec. 16, 2023, 9:17 p.m. UTC | #3

Le Mon, Dec 11, 2023 at 01:57:16AM +0000, Joel Fernandes (Google) a écrit :
> The comments added in commit 1ef990c4b36b ("srcu: No need to
> advance/accelerate if no callback enqueued") are a bit confusing to me.

I know some maintainers who may argue that in the changelog world, the first
person doesn't exist :-)

> The comments are describing a scenario for code that was moved and is
> no longer the way it was (snapshot after advancing). Improve the code
> comments to reflect this and also document by acceleration can never

s/by/why

> fail.
> 
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
> v1->v2: Fix typo in change log.
> 
>  kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0351a4e83529..051e149490d1 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
>  	if (rhp)
>  		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
>  	/*
> -	 * The snapshot for acceleration must be taken _before_ the read of the
> -	 * current gp sequence used for advancing, otherwise advancing may fail
> -	 * and acceleration may then fail too.
> +	 * It's crucial to capture the snapshot 's' for acceleration before
> +	 * reading the current gp_seq that is used for advancing. This is
> +	 * essential because if the acceleration snapshot is taken after a
> +	 * failed advancement attempt, there's a risk that a grace period may
> +	 * conclude and a new one may start in the interim. If the snapshot is
> +	 * captured after this sequence of events, the acceleration snapshot 's'
> +	 * could be excessively advanced, leading to acceleration failure.
> +	 * In such a scenario, an 'acceleration leak' can occur, where new
> +	 * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> +	 * Also note that encountering advancing failures is a normal
> +	 * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
>  	 *
> -	 * This could happen if:
> +	 * To see this, consider the following events which occur if
> +	 * rcu_seq_snap() were to be called after advance:
>  	 *
>  	 *  1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
>  	 *     RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
>  	if (rhp) {
>  		rcu_segcblist_advance(&sdp->srcu_cblist,
>  				      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> +		/*
> +		 * Acceleration can never fail because the state of gp_seq used
> +		 * for advancing is <= the state of gp_seq used for
> +		 * acceleration.

What do you mean by "state" here? If it's the gp_seq number, that doesn't look
right. The situation raising the initial bug also involved a gp_seq used for
advancing <= the gp_seq used for acceleration.

Thanks.

> +                This means that RCU_NEXT_TAIL segment will
> +		 * always be able to be emptied by the acceleration into the
> +		 * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
> +		 */
>  		WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
>  	}
>  	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> -- 
> 2.43.0.472.g3155946c3a-goog
>

Joel Fernandes Dec. 18, 2023, 2 a.m. UTC | #4

On Sat, Dec 16, 2023 at 4:17 PM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> Le Mon, Dec 11, 2023 at 01:57:16AM +0000, Joel Fernandes (Google) a écrit :
> > The comments added in commit 1ef990c4b36b ("srcu: No need to
> > advance/accelerate if no callback enqueued") are a bit confusing to me.
>
> I know some maintainers who may argue that in the changelog world, the first
> person doesn't exist :-)

Heh, that's fair. Ok I can drop the 'to me'. ;-)

>
> > The comments are describing a scenario for code that was moved and is
> > no longer the way it was (snapshot after advancing). Improve the code
> > comments to reflect this and also document by acceleration can never
>
> s/by/why

Ok.

> > fail.
> >
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> > v1->v2: Fix typo in change log.
> >
> >  kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
> >  1 file changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 0351a4e83529..051e149490d1 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >       if (rhp)
> >               rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> >       /*
> > -      * The snapshot for acceleration must be taken _before_ the read of the
> > -      * current gp sequence used for advancing, otherwise advancing may fail
> > -      * and acceleration may then fail too.
> > +      * It's crucial to capture the snapshot 's' for acceleration before
> > +      * reading the current gp_seq that is used for advancing. This is
> > +      * essential because if the acceleration snapshot is taken after a
> > +      * failed advancement attempt, there's a risk that a grace period may
> > +      * conclude and a new one may start in the interim. If the snapshot is
> > +      * captured after this sequence of events, the acceleration snapshot 's'
> > +      * could be excessively advanced, leading to acceleration failure.
> > +      * In such a scenario, an 'acceleration leak' can occur, where new
> > +      * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> > +      * Also note that encountering advancing failures is a normal
> > +      * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
> >        *
> > -      * This could happen if:
> > +      * To see this, consider the following events which occur if
> > +      * rcu_seq_snap() were to be called after advance:
> >        *
> >        *  1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
> >        *     RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> > @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >       if (rhp) {
> >               rcu_segcblist_advance(&sdp->srcu_cblist,
> >                                     rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > +             /*
> > +              * Acceleration can never fail because the state of gp_seq used
> > +              * for advancing is <= the state of gp_seq used for
> > +              * acceleration.
>
> What do you mean by "state" here?

State means "value at a certain point in time" here.

> If it's the gp_seq number, that doesn't look right.

Uff, I screwed up the comment. I swapped "acceleration" and
"advancing". I should say:

"Acceleration can never fail because the state of gp_seq value used
for acceleration is <= the state of gp_seq used for advancing."

Does that sound correct now?

> The situation raising the initial bug also involved a gp_seq used for advancing <= the gp_seq used for acceleration.

Right, which I understand is the bug.

thanks,

 - Joel

Frederic Weisbecker Dec. 18, 2023, 12:13 p.m. UTC | #5

Le Sun, Dec 17, 2023 at 09:00:15PM -0500, Joel Fernandes a écrit :
> "Acceleration can never fail because the state of gp_seq value used
> for acceleration is <= the state of gp_seq used for advancing."
> 
> Does that sound correct now?

That can be confusing since acceleration relies on rcu_seq_snap() while
advance relies on rcu_seq_current(). And rcu_seq_snap() returns a snapshot
that may be above the subsequent rcu_seq_current() return value.

So it should rather be something like:

"The base current gp_seq value used to produce the snapshot has to
be <= the gp_seq used for advancing."

Thanks.

Joel Fernandes Dec. 18, 2023, 3:27 p.m. UTC | #6

On Mon, Dec 18, 2023 at 7:13 AM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> Le Sun, Dec 17, 2023 at 09:00:15PM -0500, Joel Fernandes a écrit :
> > "Acceleration can never fail because the state of gp_seq value used
> > for acceleration is <= the state of gp_seq used for advancing."
> >
> > Does that sound correct now?
>
> That can be confusing since acceleration relies on rcu_seq_snap() while
> advance relies on rcu_seq_current(). And rcu_seq_snap() returns a snapshot
> that may be above the subsequent rcu_seq_current() return value.
>
> So it should rather be something like:
>
> "The base current gp_seq value used to produce the snapshot has to
> be <= the gp_seq used for advancing."

Yeah "base current gp_seq" though probably equally confusing sounds a
bit better, so I'll just use that instead of "state of gp_seq".

With that can I add your Review tag?

 - Joel

Frederic Weisbecker Dec. 18, 2023, 9:28 p.m. UTC | #7

Le Mon, Dec 18, 2023 at 10:27:56AM -0500, Joel Fernandes a écrit :
> On Mon, Dec 18, 2023 at 7:13 AM Frederic Weisbecker <frederic@kernel.org> wrote:
> >
> > Le Sun, Dec 17, 2023 at 09:00:15PM -0500, Joel Fernandes a écrit :
> > > "Acceleration can never fail because the state of gp_seq value used
> > > for acceleration is <= the state of gp_seq used for advancing."
> > >
> > > Does that sound correct now?
> >
> > That can be confusing since acceleration relies on rcu_seq_snap() while
> > advance relies on rcu_seq_current(). And rcu_seq_snap() returns a snapshot
> > that may be above the subsequent rcu_seq_current() return value.
> >
> > So it should rather be something like:
> >
> > "The base current gp_seq value used to produce the snapshot has to
> > be <= the gp_seq used for advancing."
> 
> Yeah "base current gp_seq" though probably equally confusing sounds a
> bit better, so I'll just use that instead of "state of gp_seq".
> 
> With that can I add your Review tag?

Sure, sounds good!

Thanks.

> 
>  - Joel
>

diff mbox series

Patch

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 0351a4e83529..051e149490d1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1234,11 +1234,20 @@  static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	if (rhp)
 		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
 	/*
-	 * The snapshot for acceleration must be taken _before_ the read of the
-	 * current gp sequence used for advancing, otherwise advancing may fail
-	 * and acceleration may then fail too.
+	 * It's crucial to capture the snapshot 's' for acceleration before
+	 * reading the current gp_seq that is used for advancing. This is
+	 * essential because if the acceleration snapshot is taken after a
+	 * failed advancement attempt, there's a risk that a grace period may
+	 * conclude and a new one may start in the interim. If the snapshot is
+	 * captured after this sequence of events, the acceleration snapshot 's'
+	 * could be excessively advanced, leading to acceleration failure.
+	 * In such a scenario, an 'acceleration leak' can occur, where new
+	 * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
+	 * Also note that encountering advancing failures is a normal
+	 * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
 	 *
-	 * This could happen if:
+	 * To see this, consider the following events which occur if
+	 * rcu_seq_snap() were to be called after advance:
 	 *
 	 *  1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
 	 *     RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
@@ -1264,6 +1273,13 @@  static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	if (rhp) {
 		rcu_segcblist_advance(&sdp->srcu_cblist,
 				      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+		/*
+		 * Acceleration can never fail because the state of gp_seq used
+		 * for advancing is <= the state of gp_seq used for
+		 * acceleration. This means that RCU_NEXT_TAIL segment will
+		 * always be able to be emptied by the acceleration into the
+		 * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
+		 */
 		WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
 	}
 	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {