[RFC,v2,16/20] rcu: Make RCU dynticks counter size configurable

Message ID 20230720163056.2564824-17-vschneid@redhat.com
State New
Headers
Series context_tracking,x86: Defer some IPIs until a user->kernel transition |

Commit Message

Valentin Schneider July 20, 2023, 4:30 p.m. UTC
  CONTEXT_TRACKING_WORK reduces the size of the dynticks counter to free up
some bits for work deferral. Paul suggested making the actual counter size
configurable for rcutorture to poke at, so do that.

Make it only configurable under RCU_EXPERT. Previous commits have added
build-time checks that ensure a kernel with problematic dynticks counter
width can't be built.

Link: http://lore.kernel.org/r/4c2cb573-168f-4806-b1d9-164e8276e66a@paulmck-laptop
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 include/linux/context_tracking.h       |  3 ++-
 include/linux/context_tracking_state.h |  3 +--
 kernel/rcu/Kconfig                     | 33 ++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 3 deletions(-)
  

Comments

Valentin Schneider July 21, 2023, 8:17 a.m. UTC | #1
On 20/07/23 17:30, Valentin Schneider wrote:
> index bdd7eadb33d8f..1ff2aab24e964 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
>         Say Y here if you need tighter callback-limit enforcement.
>         Say N here if you are unsure.
>
> +config RCU_DYNTICKS_RANGE_BEGIN
> +	int
> +	depends on !RCU_EXPERT
> +	default 31 if !CONTEXT_TRACKING_WORK

You'll note that this should be 30 really, because the lower *2* bits are
taken by the context state (CONTEXT_GUEST has a value of 3).

This highlights the fragile part of this: the Kconfig values are hardcoded,
but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
static_assert() will at least capture any misconfiguration, but having that
enforced by the actual Kconfig ranges would be less awkward.

Do we currently have a way of e.g. making a Kconfig file depend on and use
values generated by a C header?
  
Paul E. McKenney July 21, 2023, 2:10 p.m. UTC | #2
On Fri, Jul 21, 2023 at 09:17:53AM +0100, Valentin Schneider wrote:
> On 20/07/23 17:30, Valentin Schneider wrote:
> > index bdd7eadb33d8f..1ff2aab24e964 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
> >         Say Y here if you need tighter callback-limit enforcement.
> >         Say N here if you are unsure.
> >
> > +config RCU_DYNTICKS_RANGE_BEGIN
> > +	int
> > +	depends on !RCU_EXPERT
> > +	default 31 if !CONTEXT_TRACKING_WORK
> 
> You'll note that this should be 30 really, because the lower *2* bits are
> taken by the context state (CONTEXT_GUEST has a value of 3).
> 
> This highlights the fragile part of this: the Kconfig values are hardcoded,
> but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
> static_assert() will at least capture any misconfiguration, but having that
> enforced by the actual Kconfig ranges would be less awkward.
> 
> Do we currently have a way of e.g. making a Kconfig file depend on and use
> values generated by a C header?

Why not just have something like a boolean RCU_DYNTICKS_TORTURE Kconfig
option and let the C code work out what the number of bits should be?

I suppose that there might be a failure whose frequency depended on
the number of bits, which might be an argument for keeping something
like RCU_DYNTICKS_RANGE_BEGIN for fault isolation.  But still using
RCU_DYNTICKS_TORTURE for normal testing.

Thoughts?

							Thanx, Paul
  
Valentin Schneider July 21, 2023, 3:08 p.m. UTC | #3
On 21/07/23 07:10, Paul E. McKenney wrote:
> On Fri, Jul 21, 2023 at 09:17:53AM +0100, Valentin Schneider wrote:
>> On 20/07/23 17:30, Valentin Schneider wrote:
>> > index bdd7eadb33d8f..1ff2aab24e964 100644
>> > --- a/kernel/rcu/Kconfig
>> > +++ b/kernel/rcu/Kconfig
>> > @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
>> >         Say Y here if you need tighter callback-limit enforcement.
>> >         Say N here if you are unsure.
>> >
>> > +config RCU_DYNTICKS_RANGE_BEGIN
>> > +	int
>> > +	depends on !RCU_EXPERT
>> > +	default 31 if !CONTEXT_TRACKING_WORK
>>
>> You'll note that this should be 30 really, because the lower *2* bits are
>> taken by the context state (CONTEXT_GUEST has a value of 3).
>>
>> This highlights the fragile part of this: the Kconfig values are hardcoded,
>> but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
>> static_assert() will at least capture any misconfiguration, but having that
>> enforced by the actual Kconfig ranges would be less awkward.
>>
>> Do we currently have a way of e.g. making a Kconfig file depend on and use
>> values generated by a C header?
>
> Why not just have something like a boolean RCU_DYNTICKS_TORTURE Kconfig
> option and let the C code work out what the number of bits should be?
>
> I suppose that there might be a failure whose frequency depended on
> the number of bits, which might be an argument for keeping something
> like RCU_DYNTICKS_RANGE_BEGIN for fault isolation.  But still using
> RCU_DYNTICKS_TORTURE for normal testing.
>
> Thoughts?
>

AFAICT if we run tests with the minimum possible width, then intermediate
values shouldn't have much value.

Your RCU_DYNTICKS_TORTURE suggestion sounds like a saner option than what I
came up with, as we can let the context tracking code figure out the widths
itself and not expose any of that to Kconfig.

>                                                       Thanx, Paul
  
Paul E. McKenney July 21, 2023, 4:09 p.m. UTC | #4
On Fri, Jul 21, 2023 at 04:08:10PM +0100, Valentin Schneider wrote:
> On 21/07/23 07:10, Paul E. McKenney wrote:
> > On Fri, Jul 21, 2023 at 09:17:53AM +0100, Valentin Schneider wrote:
> >> On 20/07/23 17:30, Valentin Schneider wrote:
> >> > index bdd7eadb33d8f..1ff2aab24e964 100644
> >> > --- a/kernel/rcu/Kconfig
> >> > +++ b/kernel/rcu/Kconfig
> >> > @@ -332,4 +332,37 @@ config RCU_DOUBLE_CHECK_CB_TIME
> >> >         Say Y here if you need tighter callback-limit enforcement.
> >> >         Say N here if you are unsure.
> >> >
> >> > +config RCU_DYNTICKS_RANGE_BEGIN
> >> > +	int
> >> > +	depends on !RCU_EXPERT
> >> > +	default 31 if !CONTEXT_TRACKING_WORK
> >>
> >> You'll note that this should be 30 really, because the lower *2* bits are
> >> taken by the context state (CONTEXT_GUEST has a value of 3).
> >>
> >> This highlights the fragile part of this: the Kconfig values are hardcoded,
> >> but they depend on CT_STATE_SIZE, CONTEXT_MASK and CONTEXT_WORK_MAX. The
> >> static_assert() will at least capture any misconfiguration, but having that
> >> enforced by the actual Kconfig ranges would be less awkward.
> >>
> >> Do we currently have a way of e.g. making a Kconfig file depend on and use
> >> values generated by a C header?
> >
> > Why not just have something like a boolean RCU_DYNTICKS_TORTURE Kconfig
> > option and let the C code work out what the number of bits should be?
> >
> > I suppose that there might be a failure whose frequency depended on
> > the number of bits, which might be an argument for keeping something
> > like RCU_DYNTICKS_RANGE_BEGIN for fault isolation.  But still using
> > RCU_DYNTICKS_TORTURE for normal testing.
> >
> > Thoughts?
> >
> 
> AFAICT if we run tests with the minimum possible width, then intermediate
> values shouldn't have much value.
> 
> Your RCU_DYNTICKS_TORTURE suggestion sounds like a saner option than what I
> came up with, as we can let the context tracking code figure out the widths
> itself and not expose any of that to Kconfig.

Agreed.  If a need for variable numbers of bits ever does arise, we can
worry about it at that time.  And then we would have more information
on what a variable-bit facility should look like.

							Thanx, Paul
  

Patch

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 8aee086d0a25f..9c0c622bc27bb 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -12,7 +12,8 @@ 
 
 #ifdef CONFIG_CONTEXT_TRACKING_WORK
 static_assert(CONTEXT_WORK_MAX_OFFSET <= CONTEXT_WORK_END + 1 - CONTEXT_WORK_START,
-	      "Not enough bits for CONTEXT_WORK");
+	      "Not enough bits for CONTEXT_WORK, "
+	      "CONFIG_RCU_DYNTICKS_BITS might be too high");
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 828fcdb801f73..292a0b7c06948 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -58,8 +58,7 @@  enum ctx_state {
 #define CONTEXT_STATE_START 0
 #define CONTEXT_STATE_END   (bits_per(CONTEXT_MAX - 1) - 1)
 
-#define RCU_DYNTICKS_BITS  (IS_ENABLED(CONFIG_CONTEXT_TRACKING_WORK) ? 16 : 31)
-#define RCU_DYNTICKS_START (CT_STATE_SIZE - RCU_DYNTICKS_BITS)
+#define RCU_DYNTICKS_START (CT_STATE_SIZE - CONFIG_RCU_DYNTICKS_BITS)
 #define RCU_DYNTICKS_END   (CT_STATE_SIZE - 1)
 #define RCU_DYNTICKS_IDX   BIT(RCU_DYNTICKS_START)
 
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index bdd7eadb33d8f..1ff2aab24e964 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -332,4 +332,37 @@  config RCU_DOUBLE_CHECK_CB_TIME
 	  Say Y here if you need tighter callback-limit enforcement.
 	  Say N here if you are unsure.
 
+config RCU_DYNTICKS_RANGE_BEGIN
+	int
+	depends on !RCU_EXPERT
+	default 31 if !CONTEXT_TRACKING_WORK
+	default 16 if CONTEXT_TRACKING_WORK
+
+config RCU_DYNTICKS_RANGE_BEGIN
+	int
+	depends on RCU_EXPERT
+	default 2
+
+config RCU_DYNTICKS_RANGE_END
+	int
+	default 31 if !CONTEXT_TRACKING_WORK
+	default 16 if CONTEXT_TRACKING_WORK
+
+config RCU_DYNTICKS_BITS_DEFAULT
+       int
+       default 31 if !CONTEXT_TRACKING_WORK
+       default 16 if CONTEXT_TRACKING_WORK
+
+config RCU_DYNTICKS_BITS
+	int "Dynticks counter width" if CONTEXT_TRACKING_WORK
+	range RCU_DYNTICKS_RANGE_BEGIN RCU_DYNTICKS_RANGE_END
+	default RCU_DYNTICKS_BITS_DEFAULT
+	help
+	  This option controls the width of the dynticks counter.
+
+	  Lower values will make overflows more frequent, which will increase
+	  the likelihood of extending grace-periods.
+
+	  Don't touch this unless you are running some tests.
+
 endmenu # "RCU Subsystem"