[0/1] Slightly relax the type checking done by min() and max().

Message ID	cfc6c0f0fd4c4724890be8a8397c2cbe@AcuMS.aculab.com
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; From: David Laight <David.Laight@ACULAB.COM> To: LKML <linux-kernel@vger.kernel.org>, Andy Shevchenko <andriy.shevchenko@linux.intel.com>, Andrew Morton <akpm@linux-foundation.org> CC: Steven Rostedt <rostedt@goodmis.org>, 'Joe Perches' <joe@perches.com>, Linus Torvalds <torvalds@linux-foundation.org> Subject: [PATCH 0/1] Slightly relax the type checking done by min() and max(). Thread-Topic: [PATCH 0/1] Slightly relax the type checking done by min() and max(). Thread-Index: AdkA3hAUrkOWi1xYRg2SAA6P778SAw== Date: Fri, 25 Nov 2022 15:00:40 +0000 Message-ID: <cfc6c0f0fd4c4724890be8a8397c2cbe@AcuMS.aculab.com> Accept-Language: en-GB, en-US MIME-Version: 1.0 Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Precedence: bulk
Series	Slightly relax the type checking done by min() and max(). \| [0/1] Slightly relax the type checking done by min() and max(). [1/1] minmax.h: Slightly relax the type checking done by min() and max().

Message ID

cfc6c0f0fd4c4724890be8a8397c2cbe@AcuMS.aculab.com

Headers

Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org
 designates 2620:137:e000::1:20 as permitted sender)
 client-ip=2620:137:e000::1:20;
From: David Laight <David.Laight@ACULAB.COM>
To: LKML <linux-kernel@vger.kernel.org>,
        Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>,
        'Joe Perches' <joe@perches.com>,
        Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 0/1] Slightly relax the type checking done by min() and max().
Thread-Topic: [PATCH 0/1] Slightly relax the type checking done by min() and
 max().
Thread-Index: AdkA3hAUrkOWi1xYRg2SAA6P778SAw==
Date: Fri, 25 Nov 2022 15:00:40 +0000
Message-ID: <cfc6c0f0fd4c4724890be8a8397c2cbe@AcuMS.aculab.com>
Accept-Language: en-GB, en-US
MIME-Version: 1.0
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Precedence: bulk

Series

Slightly relax the type checking done by min() and max(). |

Message

David Laight Nov. 25, 2022, 3 p.m. UTC

  The min() and max() defines include a type check to avoid the unexpected
  behaviour when a negative value is compared against and unsigned value.
However a lot of code hits this check and uses min_t() to avoid the error.
Many of these are just plain wrong.

Those casting to u8 or u16 are particularly suspect, eg:
drivers/usb/misc/usb251xb.c:528:
		hub->max_current_sp = min_t(u8, property_u32 / 2000, 50);

This patch does two changes:
- Replace typeof(x) with typeof((x) + 0) to promote char/short to int.
- Add an (int) cast to constants between 0 and MAX_INT so the compiler
  doesn't promote the 'other side' of the comparison to an unsinged type.
  If this is done the type test is arranged to always succeed.

The following can also be done (with some lateral thought):
- Allow all comparisons where both types are signed. 
- Allow all comparisons where both types are unsigned. 
- Allow comparisons where the larger type is signed.

In addition most of the min_t() calls are there to compare a signed type
(that holds a non-negative value) with an unsigned value.
The definition:
#define min_unsigned(x,y) min((x) + 0u + 0ull, (y) + 0u + 0ull)
will do an unsigned comparision without 'accidentally' masking off
any non-zero high bits.

With those extra changes there can be a 'duck shoot' on min_t().

David Laight (1):
  Slightly relax the type checking done by min() and max().

 include/linux/minmax.h | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

Comments

Andy Shevchenko Nov. 25, 2022, 3:17 p.m. UTC | #1

On Fri, Nov 25, 2022 at 03:00:40PM +0000, David Laight wrote:
> The min() and max() defines include a type check to avoid the unexpected
>   behaviour when a negative value is compared against and unsigned value.
> However a lot of code hits this check and uses min_t() to avoid the error.
> Many of these are just plain wrong.
> 
> Those casting to u8 or u16 are particularly suspect, eg:
> drivers/usb/misc/usb251xb.c:528:
> 		hub->max_current_sp = min_t(u8, property_u32 / 2000, 50);
> 
> This patch does two changes:
> - Replace typeof(x) with typeof((x) + 0) to promote char/short to int.
> - Add an (int) cast to constants between 0 and MAX_INT so the compiler
>   doesn't promote the 'other side' of the comparison to an unsinged type.
>   If this is done the type test is arranged to always succeed.
> 
> The following can also be done (with some lateral thought):
> - Allow all comparisons where both types are signed. 
> - Allow all comparisons where both types are unsigned. 
> - Allow comparisons where the larger type is signed.
> 
> In addition most of the min_t() calls are there to compare a signed type
> (that holds a non-negative value) with an unsigned value.
> The definition:
> #define min_unsigned(x,y) min((x) + 0u + 0ull, (y) + 0u + 0ull)
> will do an unsigned comparision without 'accidentally' masking off
> any non-zero high bits.
> 
> With those extra changes there can be a 'duck shoot' on min_t().

You have an issue with your email setup, i.e. you send two independent messages
(not a chain). It probably shows that either you don't use `git send-email` for
sending patch, or you missed --thread option to it.

Andy Shevchenko Nov. 25, 2022, 3:20 p.m. UTC | #2

On Fri, Nov 25, 2022 at 03:00:40PM +0000, David Laight wrote:
> The min() and max() defines include a type check to avoid the unexpected
>   behaviour when a negative value is compared against and unsigned value.
> However a lot of code hits this check and uses min_t() to avoid the error.
> Many of these are just plain wrong.
> 
> Those casting to u8 or u16 are particularly suspect, eg:
> drivers/usb/misc/usb251xb.c:528:
> 		hub->max_current_sp = min_t(u8, property_u32 / 2000, 50);

I don't buy this. What's exactly wrong with this code?

David Laight Nov. 25, 2022, 3:22 p.m. UTC | #3

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Sent: 25 November 2022 15:18
...
> You have an issue with your email setup, i.e. you send two independent messages
> (not a chain). It probably shows that either you don't use `git send-email` for
> sending patch, or you missed --thread option to it.

It is technically impossible for me to send (or view) threaded emails.
I have to send them using outlook.
The only way to avoid mangling the text is to cut&paste from wordpad.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

David Laight Nov. 25, 2022, 3:27 p.m. UTC | #4

From: Andy Shevchenko
> Sent: 25 November 2022 15:21
> 
> On Fri, Nov 25, 2022 at 03:00:40PM +0000, David Laight wrote:
> > The min() and max() defines include a type check to avoid the unexpected
> >   behaviour when a negative value is compared against and unsigned value.
> > However a lot of code hits this check and uses min_t() to avoid the error.
> > Many of these are just plain wrong.
> >
> > Those casting to u8 or u16 are particularly suspect, eg:
> > drivers/usb/misc/usb251xb.c:528:
> > 		hub->max_current_sp = min_t(u8, property_u32 / 2000, 50);
> 
> I don't buy this. What's exactly wrong with this code?

Consider what happens if propery_u32 is 512000.
The returned value is 0 not 50.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Andy Shevchenko Nov. 25, 2022, 3:58 p.m. UTC | #5

On Fri, Nov 25, 2022 at 03:27:07PM +0000, David Laight wrote:
> From: Andy Shevchenko
> > Sent: 25 November 2022 15:21
> > On Fri, Nov 25, 2022 at 03:00:40PM +0000, David Laight wrote:
> > > The min() and max() defines include a type check to avoid the unexpected
> > >   behaviour when a negative value is compared against and unsigned value.
> > > However a lot of code hits this check and uses min_t() to avoid the error.
> > > Many of these are just plain wrong.
> > >
> > > Those casting to u8 or u16 are particularly suspect, eg:
> > > drivers/usb/misc/usb251xb.c:528:
> > > 		hub->max_current_sp = min_t(u8, property_u32 / 2000, 50);
> > 
> > I don't buy this. What's exactly wrong with this code?
> 
> Consider what happens if propery_u32 is 512000.
> The returned value is 0 not 50.

I considered that and there are two things to consider on your side:
1) it's coming from device property;
2) device property is validated using YAML schema.

On top of that, the wrong property is on the user. We have a lot of stuff that
user may put wrongly, but it's user's choice.

Any better example, please?

David Laight Nov. 25, 2022, 4:14 p.m. UTC | #6

From: 'Andy Shevchenko'
> Sent: 25 November 2022 15:58
> 
> On Fri, Nov 25, 2022 at 03:27:07PM +0000, David Laight wrote:
> > From: Andy Shevchenko
> > > Sent: 25 November 2022 15:21
> > > On Fri, Nov 25, 2022 at 03:00:40PM +0000, David Laight wrote:
> > > > The min() and max() defines include a type check to avoid the unexpected
> > > >   behaviour when a negative value is compared against and unsigned value.
> > > > However a lot of code hits this check and uses min_t() to avoid the error.
> > > > Many of these are just plain wrong.
> > > >
> > > > Those casting to u8 or u16 are particularly suspect, eg:
> > > > drivers/usb/misc/usb251xb.c:528:
> > > > 		hub->max_current_sp = min_t(u8, property_u32 / 2000, 50);
> > >
> > > I don't buy this. What's exactly wrong with this code?
> >
> > Consider what happens if propery_u32 is 512000.
> > The returned value is 0 not 50.
> 
> I considered that and there are two things to consider on your side:
> 1) it's coming from device property;
> 2) device property is validated using YAML schema.
> 
> On top of that, the wrong property is on the user. We have a lot of stuff that
> user may put wrongly, but it's user's choice.
> 
> Any better example, please?

How about:

data_size = min_t(u16, buf_size, len);

https://elixir.bootlin.com/linux/v6.1-rc6/source/kernel/printk/printk_ringbuffer.c#L1738


Now, maybe, you could claim that buf_size > 64k never happens.
But the correct cast here is u32 to match buf_size.
len (being u16) will be promoted to int before the compare.

Just search the kernel for "min_t(u8," or "min_t(u16," while some might
be ok, I really wouldn't want to verify each case.

If you look hard enough there are also some:
	u32_var = min_t(u32, u32_val, u64_val);
where the intent is to limit values that might be invalid for u32.

I did try compiling a kernel with min_t() defined to be min() (no casts)
but there were too many false positives without allowing all
unsigned v unsigned compares.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Andy Shevchenko Nov. 25, 2022, 5:48 p.m. UTC | #7

On Fri, Nov 25, 2022 at 04:14:58PM +0000, David Laight wrote:
> From: 'Andy Shevchenko'
> > Sent: 25 November 2022 15:58
> > On Fri, Nov 25, 2022 at 03:27:07PM +0000, David Laight wrote:

...

> > Any better example, please?
> 
> How about:

Better, indeed.

> data_size = min_t(u16, buf_size, len);
> 
> https://elixir.bootlin.com/linux/v6.1-rc6/source/kernel/printk/printk_ringbuffer.c#L1738
> 
> Now, maybe, you could claim that buf_size > 64k never happens.
> But the correct cast here is u32 to match buf_size.
> len (being u16) will be promoted to int before the compare.
> 
> Just search the kernel for "min_t(u8," or "min_t(u16," while some might
> be ok, I really wouldn't want to verify each case.
> 
> If you look hard enough there are also some:
> 	u32_var = min_t(u32, u32_val, u64_val);
> where the intent is to limit values that might be invalid for u32.

Wouldn't be better to actually issue a warning if the desired type is shorter
than one of the min_t() arguments?

Then you go through all cases and fix them accordingly.

Blindly relaxing the rules is not an option in my opinion.

David Laight Nov. 25, 2022, 7:47 p.m. UTC | #8

From: 'Andy Shevchenko'
> Sent: 25 November 2022 17:48
> 
> On Fri, Nov 25, 2022 at 04:14:58PM +0000, David Laight wrote:
> > From: 'Andy Shevchenko'
> > > Sent: 25 November 2022 15:58
> > > On Fri, Nov 25, 2022 at 03:27:07PM +0000, David Laight wrote:
> 
> ...
> 
> > > Any better example, please?
> >
> > How about:
> 
> Better, indeed.
> 
> > data_size = min_t(u16, buf_size, len);
> >
> > https://elixir.bootlin.com/linux/v6.1-rc6/source/kernel/printk/printk_ringbuffer.c#L1738
> >
> > Now, maybe, you could claim that buf_size > 64k never happens.
> > But the correct cast here is u32 to match buf_size.
> > len (being u16) will be promoted to int before the compare.
> >
> > Just search the kernel for "min_t(u8," or "min_t(u16," while some might
> > be ok, I really wouldn't want to verify each case.
> >
> > If you look hard enough there are also some:
> > 	u32_var = min_t(u32, u32_val, u64_val);
> > where the intent is to limit values that might be invalid for u32.
> 
> Wouldn't be better to actually issue a warning if the desired type is shorter
> than one of the min_t() arguments?
> 
> Then you go through all cases and fix them accordingly.

That is an option, but that is separate from this change.

> Blindly relaxing the rules is not an option in my opinion.

The point is I'm not really relaxing the rules.
If anything I'm actually tightening them by allowing min() to
be used in more cases - so letting the buggy min_t() casts be
removed at some point in the future.

The purpose of the type check is to error code like:
	int x = -4;
	x = min(x, 100u);
where integer promotion changes the comparison 'x < 100u' to
the unexpected '(unsigned int)x < 100u' so x is set to 100.

However is also errors the opposite:
	unsigned int x = 4;
	x = min(x, 100);
But, in this case, the compiler just converts 100 to 100u and
everything is fine.

It also errors the perfectly valid (and reasonable looking):
	unsigned char x = 4;
	x = min(x, 100u);
because 100u is 'unsigned int'.
Indeed, 'x' gets converted to 'signed int' for the comparison.

When the (usually) RHS is a compile-time constant that is known
to be positive (and less than 2^31) I'm just changing the test to:
	if (variable < (int)constant)
If 'variable' is a signed type this always generates the
required signed compare.
If 'variable is 'unsigned char' or 'unsigned short' then it
gets promoted to signed int and a signed compare of positive
values is done.
If variable is 'unsigned int', 'unsigned long' or 'unsigned long long'
then the RHS is converted to unsigned - but keeps its value since
it is known to be positive.
In all cases the outcome is exactly what is required.
No change of putting in the wrong cast.

There are a few other common cases that are valid but currently
generate an error:
	min(s8_var, int_expr)
	min(u8_var, int_expr)
	min(u8_var, unsigned_int_expr)
These generate an error regardless of whether the sizes match:
	min(int_expr, long_expr)
	min(unsigned_int_expr, unsigned_long_expr)
	min(unsigned_long_expr, unsigned_long_long_expr)
This is less common but should also be allowed:
	min(long_long_int_expr, unsigned_int_expr)
(ie when the signed type is larger than the unsigned one)

I have a plan for how to allow all those as well.

Checking typeof((x) + 0) against typeof((y) + 0) allows for the
promotion of char/short to signed int - but not any of the
other comparisons that never implicitly convert a signed value
to unsigned (and hence negative values to large positive ones).

	David

	

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Andrew Morton Nov. 26, 2022, 12:02 a.m. UTC | #9

On Fri, 25 Nov 2022 19:47:14 +0000 David Laight <David.Laight@ACULAB.COM> wrote:

> > Blindly relaxing the rules is not an option in my opinion.
> 
> The point is I'm not really relaxing the rules.
> If anything I'm actually tightening them by allowing min() to
> be used in more cases - so letting the buggy min_t() casts be
> removed at some point in the future.

That sounds very virtuous.

It would be helpful to see a few "convert min_t to min" patches
to see this proposal in action.

David Laight Nov. 26, 2022, 10:21 a.m. UTC | #10

From: Andrew Morton
> Sent: 26 November 2022 00:02
> 
> On Fri, 25 Nov 2022 19:47:14 +0000 David Laight <David.Laight@ACULAB.COM> wrote:
> 
> > > Blindly relaxing the rules is not an option in my opinion.
> >
> > The point is I'm not really relaxing the rules.
> > If anything I'm actually tightening them by allowing min() to
> > be used in more cases - so letting the buggy min_t() casts be
> > removed at some point in the future.
> 
> That sounds very virtuous.

Indeed.

> It would be helpful to see a few "convert min_t to min" patches
> to see this proposal in action.

I did try building a kernel with '#define min_t min' and then
picking up the pieces.
Doable but there are still too many false positives.
That is when I found some of the 'broken' uses of min_t().

I'd rather get this patch accepted and then build on top of it.
Then it should be possible to work out which min_t() actually
need to stay - I suspect very few.

It might even be worth converting min_t(type,x,y) to
min_t_type(x,y) and then converting the min_t_u[0-9]*(x,y) to
#define min_unsigned(x,y) min((x) + 0u + 0ull, (y) + 0u + 0ull)
I think that efficiently converts both values to an unsigned
type without masking the value.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)