[v3,0/4] kstrtox: introduce memparse_safe()

Message ID cover.1704324320.git.wqu@suse.com
Headers
Series kstrtox: introduce memparse_safe() |

Message

Qu Wenruo Jan. 3, 2024, 11:27 p.m. UTC
  [CHANGELOG]
v3:
- Fix the 32bit pointer pattern in the test case
  The old pointer pattern for 32 bit systems is in fact 40 bits,
  which would still lead to sparse warning.
  The newer pattern is using UINTPTR_MAX to trim the pattern, then
  converted to a pointer, which should not cause any trimmed bits and
  make sparse happy.

v2:
- Make _parse_integer_fixup_radix() to always treat "0x" as hex
  This is to make sure invalid strings like "0x" or "0xG" to fail
  as expected for memparse_safe().
  Or they would only parse the first 0, then leaving "x" for caller
  to handle.

- Update the test case to include above failure cases
  This including:
  * "0x", just hex prefix without any suffix/follow up chars
  * "0xK", just hex prefix and a stray suffix
  * "0xY", hex prefix with an invalid char

- Fix a bug in btrfs' conversion to memparse_safe()
  Where I forgot to delete the old memparse() line.

- Fix a compiler warning on m68K
  On that platform, a pointer (32 bits) is smaller than unsigned long long
  (64 bits), which can cause static checker to warn.

Function memparse() lacks error handling:

- If no valid number string at all
  In that case @retptr would just be updated and return value would be
  zero.

- No overflown detection
  This applies to both the number string part, and the suffixes part.
  And since we have no way to indicate errors, we can get weird results
  like:

  	"25E" -> 10376293541461622784 (9E)

  This is due to the fact that for "E" suffix, there is only 4 bits
  left, and 25 with 60 bits left shift would lead to overflow.
  (And decision to support for that "E" suffix is already cursed)

So here we introduce a safer version of it: memparse_safe(), and mark
the original one deprecated.
Unfortunately I didn't find a good way to mark it deprecated, as with
recent -Werror changes, '__deprecated' marco does not seem to warn
anymore.

The new helper has the following advantages:

- Better overflow and invalid string detection
  The overflow detection is for both the numberic part, and with the
  suffix. Thus above "25E" would be rejected correctly.
  The invalid string part means if there is no valid number starts at
  the buffer, we return -EINVAL.

- Allow caller to select the suffixes, and saner default ones
  The new default one would be "KMGTP", without the cursed and overflow
  prone "E".
  Some older code like setup_elfcorehdr() would benefit from it, if the
  code really wants to only allow "KMG" suffixes.

- Keep the old @retptr behavior
  So the existing callers can easily migrate to the new one, without the
  need to do extra strsep() work.

- Finally test cases
  The test case would cover more things than the existing kstrtox
  tests:

  * The @retptr behavior
    Either for bad cases, which @retptr should not be touched,
    or for good cases, the @retptr is properly advanced,

  * Return value verification
    Make sure we distinguish -EINVAL and -ERANGE correctly.

  * Valid string with suffix, but disable the corresponding suffix
    To make sure we still got the numeric string parsed, and can
    still pass the disabled suffix to the caller.

With the new helper, migrate btrfs to the interface, and since the
@retptr behavior is the same, we won't cause any behavior change.


Qu Wenruo (4):
  kstrtox: always skip the leading "0x" even if no more valid chars
  kstrtox: introduce a safer version of memparse()
  kstrtox: add unit tests for memparse_safe()
  btrfs: migrate to the newer memparse_safe() helper

 arch/x86/boot/string.c  |   2 +-
 fs/btrfs/ioctl.c        |   6 +-
 fs/btrfs/super.c        |   9 +-
 fs/btrfs/sysfs.c        |  14 ++-
 include/linux/kernel.h  |   8 +-
 include/linux/kstrtox.h |  15 +++
 lib/cmdline.c           |   5 +-
 lib/kstrtox.c           |  98 +++++++++++++++-
 lib/test-kstrtox.c      | 244 ++++++++++++++++++++++++++++++++++++++++
 9 files changed, 392 insertions(+), 9 deletions(-)
  

Comments

Andy Shevchenko Jan. 6, 2024, 2:34 p.m. UTC | #1
On Thu, Jan 04, 2024 at 09:57:47AM +1030, Qu Wenruo wrote:
> [CHANGELOG]
> v3:
> - Fix the 32bit pointer pattern in the test case
>   The old pointer pattern for 32 bit systems is in fact 40 bits,
>   which would still lead to sparse warning.
>   The newer pattern is using UINTPTR_MAX to trim the pattern, then
>   converted to a pointer, which should not cause any trimmed bits and
>   make sparse happy.

Having test cases is quite good, thanks!
But as I understood what Alexey wanted, is not using the kstrtox files for this.
You can introduce it in the cmdline.c, correct? Just include local "kstrtox.h".

I'm on leave till end of the month, I'll look at this later.
  
Qu Wenruo Jan. 6, 2024, 8:58 p.m. UTC | #2
On 2024/1/7 01:04, Andy Shevchenko wrote:
> On Thu, Jan 04, 2024 at 09:57:47AM +1030, Qu Wenruo wrote:
>> [CHANGELOG]
>> v3:
>> - Fix the 32bit pointer pattern in the test case
>>    The old pointer pattern for 32 bit systems is in fact 40 bits,
>>    which would still lead to sparse warning.
>>    The newer pattern is using UINTPTR_MAX to trim the pattern, then
>>    converted to a pointer, which should not cause any trimmed bits and
>>    make sparse happy.
>
> Having test cases is quite good, thanks!
> But as I understood what Alexey wanted, is not using the kstrtox files for this.
> You can introduce it in the cmdline.c, correct? Just include local "kstrtox.h".

Not really possible, all the needed parsing helpers are internal inside
kstrtox.c.

Furthermore, this also means memparse() can not be enhanced due to:

- Lack of ways to return errors

- Unable to call the parsing helpers inside cmdline.c

Thanks,
Qu

>
> I'm on leave till end of the month, I'll look at this later.
>
  
Andy Shevchenko Jan. 14, 2024, 1:42 p.m. UTC | #3
On Sun, Jan 07, 2024 at 07:28:27AM +1030, Qu Wenruo wrote:
> On 2024/1/7 01:04, Andy Shevchenko wrote:
> > On Thu, Jan 04, 2024 at 09:57:47AM +1030, Qu Wenruo wrote:

..

> > Having test cases is quite good, thanks!
> > But as I understood what Alexey wanted, is not using the kstrtox files for this.
> > You can introduce it in the cmdline.c, correct? Just include local "kstrtox.h".
> 
> Not really possible, all the needed parsing helpers are internal inside
> kstrtox.c.

I'm not sure I follow. The functions are available to other library (built-in)
modules.

> Furthermore, this also means memparse() can not be enhanced due to:
> 
> - Lack of ways to return errors

What does this mean?

> - Unable to call the parsing helpers inside cmdline.c

??? (see above)
  
Qu Wenruo Jan. 14, 2024, 8:01 p.m. UTC | #4
On 2024/1/15 00:12, Andy Shevchenko wrote:
> On Sun, Jan 07, 2024 at 07:28:27AM +1030, Qu Wenruo wrote:
>> On 2024/1/7 01:04, Andy Shevchenko wrote:
>>> On Thu, Jan 04, 2024 at 09:57:47AM +1030, Qu Wenruo wrote:
> 
> ...
> 
>>> Having test cases is quite good, thanks!
>>> But as I understood what Alexey wanted, is not using the kstrtox files for this.
>>> You can introduce it in the cmdline.c, correct? Just include local "kstrtox.h".
>>
>> Not really possible, all the needed parsing helpers are internal inside
>> kstrtox.c.
> 
> I'm not sure I follow. The functions are available to other library (built-in)
> modules.

Did I miss something?

Firstly neither _parse_integer_fixup_radix() nor _parse_integer_limit() 
is exported to modules. (No EXPORT_SYMBOL() call on them).

Secondly _parse_integer_fixup_radix() and _parse_integer_limit have "_" 
prefix, and is only declared in "lib/kstrtox.h", which means they are 
designed only for internal usage.
If putting memparse_safe() into cmdline.c, at least we would need to 
include local header "kstrtox.h", and I'm not sure if this is any better 
than putting memparse_safe() into kstrtox.[ch].

Finally, I just tried putting memparse_safe() into cmdline.c, and it 
failed at linkage stage, even if that offending file has no call to 
memparse_safe():

   ld: arch/x86/boot/compressed/kaslr.o: in function `memparse_safe':
kaslr.c:(.text+0xbb1): undefined reference to `_parse_integer_fixup_radix'
   ld: kaslr.c:(.text+0xbc5): undefined reference to `_parse_integer'
   ld: arch/x86/boot/compressed/vmlinux: hidden symbol `_parse_integer' 
isn't defined

I can try again but I'm not sure if it's possible to move 
memparse_safe() to cmdline.[ch].

> 
>> Furthermore, this also means memparse() can not be enhanced due to:
>>
>> - Lack of ways to return errors
> 
> What does this mean?

If you want to keep the prototype of memparse() (aka, a drop-in 
enhancement), then there is no good way to indicate the errors like 
overflow at all.

> 
>> - Unable to call the parsing helpers inside cmdline.c
> 
> ??? (see above)
> 
See above.

Thanks,
Qu