[v2,2/4] string: Allow 2-argument strscpy()

Message ID 20240205123525.1379299-2-keescook@chromium.org
State New
Headers
Series string: Allow 2-argument strscpy() |

Commit Message

Kees Cook Feb. 5, 2024, 12:35 p.m. UTC
  Using sizeof(dst) for the "size" argument in strscpy() is the
overwhelmingly common case. Instead of requiring this everywhere, allow a
2-argument version to be used that will use the sizeof() internally. There
are other functions in the kernel with optional arguments[1], so this
isn't unprecedented, and improves readability. Update and relocate the
kern-doc for strscpy() too.

Adjust ARCH=um build to notice the changed export name, as it doesn't
do full header includes for the string helpers.

This could additionally let us save a few hundred lines of code:
 1177 files changed, 2455 insertions(+), 3026 deletions(-)
with a treewide cleanup using Coccinelle:

@needless_arg@
expression DST, SRC;
@@

        strscpy(DST, SRC
-, sizeof(DST)
        )

Link: https://elixir.bootlin.com/linux/v6.7/source/include/linux/pci.h#L1517 [1]
Reviewed-by: Justin Stitt <justinstitt@google.com>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/um/include/shared/user.h  |  3 ++-
 include/linux/fortify-string.h | 22 ++-------------------
 include/linux/string.h         | 35 +++++++++++++++++++++++++++++++++-
 lib/string.c                   |  4 ++--
 4 files changed, 40 insertions(+), 24 deletions(-)
  

Comments

Geert Uytterhoeven Feb. 5, 2024, 12:47 p.m. UTC | #1
Hi Kees,

On Mon, Feb 5, 2024 at 1:37 PM Kees Cook <keescook@chromium.org> wrote:
> Using sizeof(dst) for the "size" argument in strscpy() is the
> overwhelmingly common case. Instead of requiring this everywhere, allow a
> 2-argument version to be used that will use the sizeof() internally. There
> are other functions in the kernel with optional arguments[1], so this
> isn't unprecedented, and improves readability. Update and relocate the
> kern-doc for strscpy() too.
>
> Adjust ARCH=um build to notice the changed export name, as it doesn't
> do full header includes for the string helpers.
>
> This could additionally let us save a few hundred lines of code:
>  1177 files changed, 2455 insertions(+), 3026 deletions(-)
> with a treewide cleanup using Coccinelle:
>
> @needless_arg@
> expression DST, SRC;
> @@
>
>         strscpy(DST, SRC
> -, sizeof(DST)
>         )
>
> Link: https://elixir.bootlin.com/linux/v6.7/source/include/linux/pci.h#L1517 [1]
> Reviewed-by: Justin Stitt <justinstitt@google.com>
> Cc: Andy Shevchenko <andy@kernel.org>
> Cc: linux-hardening@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>

Thanks for your patch!

> --- a/include/linux/string.h
> +++ b/include/linux/string.h

> +/*
> + * The 2 argument style can only be used when dst is an array with a
> + * known size.
> + */
> +#define __strscpy0(dst, src, ...)      \
> +       sized_strscpy(dst, src, sizeof(dst) + __must_be_array(dst))
> +#define __strscpy1(dst, src, size)     sized_strscpy(dst, src, size)

(dst), (src), (size) etc.


> +
> +/**
> + * strscpy - Copy a C-string into a sized buffer
> + * @dst: Where to copy the string to
> + * @src: Where to copy the string from
> + * @...: Size of destination buffer (optional)
> + *
> + * Copy the source string @src, or as much of it as fits, into the
> + * destination @dst buffer. The behavior is undefined if the string
> + * buffers overlap. The destination @dst buffer is always NUL terminated,
> + * unless it's zero-sized.
> + *
> + * The size argument @... is only required when @dst is not an array, or
> + * when the copy needs to be smaller than sizeof(@dst).
> + *
> + * Preferred to strncpy() since it always returns a valid string, and
> + * doesn't unnecessarily force the tail of the destination buffer to be
> + * zero padded. If padding is desired please use strscpy_pad().
> + *
> + * Returns the number of characters copied in @dst (not including the
> + * trailing %NUL) or -E2BIG if @size is 0 or the copy from @src was
> + * truncated.
> + */
> +#define strscpy(dst, src, ...) \
> +       CONCATENATE(__strscpy, COUNT_ARGS(__VA_ARGS__))(dst, src, __VA_ARGS__)

Likewise

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68korg

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
  
andy@kernel.org Feb. 5, 2024, 12:58 p.m. UTC | #2
On Mon, Feb 05, 2024 at 01:47:08PM +0100, Geert Uytterhoeven wrote:
> On Mon, Feb 5, 2024 at 1:37 PM Kees Cook <keescook@chromium.org> wrote:

..

> > +#define __strscpy1(dst, src, size)     sized_strscpy(dst, src, size)
> 
> (dst), (src), (size) etc.

No need.

..

> > +#define strscpy(dst, src, ...) \
> > +       CONCATENATE(__strscpy, COUNT_ARGS(__VA_ARGS__))(dst, src, __VA_ARGS__)
> 
> Likewise

Likewise
  
Kees Cook Feb. 5, 2024, 1:01 p.m. UTC | #3
On Mon, Feb 05, 2024 at 01:47:08PM +0100, Geert Uytterhoeven wrote:
> > +/*
> > + * The 2 argument style can only be used when dst is an array with a
> > + * known size.
> > + */
> > +#define __strscpy0(dst, src, ...)      \
> > +       sized_strscpy(dst, src, sizeof(dst) + __must_be_array(dst))
> > +#define __strscpy1(dst, src, size)     sized_strscpy(dst, src, size)
> 
> (dst), (src), (size) etc.

I normally don't do this when macro args are being expanded into
function arguments. I've only done it for when macro args are used in
expressions. Am I missing a side-effect here, or is this more about
stylistic consistency?
  
Geert Uytterhoeven Feb. 5, 2024, 1:07 p.m. UTC | #4
Hi Kees,

On Mon, Feb 5, 2024 at 2:01 PM Kees Cook <keescook@chromium.org> wrote:
> On Mon, Feb 05, 2024 at 01:47:08PM +0100, Geert Uytterhoeven wrote:
> > > +/*
> > > + * The 2 argument style can only be used when dst is an array with a
> > > + * known size.
> > > + */
> > > +#define __strscpy0(dst, src, ...)      \
> > > +       sized_strscpy(dst, src, sizeof(dst) + __must_be_array(dst))
> > > +#define __strscpy1(dst, src, size)     sized_strscpy(dst, src, size)
> >
> > (dst), (src), (size) etc.
>
> I normally don't do this when macro args are being expanded into
> function arguments. I've only done it for when macro args are used in
> expressions. Am I missing a side-effect here, or is this more about
> stylistic consistency?

I'm not 100% sure it is needed, but I'm always wary when using macro
parameters without parentheses, except in the most simple use-cases.

Gr{oetje,eeting}s,

                        Geert
  

Patch

diff --git a/arch/um/include/shared/user.h b/arch/um/include/shared/user.h
index 981e11d8e025..9568cc04cbb7 100644
--- a/arch/um/include/shared/user.h
+++ b/arch/um/include/shared/user.h
@@ -51,7 +51,8 @@  static inline int printk(const char *fmt, ...)
 
 extern int in_aton(char *str);
 extern size_t strlcat(char *, const char *, size_t);
-extern size_t strscpy(char *, const char *, size_t);
+extern size_t sized_strscpy(char *, const char *, size_t);
+#define strscpy(dst, src, size)	sized_strscpy(dst, src, size)
 
 /* Copied from linux/compiler-gcc.h since we can't include it directly */
 #define barrier() __asm__ __volatile__("": : :"memory")
diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index 89a6888f2f9e..06b3aaa63724 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -215,26 +215,8 @@  __kernel_size_t __fortify_strlen(const char * const POS p)
 }
 
 /* Defined after fortified strnlen() to reuse it. */
-extern ssize_t __real_strscpy(char *, const char *, size_t) __RENAME(strscpy);
-/**
- * strscpy - Copy a C-string into a sized buffer
- *
- * @p: Where to copy the string to
- * @q: Where to copy the string from
- * @size: Size of destination buffer
- *
- * Copy the source string @q, or as much of it as fits, into the destination
- * @p buffer. The behavior is undefined if the string buffers overlap. The
- * destination @p buffer is always NUL terminated, unless it's zero-sized.
- *
- * Preferred to strncpy() since it always returns a valid string, and
- * doesn't unnecessarily force the tail of the destination buffer to be
- * zero padded. If padding is desired please use strscpy_pad().
- *
- * Returns the number of characters copied in @p (not including the
- * trailing %NUL) or -E2BIG if @size is 0 or the copy of @q was truncated.
- */
-__FORTIFY_INLINE ssize_t strscpy(char * const POS p, const char * const POS q, size_t size)
+extern ssize_t __real_strscpy(char *, const char *, size_t) __RENAME(sized_strscpy);
+__FORTIFY_INLINE ssize_t sized_strscpy(char * const POS p, const char * const POS q, size_t size)
 {
 	/* Use string size rather than possible enclosing struct size. */
 	const size_t p_size = __member_size(p);
diff --git a/include/linux/string.h b/include/linux/string.h
index 03f59cf7fe72..a21371aa2fd6 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -67,9 +67,42 @@  extern char * strcpy(char *,const char *);
 extern char * strncpy(char *,const char *, __kernel_size_t);
 #endif
 #ifndef __HAVE_ARCH_STRSCPY
-ssize_t strscpy(char *, const char *, size_t);
+ssize_t sized_strscpy(char *, const char *, size_t);
 #endif
 
+/*
+ * The 2 argument style can only be used when dst is an array with a
+ * known size.
+ */
+#define __strscpy0(dst, src, ...)	\
+	sized_strscpy(dst, src, sizeof(dst) + __must_be_array(dst))
+#define __strscpy1(dst, src, size)	sized_strscpy(dst, src, size)
+
+/**
+ * strscpy - Copy a C-string into a sized buffer
+ * @dst: Where to copy the string to
+ * @src: Where to copy the string from
+ * @...: Size of destination buffer (optional)
+ *
+ * Copy the source string @src, or as much of it as fits, into the
+ * destination @dst buffer. The behavior is undefined if the string
+ * buffers overlap. The destination @dst buffer is always NUL terminated,
+ * unless it's zero-sized.
+ *
+ * The size argument @... is only required when @dst is not an array, or
+ * when the copy needs to be smaller than sizeof(@dst).
+ *
+ * Preferred to strncpy() since it always returns a valid string, and
+ * doesn't unnecessarily force the tail of the destination buffer to be
+ * zero padded. If padding is desired please use strscpy_pad().
+ *
+ * Returns the number of characters copied in @dst (not including the
+ * trailing %NUL) or -E2BIG if @size is 0 or the copy from @src was
+ * truncated.
+ */
+#define strscpy(dst, src, ...)	\
+	CONCATENATE(__strscpy, COUNT_ARGS(__VA_ARGS__))(dst, src, __VA_ARGS__)
+
 /**
  * strscpy_pad() - Copy a C-string into a sized buffer
  * @dest: Where to copy the string to
diff --git a/lib/string.c b/lib/string.c
index 6891d15ce991..2869895a1180 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -104,7 +104,7 @@  EXPORT_SYMBOL(strncpy);
 #endif
 
 #ifndef __HAVE_ARCH_STRSCPY
-ssize_t strscpy(char *dest, const char *src, size_t count)
+ssize_t sized_strscpy(char *dest, const char *src, size_t count)
 {
 	const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
 	size_t max = count;
@@ -170,7 +170,7 @@  ssize_t strscpy(char *dest, const char *src, size_t count)
 
 	return -E2BIG;
 }
-EXPORT_SYMBOL(strscpy);
+EXPORT_SYMBOL(sized_strscpy);
 #endif
 
 /**