[v3] kunit: run test suites only after module initialization completes

Message ID 20231206150729.54604-1-marpagan@redhat.com
State New
Series [v3] kunit: run test suites only after module initialization completes

Commit Message

Marco Pagani Dec. 6, 2023, 3:07 p.m. UTC
Commit 2810c1e99867 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed a wild-memory-access bug that could have
happened during the loading phase of test suites built and executed as
loadable modules. However, it also introduced a problematic side effect
that causes test suite modules to crash when they attempt to register
fake devices.

When a module is loaded, it traverses the MODULE_STATE_UNFORMED and
MODULE_STATE_COMING states before reaching the normal operating state
MODULE_STATE_LIVE. Finally, when the module is removed, it moves to
MODULE_STATE_GOING before being released. However, if the loading
function load_module() fails between complete_formation() and
do_init_module(), the module goes directly from MODULE_STATE_COMING to
MODULE_STATE_GOING without passing through MODULE_STATE_LIVE.
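
The two paths described above can be modelled in userspace. This is only
a minimal sketch: the enum mirrors the kernel's state names, while
trace_load() and MAX_TRACE are hypothetical names made up for the
illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same state names as the kernel's enum module_state. */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
	MODULE_STATE_UNFORMED,
};

#define MAX_TRACE 4

/*
 * Hypothetical helper: record the states one load/unload cycle goes
 * through. Returns the number of entries written to trace[].
 */
static size_t trace_load(bool fail_before_init,
			 enum module_state trace[MAX_TRACE])
{
	size_t n = 0;

	assert(trace != NULL);

	trace[n++] = MODULE_STATE_UNFORMED;
	trace[n++] = MODULE_STATE_COMING;	/* after complete_formation() */
	if (!fail_before_init)
		trace[n++] = MODULE_STATE_LIVE;	/* do_init_module() succeeded */
	trace[n++] = MODULE_STATE_GOING;	/* removal or error unwinding */

	return n;
}
```

With fail_before_init set, the trace has three entries and never
contains MODULE_STATE_LIVE, which is the case the notifier has to cope
with.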

In that failure path, kunit_module_exit() was called without
kunit_module_init() having run first. Since kunit_module_exit() is
responsible for freeing the memory allocated by kunit_module_init()
through kunit_filter_suites(), it tried to free a suite set that had
never been allocated, resulting in a wild-memory-access bug.

Commit 2810c1e99867 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed this issue by running the tests when the
module is still in MODULE_STATE_COMING. However, modules in that state
are not fully initialized, lacking sysfs kobjects. Therefore, if a test
module attempts to register a fake device, it will inevitably crash.

This patch proposes a different approach to fix the original
wild-memory-access bug while restoring the normal module execution flow
by making kunit_module_exit() able to detect if kunit_module_init() has
previously initialized the test suite set. In this way, test modules
can once again register fake devices without crashing.

This behavior is achieved by checking whether mod->kunit_suites passes
virt_addr_valid(), that is, whether it is a valid address in the
kernel's direct mapping. If it is, then kunit_module_init() has
allocated the suite_set in kunit_filter_suites() using kmalloc_array().
On the contrary, if mod->kunit_suites is still pointing to the original
address that was set when looking up the .kunit_test_suites section of
the module, which lies in the module mapping and fails the check, then
the loading phase has failed and there's no memory to be freed.
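
The idea behind the check can be sketched in userspace. Everything below
is an assumption made for illustration: struct fake_module, the
fake_module_*() functions and points_to_allocation() are hypothetical,
and points_to_allocation() is only a stand-in for virt_addr_valid(),
which cannot be called outside the kernel. A static array plays the role
of the .kunit_test_suites section and calloc() plays the role of
kmalloc_array().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct suite { const char *name; };

/* Stand-in for the module's .kunit_test_suites section. */
static struct suite suite_a = { "a" }, suite_b = { "b" };
static struct suite *section_suites[] = { &suite_a, &suite_b };
#define NUM_SECTION_SUITES (sizeof(section_suites) / sizeof(section_suites[0]))

struct fake_module {
	struct suite **suites;	/* models mod->kunit_suites */
	size_t num_suites;	/* models mod->num_kunit_suites */
};

/* Stand-in for virt_addr_valid(): false for NULL or a section address. */
static bool points_to_allocation(struct suite **p)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)section_suites;
	uintptr_t hi = lo + sizeof(section_suites);

	return p != NULL && !(a >= lo && a < hi);
}

/* Models the loader pointing the module at its section. */
static void fake_module_setup(struct fake_module *mod)
{
	mod->suites = section_suites;
	mod->num_suites = NUM_SECTION_SUITES;
}

/* Models kunit_module_init(): swap in an allocated copy of the suites. */
static int fake_module_init(struct fake_module *mod)
{
	struct suite **copy = calloc(mod->num_suites, sizeof(*copy));

	if (!copy)
		return -1;
	memcpy(copy, mod->suites, mod->num_suites * sizeof(*copy));
	mod->suites = copy;
	return 0;
}

/* Models kunit_module_exit(): returns true only if it freed something. */
static bool fake_module_exit(struct fake_module *mod)
{
	if (!points_to_allocation(mod->suites))
		return false;	/* init never ran, nothing to free */

	free(mod->suites);
	mod->suites = NULL;
	return true;
}
```

Calling fake_module_exit() right after fake_module_setup() returns
without freeing anything, while calling it after fake_module_init()
releases the copy.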

v3:
- add a comment to clarify why the start address is checked
v2:
- add include <linux/mm.h>

Fixes: 2810c1e99867 ("kunit: Fix wild-memory-access bug in kunit_free_suite_set()")
Tested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Marco Pagani <marpagan@redhat.com>
---
 lib/kunit/test.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)


base-commit: 33cc938e65a98f1d29d0a18403dbbee050dcad9a
  

Comments

Richard Fitzgerald Dec. 20, 2023, 11:16 a.m. UTC | #1
For V3:

Tested-by: Richard Fitzgerald <rf@opensource.cirrus.com>

Fixes this crash:
https://lore.kernel.org/all/e239b94b-462a-41e5-9a4c-cd1ffd530d75@opensource.cirrus.com/

Also tested with sound/pci/hda/cirrus_scodec_test.c
  
Rae Moar Jan. 5, 2024, 10:21 p.m. UTC | #2
Hello,

I have tested this change and it looks good to me!

Although, it no longer applies cleanly on the kselftest/kunit branch
so it will need to be rebased.

So besides the need for a rebase,
Tested-by: Rae Moar <rmoar@google.com>

Thanks for the fix!
Rae

  
David Gow Jan. 8, 2024, 7:27 a.m. UTC | #3
Sorry for the delay here: there are enough subtleties here that I
wanted to double check some things.

I keep feeling that there has to be a nicer way of doing this, but I
can't think of one, so let's go with this, since it's fixing a real
issue.

I'm a little hesitant about our use of the suite_set.start address as
an 'is initialised' flag, and depending on it being reallocated via
kunit_filter_suites(), but since we already depend on that (by always
using kunit_free_suite_set()), I'm okay with it.

My only request (other than this needing a rebase, probably on top of
6.8) would be to add a comment in kunit_filter_suites() noting that it
must return a virtual address. That's probably something we should've
done a while ago, but I can just see this requirement getting
forgotten.

Reviewed-by: David Gow <davidgow@google.com>

Cheers,
-- David

  
Marco Pagani Jan. 9, 2024, 3:36 p.m. UTC | #4
On 2024-01-08 08:27, David Gow wrote:
> Sorry for the delay here: there are enough subtleties here that I
> wanted to double check some things.
> 
> I keep feeling that there has to be a nicer way of doing this, but I
> can't think of one, so let's go with this, since it's fixing a real
> issue.
> 
> I'm a little hesitant about our use of the suite_set.start address as
> an 'is initialised' flag, and depending on it being reallocated via
> kunit_filter_suites(), but since we already depend on that (by always
> using kunit_free_suite_set()), I'm okay with it.
>

I have the same feeling. I spent some time thinking about alternative
solutions that did not require adding a flag in the module struct or
restructuring significant portions of the code, but I could not think of
anything better for the moment.

> My only request (other than this needing a rebase, probably on top of
> 6.8) would be to add a comment in kunit_filter_suites() noting that it
> must return a virtual address. That's probably something we should've
> done a while ago, but I can just see this requirement getting
> forgotten.
> 

Sure, I'll do it.

Thanks,
Marco

> Reviewed-by: David Gow <davidgow@google.com>
> 
> 
  

Patch

diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 7aceb07a1af9..3263e0d5e0f6 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -16,6 +16,7 @@ 
 #include <linux/panic.h>
 #include <linux/sched/debug.h>
 #include <linux/sched.h>
+#include <linux/mm.h>
 
 #include "debugfs.h"
 #include "hooks-impl.h"
@@ -775,12 +776,19 @@  static void kunit_module_exit(struct module *mod)
 	};
 	const char *action = kunit_action();
 
+	/*
+	 * Check if the start address is a valid virtual address to detect
+	 * if the module load sequence has failed and the suite set has not
+	 * been initialized and filtered.
+	 */
+	if (!suite_set.start || !virt_addr_valid(suite_set.start))
+		return;
+
 	if (!action)
 		__kunit_test_suites_exit(mod->kunit_suites,
 					 mod->num_kunit_suites);
 
-	if (suite_set.start)
-		kunit_free_suite_set(suite_set);
+	kunit_free_suite_set(suite_set);
 }
 
 static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
@@ -790,12 +798,12 @@  static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
 
 	switch (val) {
 	case MODULE_STATE_LIVE:
+		kunit_module_init(mod);
 		break;
 	case MODULE_STATE_GOING:
 		kunit_module_exit(mod);
 		break;
 	case MODULE_STATE_COMING:
-		kunit_module_init(mod);
 		break;
 	case MODULE_STATE_UNFORMED:
 		break;
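
The effect of moving kunit_module_init() to MODULE_STATE_LIVE can be
replayed in userspace. This is a rough sketch under stated assumptions:
struct call_log, notify() and replay() are hypothetical names, and the
counters merely stand in for the calls to kunit_module_init() and
kunit_module_exit().

```c
#include <assert.h>

/* Same state names as the kernel's enum module_state. */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
	MODULE_STATE_UNFORMED,
};

struct call_log {
	int init_calls;	/* stands in for kunit_module_init(mod) */
	int exit_calls;	/* stands in for kunit_module_exit(mod) */
};

/* Models the patched switch in kunit_module_notify(). */
static void notify(struct call_log *log, enum module_state val)
{
	switch (val) {
	case MODULE_STATE_LIVE:
		log->init_calls++;
		break;
	case MODULE_STATE_GOING:
		log->exit_calls++;
		break;
	case MODULE_STATE_COMING:
	case MODULE_STATE_UNFORMED:
		break;
	}
}

/* Feed a sequence of notifications and report which hooks ran. */
static struct call_log replay(const enum module_state *events, int n)
{
	struct call_log log = { 0, 0 };

	assert(events != 0 || n == 0);
	for (int i = 0; i < n; i++)
		notify(&log, events[i]);
	return log;
}
```

Replaying COMING, LIVE, GOING runs each hook once. Replaying COMING,
GOING (a failed load) runs only the exit hook, which is why
kunit_module_exit() needs the early return added by the patch.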