[v4,4/5] selftests/resctrl: Cleanup properly when an error occurs in CAT test

Message ID: 20221117010541.1014481-5-tan.shaopeng@jp.fujitsu.com
State: New
Series: Some improvements of resctrl selftest

Commit Message

Shaopeng Tan Nov. 17, 2022, 1:05 a.m. UTC
After the CAT test creates a child process with fork(), if an error
occurs or a signal such as SIGINT is received, the parent process is
terminated immediately, but the child process is not killed and
umount_resctrlfs() is not called.

Add a signal handler, as other tests do, to kill the child process, unmount
resctrlfs, clean up result files, etc. if an error occurs in the parent
process. To reuse the ctrlc_handler() of the other tests for killing the
child process (bm_pid), use the global bm_pid instead of a local bm_pid.

If an error occurs in the child process, have it wait to be killed.

Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 tools/testing/selftests/resctrl/cat_test.c | 30 ++++++++++++++--------
 1 file changed, 20 insertions(+), 10 deletions(-)
  

Comments

Reinette Chatre Nov. 23, 2022, 5:48 p.m. UTC | #1
Hi Shaopeng,

On 11/16/2022 5:05 PM, Shaopeng Tan wrote:
> After creating a child process with fork() in CAT test, if there is
> an error occurs or such as a SIGINT signal is received, the parent
> process will be terminated immediately, but the child process will not
> be killed and also umount_resctrlfs() will not be called.
> 
> Add a signal handler like other tests to kill child process, umount
> resctrlfs, cleanup result files, etc. if an error occurs in parent
> process. To use ctrlc_handler() of other tests to kill child
> process(bm_pid), using global bm_pid instead of local bm_pid.
> 
> Wait for child process to be killed if an error occurs in child process.
> 
> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
>  tools/testing/selftests/resctrl/cat_test.c | 30 ++++++++++++++--------
>  1 file changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 6a8306b0a109..1f8f5cf94e95 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -100,10 +100,10 @@ void cat_test_cleanup(void)
>  
>  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>  {
> +	struct sigaction sigact;
>  	unsigned long l_mask, l_mask_1;
>  	int ret, pipefd[2], sibling_cpu_no;
>  	char pipe_message;
> -	pid_t bm_pid;
>  
>  	cache_size = 0;
>  
> @@ -181,17 +181,25 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>  		strcpy(param.filename, RESULT_FILE_NAME1);
>  		param.num_of_runs = 0;
>  		param.cpu_no = sibling_cpu_no;
> +	} else {
> +		/*
> +		 * Register CTRL-C handler for parent, as it has to kill
> +		 * child process before exiting
> +		 */
> +		sigact.sa_sigaction = ctrlc_handler;
> +		sigemptyset(&sigact.sa_mask);
> +		sigact.sa_flags = SA_SIGINFO;
> +		if (sigaction(SIGINT, &sigact, NULL) ||
> +		    sigaction(SIGTERM, &sigact, NULL) ||
> +		    sigaction(SIGHUP, &sigact, NULL))
> +			perror("# sigaction");

Why keep going at this point?

I tried this change but I was not able to trigger ctrlc_handler(). It
seems that the handler is changed soon after (see cat_val()->run_fill_buf())
and ctrl_handler() (note the subtle name difference) is run instead when
a SIGINT is received. The value of ctrl_handler() is not clear to me though,
and it could even be argued that it is broken because it starts with
free(startptr) and startptr would likely already be free'd at this point
without resetting its value to NULL.

From what I understand ctrl_handler() and its installation from
run_fill_buf() could be removed so that it does not override what is being
done with this change. Otherwise, from what I can tell, this new handler
will never run.
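
The override described above can be reproduced outside the selftest. The
following is a minimal sketch, not the selftest code: first_handler() and
second_handler() are hypothetical stand-ins for ctrlc_handler() and
ctrl_handler(), and later_handler_wins() is a made-up helper name. It
assumes a single-threaded process, where raise() returns only after the
handler has run.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Flags set by the two hypothetical handlers */
static volatile sig_atomic_t first_ran, second_ran;

/* Stand-in for ctrlc_handler(), installed with sigaction() */
static void first_handler(int signum, siginfo_t *info, void *ptr)
{
	(void)signum;
	(void)info;
	(void)ptr;
	first_ran = 1;
}

/* Stand-in for ctrl_handler(), installed later with signal() */
static void second_handler(int signo)
{
	(void)signo;
	second_ran = 1;
}

/* Returns 1 if only the later handler ran, 0 otherwise, -1 on error */
static int later_handler_wins(void)
{
	struct sigaction sigact;
	int result;

	first_ran = 0;
	second_ran = 0;

	memset(&sigact, 0, sizeof(sigact));
	sigact.sa_sigaction = first_handler;
	sigemptyset(&sigact.sa_mask);
	sigact.sa_flags = SA_SIGINFO;
	if (sigaction(SIGINT, &sigact, NULL))
		return -1;

	/* A later installation replaces the disposition set above */
	if (signal(SIGINT, second_handler) == SIG_ERR)
		return -1;

	if (raise(SIGINT))
		return -1;

	result = !first_ran && second_ran;

	/* Leave SIGINT at its default disposition */
	signal(SIGINT, SIG_DFL);
	return result;
}
```

Calling later_handler_wins() from a small test program is expected to
return 1, since a process has only one disposition per signal and the most
recent installation is the one that applies.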

Reinette
  
Shaopeng Tan (Fujitsu) Nov. 24, 2022, 8:17 a.m. UTC | #2
Hi Reinette,

> On 11/16/2022 5:05 PM, Shaopeng Tan wrote:
> > After creating a child process with fork() in CAT test, if there is
> > an error occurs or such as a SIGINT signal is received, the parent
> > process will be terminated immediately, but the child process will not
> > be killed and also umount_resctrlfs() will not be called.
> >
> > Add a signal handler like other tests to kill child process, umount
> > resctrlfs, cleanup result files, etc. if an error occurs in parent
> > process. To use ctrlc_handler() of other tests to kill child
> > process(bm_pid), using global bm_pid instead of local bm_pid.
> >
> > Wait for child process to be killed if an error occurs in child process.
> >
> > Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
> > Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> > ---
> >  tools/testing/selftests/resctrl/cat_test.c | 30
> ++++++++++++++--------
> >  1 file changed, 20 insertions(+), 10 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c
> b/tools/testing/selftests/resctrl/cat_test.c
> > index 6a8306b0a109..1f8f5cf94e95 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -100,10 +100,10 @@ void cat_test_cleanup(void)
> >
> >  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >  {
> > +	struct sigaction sigact;
> >  	unsigned long l_mask, l_mask_1;
> >  	int ret, pipefd[2], sibling_cpu_no;
> >  	char pipe_message;
> > -	pid_t bm_pid;
> >
> >  	cache_size = 0;
> >
> > @@ -181,17 +181,25 @@ int cat_perf_miss_val(int cpu_no, int n, char
> *cache_type)
> >  		strcpy(param.filename, RESULT_FILE_NAME1);
> >  		param.num_of_runs = 0;
> >  		param.cpu_no = sibling_cpu_no;
> > +	} else {
> > +		/*
> > +		 * Register CTRL-C handler for parent, as it has to kill
> > +		 * child process before exiting
> > +		 */
> > +		sigact.sa_sigaction = ctrlc_handler;
> > +		sigemptyset(&sigact.sa_mask);
> > +		sigact.sa_flags = SA_SIGINFO;
> > +		if (sigaction(SIGINT, &sigact, NULL) ||
> > +		    sigaction(SIGTERM, &sigact, NULL) ||
> > +		    sigaction(SIGHUP, &sigact, NULL))
> > +			perror("# sigaction");
> 
> Why keep going at this point?
> 
> I tried this change but I was not able to trigger ctrlc_handler(). It

I tested this change using the kselftest framework.
In that case, the timeout command sent a SIGTERM signal,
and ctrlc_handler() was triggered.
Since the handling of SIGINT and SIGHUP is overridden in run_fill_buf(),
ctrl_handler() may be called instead if ctrl-c is received.

> seems that the handler is changed soon after (see cat_val()->run_fill_buf())
> and ctrl_handler() (note the subtle name difference) is run instead when
> a SIGINT is received. The value of ctrl_handler() is not clear to me though,
> and it could even be argued that it is broken because it starts with
> free(startptr) and startptr would likely already be free'd at this point
> without resetting its value to NULL.
> 
> From what I understand ctrl_handler() and its installation from
> run_fill_buf() could be removed so that it does not override what is being
> done with this change. Otherwise, from what I can tell, this new handler
> will never run.

I think removing ctrl_handler() is fine.
In the CAT test, it overrides ctrlc_handler().
In the other tests (CMT, MBA, MBM), the child process uses ctrl_handler() to clean up after itself,
but the parent process already cleans up the child process properly with ctrlc_handler().
Removing it also avoids the use of signal():
 fill_buf.c:run_fill_buf()
         /* set up ctrl-c handler */
         if (signal(SIGINT, ctrl_handler) == SIG_ERR)
                 printf("Failed to catch SIGINT!\n");

I removed ctrl_handler() and ran resctrl_tests with and without the kselftest framework.
There was no problem.

Best regards,
Shaopeng Tan
  
Reinette Chatre Nov. 28, 2022, 5:11 p.m. UTC | #3
Hi Shaopeng,

On 11/24/2022 12:17 AM, Shaopeng Tan (Fujitsu) wrote:
> Hi Reinette,
> 
>> On 11/16/2022 5:05 PM, Shaopeng Tan wrote:
>>> After creating a child process with fork() in CAT test, if there is
>>> an error occurs or such as a SIGINT signal is received, the parent
>>> process will be terminated immediately, but the child process will not
>>> be killed and also umount_resctrlfs() will not be called.
>>>
>>> Add a signal handler like other tests to kill child process, umount
>>> resctrlfs, cleanup result files, etc. if an error occurs in parent
>>> process. To use ctrlc_handler() of other tests to kill child
>>> process(bm_pid), using global bm_pid instead of local bm_pid.
>>>
>>> Wait for child process to be killed if an error occurs in child process.
>>>
>>> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
>>> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>>> ---
>>>  tools/testing/selftests/resctrl/cat_test.c | 30
>> ++++++++++++++--------
>>>  1 file changed, 20 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c
>> b/tools/testing/selftests/resctrl/cat_test.c
>>> index 6a8306b0a109..1f8f5cf94e95 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -100,10 +100,10 @@ void cat_test_cleanup(void)
>>>
>>>  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>>  {
>>> +	struct sigaction sigact;
>>>  	unsigned long l_mask, l_mask_1;
>>>  	int ret, pipefd[2], sibling_cpu_no;
>>>  	char pipe_message;
>>> -	pid_t bm_pid;
>>>
>>>  	cache_size = 0;
>>>
>>> @@ -181,17 +181,25 @@ int cat_perf_miss_val(int cpu_no, int n, char
>> *cache_type)
>>>  		strcpy(param.filename, RESULT_FILE_NAME1);
>>>  		param.num_of_runs = 0;
>>>  		param.cpu_no = sibling_cpu_no;
>>> +	} else {
>>> +		/*
>>> +		 * Register CTRL-C handler for parent, as it has to kill
>>> +		 * child process before exiting
>>> +		 */
>>> +		sigact.sa_sigaction = ctrlc_handler;
>>> +		sigemptyset(&sigact.sa_mask);
>>> +		sigact.sa_flags = SA_SIGINFO;
>>> +		if (sigaction(SIGINT, &sigact, NULL) ||
>>> +		    sigaction(SIGTERM, &sigact, NULL) ||
>>> +		    sigaction(SIGHUP, &sigact, NULL))
>>> +			perror("# sigaction");
>>
>> Why keep going at this point?
>>
>> I tried this change but I was not able to trigger ctrlc_handler(). It
> 
> I tested this change using kselftest framework,
> In this case, the timeout command sent a SIGTERM signal,
> and then ctrlc_handler() was triggered.
> Since the handling of SIGINT and SIGHUP signals is overridden in run_fill_buf(), 
> ctrl_handler() may be called if ctrl-c is received.

I tested this by running "resctrl_tests -t cat" and when doing so
this change does not behave as described.


>> seems that the handler is changed soon after (see cat_val()->run_fill_buf())
>> and ctrl_handler() (note the subtle name difference) is run instead when
>> a SIGINT is received. The value of ctrl_handler() is not clear to me though,
>> and it could even be argued that it is broken because it starts with
>> free(startptr) and startptr would likely already be free'd at this point
>> without resetting its value to NULL.
>>
>> From what I understand ctrl_handler() and its installation from
>> run_fill_buf() could be removed so that it does not override what is being
>> done with this change. Otherwise, from what I can tell, this new handler
>> will never run.
> 
> I think removing ctrl_handler() is fine. 
> In CAT test, it overrides ctrlc_handler().
> In other tests(CMT,MBA,MBM), the child process used ctrl_handler() to cleanup itself,

Is that explicit cleanup required? All I can see is free(startptr) and that pointer
would usually be invalid as I mentioned earlier. If the process crashes while
running fill_cache() and thus does not get a chance to run free(startptr) then
the OS would release the memory, no?

> but the parent process cleanup the child process with ctrlc_handler() properly.
> Also, avoid using signal().
>  fill_buf.c:run_fill_buf()
>  201         /* set up ctrl-c handler */
>  202         if (signal(SIGINT, ctrl_handler) == SIG_ERR)
>  203                 printf("Failed to catch SIGINT!\n");
> 
> I removed ctrl_handler() and ran resctrl_tests with and without the kselftest framework.
> There is no problem.

Thank you. I only used the CAT test when I found the issue.

Reinette
  
Shaopeng Tan (Fujitsu) Nov. 30, 2022, 8:32 a.m. UTC | #4
Hi Reinette,

> On 11/24/2022 12:17 AM, Shaopeng Tan (Fujitsu) wrote:
> > Hi Reinette,
> >
> >> On 11/16/2022 5:05 PM, Shaopeng Tan wrote:
> >>> After creating a child process with fork() in CAT test, if there is
> >>> an error occurs or such as a SIGINT signal is received, the parent
> >>> process will be terminated immediately, but the child process will
> >>> not be killed and also umount_resctrlfs() will not be called.
> >>>
> >>> Add a signal handler like other tests to kill child process, umount
> >>> resctrlfs, cleanup result files, etc. if an error occurs in parent
> >>> process. To use ctrlc_handler() of other tests to kill child
> >>> process(bm_pid), using global bm_pid instead of local bm_pid.
> >>>
> >>> Wait for child process to be killed if an error occurs in child process.
> >>>
> >>> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
> >>> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> >>> ---
> >>>  tools/testing/selftests/resctrl/cat_test.c | 30
> >> ++++++++++++++--------
> >>>  1 file changed, 20 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cat_test.c
> >> b/tools/testing/selftests/resctrl/cat_test.c
> >>> index 6a8306b0a109..1f8f5cf94e95 100644
> >>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>> @@ -100,10 +100,10 @@ void cat_test_cleanup(void)
> >>>
> >>>  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >>>  {
> >>> +	struct sigaction sigact;
> >>>  	unsigned long l_mask, l_mask_1;
> >>>  	int ret, pipefd[2], sibling_cpu_no;
> >>>  	char pipe_message;
> >>> -	pid_t bm_pid;
> >>>
> >>>  	cache_size = 0;
> >>>
> >>> @@ -181,17 +181,25 @@ int cat_perf_miss_val(int cpu_no, int n, char
> >> *cache_type)
> >>>  		strcpy(param.filename, RESULT_FILE_NAME1);
> >>>  		param.num_of_runs = 0;
> >>>  		param.cpu_no = sibling_cpu_no;
> >>> +	} else {
> >>> +		/*
> >>> +		 * Register CTRL-C handler for parent, as it has to kill
> >>> +		 * child process before exiting
> >>> +		 */
> >>> +		sigact.sa_sigaction = ctrlc_handler;
> >>> +		sigemptyset(&sigact.sa_mask);
> >>> +		sigact.sa_flags = SA_SIGINFO;
> >>> +		if (sigaction(SIGINT, &sigact, NULL) ||
> >>> +		    sigaction(SIGTERM, &sigact, NULL) ||
> >>> +		    sigaction(SIGHUP, &sigact, NULL))
> >>> +			perror("# sigaction");
> >>
> >> Why keep going at this point?
> >>
> >> I tried this change but I was not able to trigger ctrlc_handler(). It
> >
> > I tested this change using kselftest framework, In this case, the
> > timeout command sent a SIGTERM signal, and then ctrlc_handler() was
> > triggered.
> > Since the handling of SIGINT and SIGHUP signals is overridden in
> > run_fill_buf(),
> > ctrl_handler() may be called if ctrl-c is received.
> 
> I tested this by running "resctrl_tests -t cat" and when doing so this change
> does not behave as described.

Yes, the v4 fix does not work for "resctrl_tests -t cat".
I will add a new fix in the next version.

> >> seems that the handler is changed soon after (see
> >> cat_val()->run_fill_buf()) and ctrl_handler() (note the subtle name
> >> difference) is run instead when a SIGINT is received. The value of
> >> ctrl_handler() is not clear to me though, and it could even be argued
> >> that it is broken because it starts with
> >> free(startptr) and startptr would likely already be free'd at this
> >> point without resetting its value to NULL.
> >>
> >> From what I understand ctrl_handler() and its installation from
> >> run_fill_buf() could be removed so that it does not override what is
> >> being done with this change. Otherwise, from what I can tell, this
> >> new handler will never run.
> >
> > I think removing ctrl_handler() is fine.
> > In CAT test, it overrides ctrlc_handler().
> > In other tests(CMT,MBA,MBM), the child process used ctrl_handler() to
> > cleanup itself,
> 
> Is that explicit cleanup required? All I can see is free(startptr) and that pointer
> would usually be invalid as I mentioned earlier. If the process crashes while
> running fill_cache() and thus not get a chance to run free(startptr) then the OS
> would release the memory, no?

Sorry, my explanation was not clear.
I agree with you, I think removing ctrl_handler() is OK.

> > but the parent process cleanup the child process with ctrlc_handler()
> properly.
> > Also, avoid using signal().
> >  fill_buf.c:run_fill_buf()
> >  201         /* set up ctrl-c handler */
> >  202         if (signal(SIGINT, ctrl_handler) == SIG_ERR)
> >  203                 printf("Failed to catch SIGINT!\n");
> >
> > I removed ctrl_handler() and ran resctrl_tests with and without the kselftest
> framework.
> > There is no problem.
> 
> Thank you. I only used the CAT test when I found the issue.

Removing ctrl_handler() is only part of the fix in the next version (v5).
All of the fixes are as follows.

--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -98,12 +98,17 @@ void cat_test_cleanup(void)
        remove(RESULT_FILE_NAME2);
 }

+static void ctrlc_handler_child(int signum, siginfo_t *info, void *ptr)
+{
+       exit(EXIT_SUCCESS);
+}
+
 int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
 {
+       struct sigaction sigact;
        unsigned long l_mask, l_mask_1;
        int ret, pipefd[2], sibling_cpu_no;
        char pipe_message;
-       pid_t bm_pid;

        cache_size = 0;

@@ -181,17 +186,33 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
                strcpy(param.filename, RESULT_FILE_NAME1);
                param.num_of_runs = 0;
                param.cpu_no = sibling_cpu_no;
+
+               sigfillset(&sigact.sa_mask);
+               sigact.sa_sigaction = ctrlc_handler_child;
+               sigact.sa_flags = SA_SIGINFO;
+               if (sigaction(SIGINT, &sigact, NULL) ||
+                   sigaction(SIGTERM, &sigact, NULL) ||
+                   sigaction(SIGHUP, &sigact, NULL))
+                       perror("# sigaction");
+       } else {
+               /*
+                * Register CTRL-C handler for parent, as it has to kill
+                * child process before exiting
+                */
+               sigact.sa_sigaction = ctrlc_handler;
+               sigemptyset(&sigact.sa_mask);
+               sigact.sa_flags = SA_SIGINFO;
+               if (sigaction(SIGINT, &sigact, NULL) ||
+                   sigaction(SIGTERM, &sigact, NULL) ||
+                   sigaction(SIGHUP, &sigact, NULL))
+                       perror("# sigaction");
        }

        remove(param.filename);

        ret = cat_val(&param);
-       if (ret)
-               return ret;
-
-       ret = check_results(&param);
-       if (ret)
-               return ret;
+       if (ret == 0)
+               ret = check_results(&param);

        if (bm_pid == 0) {
                /* Tell parent that child is ready */
@@ -199,9 +220,11 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
                pipe_message = 1;
                if (write(pipefd[1], &pipe_message, sizeof(pipe_message)) <
                    sizeof(pipe_message)) {
-                       close(pipefd[1]);
+                       /*
+                        * Just print the error message.
+                        * Let while(1) run and wait for itself to be killed.
+                        */
                        perror("# failed signaling parent process");
-                       return errno;
                }

                close(pipefd[1]);
@@ -226,5 +249,5 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
        if (bm_pid)
                umount_resctrlfs();

-       return 0;
+       return ret;
 }
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 56ccbeae0638..322c6812e15c 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -33,14 +33,6 @@ static void sb(void)
 #endif
 }

-static void ctrl_handler(int signo)
-{
-       free(startptr);
-       printf("\nEnding\n");
-       sb();
-       exit(EXIT_SUCCESS);
-}
-
 static void cl_flush(void *p)
 {
 #if defined(__i386) || defined(__x86_64)
@@ -198,12 +190,6 @@ int run_fill_buf(unsigned long span, int malloc_and_init_memory,
        unsigned long long cache_size = span;
        int ret;

-       /* set up ctrl-c handler */
-       if (signal(SIGINT, ctrl_handler) == SIG_ERR)
-               printf("Failed to catch SIGINT!\n");
-       if (signal(SIGHUP, ctrl_handler) == SIG_ERR)
-               printf("Failed to catch SIGHUP!\n");
-
        ret = fill_cache(cache_size, malloc_and_init_memory, memflush, op,
                         resctrl_val);
        if (ret) {


Best regards,
Shaopeng Tan
  
Reinette Chatre Nov. 30, 2022, 4:33 p.m. UTC | #5
Hi Shaopeng,

On 11/30/2022 12:32 AM, Shaopeng Tan (Fujitsu) wrote:
 
> Removing ctrl_handler() is only part of the fix in the next version(v5).
> All fixes as follows.
> 
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -98,12 +98,17 @@ void cat_test_cleanup(void)
>         remove(RESULT_FILE_NAME2);
>  }
> 
> +static void ctrlc_handler_child(int signum, siginfo_t *info, void *ptr)
> +{
> +       exit(EXIT_SUCCESS);
> +}
> +

Could you please elaborate why this is necessary?

Reinette
  
Shaopeng Tan (Fujitsu) Dec. 1, 2022, 8:20 a.m. UTC | #6
Hi Reinette,

> On 11/30/2022 12:32 AM, Shaopeng Tan (Fujitsu) wrote:
> 
> > Removing ctrl_handler() is only part of the fix in the next version(v5).
> > All fixes as follows.
> >
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -98,12 +98,17 @@ void cat_test_cleanup(void)
> >         remove(RESULT_FILE_NAME2);
> >  }
> >
> > +static void ctrlc_handler_child(int signum, siginfo_t *info, void *ptr)
> > +{
> > +       exit(EXIT_SUCCESS);
> > +}
> > +
> 
> Could you please elaborate why this is necessary?

If "ctrl-c" is entered while running "resctrl_tests -t cat",
SIGINT is sent to all processes (parent and child).

At this time, the child process receives the SIGINT signal but does not take any action.
In this case the parent process may not call ctrlc_handler() as expected.
Therefore, ctrlc_handler_child() is necessary.

It may be better to ignore the signal; then the code can be as simple as follows.
----
        if (bm_pid == 0) {
                param.mask = l_mask_1;
                strcpy(param.ctrlgrp, "c1");
                strcpy(param.mongrp, "m1");
                param.span = cache_size * n / count_of_bits;
                strcpy(param.filename, RESULT_FILE_NAME1);
                param.num_of_runs = 0;
                param.cpu_no = sibling_cpu_no;
                /* Ignore the signals and wait to be cleaned up by the parent process */
                sigfillset(&sigact.sa_mask);
                sigact.sa_handler = SIG_IGN;
                //sigact.sa_sigaction = ctrlc_handler_child;  //delete
                if (sigaction(SIGINT, &sigact, NULL) ||
                    sigaction(SIGTERM, &sigact, NULL) ||
                    sigaction(SIGHUP, &sigact, NULL))
                        perror("# sigaction");
        } else {
----
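
The excerpt above does not show sa_flags being set once sa_handler is
used, so a local struct sigaction would carry whatever happened to be on
the stack. The following is a minimal standalone sketch that zeroes the
structure first. It is not the selftest code, and
ignore_termination_signals() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>

/*
 * Ignore SIGINT, SIGTERM and SIGHUP in the calling process.
 * Returns 0 on success, -1 if any sigaction() call fails.
 */
static int ignore_termination_signals(void)
{
	struct sigaction sigact;

	/* Zero everything so that sa_flags is 0 (no SA_SIGINFO) */
	memset(&sigact, 0, sizeof(sigact));
	sigact.sa_handler = SIG_IGN;
	sigemptyset(&sigact.sa_mask);

	if (sigaction(SIGINT, &sigact, NULL) ||
	    sigaction(SIGTERM, &sigact, NULL) ||
	    sigaction(SIGHUP, &sigact, NULL)) {
		perror("# sigaction");
		return -1;
	}

	return 0;
}
```

Returning an error here, rather than only printing it, lets the caller
decide whether to continue when the dispositions could not be set.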

Best regards,
Shaopeng Tan
  
Reinette Chatre Dec. 1, 2022, 6:12 p.m. UTC | #7
Hi Shaopeng,

On 12/1/2022 12:20 AM, Shaopeng Tan (Fujitsu) wrote:
> Hi Reinette,
> 
>> On 11/30/2022 12:32 AM, Shaopeng Tan (Fujitsu) wrote:
>>
>>> Removing ctrl_handler() is only part of the fix in the next version(v5).
>>> All fixes as follows.
>>>
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -98,12 +98,17 @@ void cat_test_cleanup(void)
>>>         remove(RESULT_FILE_NAME2);
>>>  }
>>>
>>> +static void ctrlc_handler_child(int signum, siginfo_t *info, void *ptr)
>>> +{
>>> +       exit(EXIT_SUCCESS);
>>> +}
>>> +
>>
>> Could you please elaborate why this is necessary?
> 
> If enter "ctrl-c" when running "resctrl_tests -t cat",
> SIGINT will be sent to all processes (parent&child).
> 
> At this time, the child process receives a SIGINT signal, but does not take any action.
> In this case the parent process may not call ctrlc_handler() as expected.

Apologies, but I am not able to follow. My understanding is that the
ideal in both the working and failing cases is for the parent to kill the child.
Could you please elaborate why ctrlc_handler() may not be called?

> Therefore, ctrlc_handler_child() is necessary.
> 
> It may be better to ignore the signal, then code can be simple as follows.
> ----
>         if (bm_pid == 0) {
>                 param.mask = l_mask_1;
>                 strcpy(param.ctrlgrp, "c1");
>                 strcpy(param.mongrp, "m1");
>                 param.span = cache_size * n / count_of_bits;
>                 strcpy(param.filename, RESULT_FILE_NAME1);
>                 param.num_of_runs = 0;
>                 param.cpu_no = sibling_cpu_no;
>                 /* Ignore the signal,and wait to be cleaned up by the parent process */
>                 sigfillset(&sigact.sa_mask);
>                 sigact.sa_handler = SIG_IGN;
>                 //sigact.sa_sigaction = ctrlc_handler_child;  //delete
>                 if (sigaction(SIGINT, &sigact, NULL) ||
>                     sigaction(SIGTERM, &sigact, NULL) ||
>                     sigaction(SIGHUP, &sigact, NULL))
>                         perror("# sigaction");
>         } else {

Reinette
  
Shaopeng Tan (Fujitsu) Dec. 16, 2022, 8:20 a.m. UTC | #8
Hi Reinette

> On 12/1/2022 12:20 AM, Shaopeng Tan (Fujitsu) wrote:
> > Hi Reinette,
> >
> >> On 11/30/2022 12:32 AM, Shaopeng Tan (Fujitsu) wrote:
> >>
> >>> Removing ctrl_handler() is only part of the fix in the next version(v5).
> >>> All fixes as follows.
> >>>
> >>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>> @@ -98,12 +98,17 @@ void cat_test_cleanup(void)
> >>>         remove(RESULT_FILE_NAME2);
> >>>  }
> >>>
> >>> +static void ctrlc_handler_child(int signum, siginfo_t *info, void *ptr)
> >>> +{
> >>> +       exit(EXIT_SUCCESS);
> >>> +}
> >>> +
> >>
> >> Could you please elaborate why this is necessary?
> >
> > If enter "ctrl-c" when running "resctrl_tests -t cat", SIGINT will be
> > sent to all processes (parent&child).
> >
> > At this time, the child process receives a SIGINT signal, but does not take any
> action.
> > In this case the parent process may not call ctrlc_handler() as expected.
> 
> Apologies, but I am not able to follow. My understanding is that the ideal in
> working an failing case is for the parent to kill the child.
> Could you please elaborate why the ctrlc_handler() may not be called?

Apologies for the late reply.

The problem is that when the CAT test runs,
the ctrlc_handler set by the previous MBM/MBA/CMT tests is inherited by the child process.

Let me explain in detail:
In resctrl_tests, the default run order of the tests is MBM->MBA->CMT->CAT.
When MBM, MBA and CMT run, a signal handler (ctrlc_handler) is set for the parent process.
After these tests, when fork() is executed in CAT,
the signal handler set by MBM/MBA/CMT is inherited by both the parent and child processes of CAT.
At this point, if "ctrl+c" sends SIGINT to the parent and child processes,
then, following the inherited signal handler,
the child process may kill the parent process before the parent process kills the child process.
Therefore, when running all tests (MBM->MBA->CMT->CAT),
the signal handler of the child process needs to be overridden in CAT.

Also, when running only the CAT test,
there is no signal handler to inherit from other tests,
so the signal handler of the parent process needs to be set.
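
The inheritance described above can be checked in isolation. The following
is a minimal sketch, not the selftest code: parent_handler() is a
hypothetical stand-in for ctrlc_handler() and child_inherits_handler() is a
made-up helper name. The child only queries its own disposition and
reports the result through its exit status.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a handler left installed by an earlier test */
static void parent_handler(int signum, siginfo_t *info, void *ptr)
{
	(void)signum;
	(void)info;
	(void)ptr;
}

/*
 * Returns 1 if a forked child sees the handler installed before fork(),
 * 0 if it does not, -1 on error.
 */
static int child_inherits_handler(void)
{
	struct sigaction sigact, cur;
	pid_t pid;
	int status;

	memset(&sigact, 0, sizeof(sigact));
	sigact.sa_sigaction = parent_handler;
	sigemptyset(&sigact.sa_mask);
	sigact.sa_flags = SA_SIGINFO;
	if (sigaction(SIGINT, &sigact, NULL))
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: read back the current SIGINT disposition */
		if (sigaction(SIGINT, NULL, &cur))
			_exit(2);
		_exit(cur.sa_sigaction == parent_handler ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	/* Leave SIGINT at its default disposition in the parent */
	signal(SIGINT, SIG_DFL);

	if (WEXITSTATUS(status) == 2)
		return -1;
	return WEXITSTATUS(status) == 0;
}
```

A return value of 1 is expected: fork() copies the parent's signal
dispositions into the child, so a handler left over from an earlier test is
also active in a child created later.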

> > Therefore, ctrlc_handler_child() is necessary.
> >
> > It may be better to ignore the signal, then code can be simple as follows.
> > ----
> >         if (bm_pid == 0) {
> >                 param.mask = l_mask_1;
> >                 strcpy(param.ctrlgrp, "c1");
> >                 strcpy(param.mongrp, "m1");
> >                 param.span = cache_size * n / count_of_bits;
> >                 strcpy(param.filename, RESULT_FILE_NAME1);
> >                 param.num_of_runs = 0;
> >                 param.cpu_no = sibling_cpu_no;
> >                 /* Ignore the signal,and wait to be cleaned up by the parent
> process */
> >                 sigfillset(&sigact.sa_mask);
> >                 sigact.sa_handler = SIG_IGN;
> >                 //sigact.sa_sigaction = ctrlc_handler_child;  //delete
> >                 if (sigaction(SIGINT, &sigact, NULL) ||
> >                     sigaction(SIGTERM, &sigact, NULL) ||
> >                     sigaction(SIGHUP, &sigact, NULL))
> >                         perror("# sigaction");
> >         } else {

Best regards,
Shaopeng
  
Reinette Chatre Dec. 16, 2022, 7:08 p.m. UTC | #9
Hi Shaopeng,

On 12/16/2022 12:20 AM, Shaopeng Tan (Fujitsu) wrote:
> Hi Reinette
> 
>> On 12/1/2022 12:20 AM, Shaopeng Tan (Fujitsu) wrote:
>>> Hi Reinette,
>>>
>>>> On 11/30/2022 12:32 AM, Shaopeng Tan (Fujitsu) wrote:
>>>>
>>>>> Removing ctrl_handler() is only part of the fix in the next version(v5).
>>>>> All fixes as follows.
>>>>>
>>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>>> @@ -98,12 +98,17 @@ void cat_test_cleanup(void)
>>>>>         remove(RESULT_FILE_NAME2);
>>>>>  }
>>>>>
>>>>> +static void ctrlc_handler_child(int signum, siginfo_t *info, void *ptr)
>>>>> +{
>>>>> +       exit(EXIT_SUCCESS);
>>>>> +}
>>>>> +
>>>>
>>>> Could you please elaborate why this is necessary?
>>>
>>> If enter "ctrl-c" when running "resctrl_tests -t cat", SIGINT will be
>>> sent to all processes (parent&child).
>>>
>>> At this time, the child process receives a SIGINT signal, but does not take any
>> action.
>>> In this case the parent process may not call ctrlc_handler() as expected.
>>
>> Apologies, but I am not able to follow. My understanding is that the ideal in
>> working an failing case is for the parent to kill the child.
>> Could you please elaborate why the ctrlc_handler() may not be called?
> 
> Apologies for the late replay.
> 
> The problem is that at the time of running CAT test, 
> previous ctrlc_handler from MBM/MBA/CMT test will be inherited to child process.
> 
> Let me explain in detail:
> In resctrl_tests,the default run order of the tests is MBM->MBA->CMT->CAT.
> When running MBM, MBA, CMT, signal handler(ctrlc_handler) was set to the parent process.
> After these tests, when fork() is executed in CAT, 
> the signal handler set by MBM/MBA/CMT is inherited by the parent&child process of CAT.
> At this time, if "ctrl+c" SIGINT is sent to parent&child process,
> according to the inherited signal handler,
> the child process may kill parent process before parent process kills child process.
> Therefore, when running all tests(MBM->MBA->CMT->CAT),
> signal handler of child process need to be overridden in CAT.

Thank you for the additional details. I do not think that this should be handled
in the CAT test. The CAT test should not need to work around leftovers from the other
tests, that will just leave us with harder code to maintain. Instead, I think this should
be addressed in resctrl_val() where the signal handler is set for the
MBM/MBA/CMT tests. That is, when the test is complete and the test case
specific signal handler no longer needed (after test itself kills the child and
handler thus no longer needed), the signal handler should be reset to SIG_DFL.

This should give the CAT test a clean slate to work with.
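
A minimal sketch of that reset, assuming a pair of hypothetical helpers
(install_test_handlers() and reset_test_handlers()) and a stand-in
test_handler() in place of ctrlc_handler(). It illustrates the idea only
and is not the resctrl_val() code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Signals the tests in this thread register handlers for */
static const int test_signals[] = { SIGINT, SIGTERM, SIGHUP };

/* Hypothetical stand-in for ctrlc_handler() */
static void test_handler(int signum, siginfo_t *info, void *ptr)
{
	(void)signum;
	(void)info;
	(void)ptr;
}

/* Apply one disposition to every signal in test_signals[] */
static int set_dispositions(const struct sigaction *sigact)
{
	size_t i;

	for (i = 0; i < sizeof(test_signals) / sizeof(test_signals[0]); i++) {
		if (sigaction(test_signals[i], sigact, NULL))
			return -1;
	}

	return 0;
}

/* Install the test-specific handler before the test starts */
static int install_test_handlers(void)
{
	struct sigaction sigact;

	memset(&sigact, 0, sizeof(sigact));
	sigact.sa_sigaction = test_handler;
	sigemptyset(&sigact.sa_mask);
	sigact.sa_flags = SA_SIGINFO;

	return set_dispositions(&sigact);
}

/* Restore the default dispositions once the test is complete */
static int reset_test_handlers(void)
{
	struct sigaction sigact;

	memset(&sigact, 0, sizeof(sigact));
	sigact.sa_handler = SIG_DFL;
	sigemptyset(&sigact.sa_mask);

	return set_dispositions(&sigact);
}
```

Pairing the two calls around a test keeps a handler from one test from
leaking into the next, which is the clean slate described above.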

> 
> Also, when running CAT test only,
> since there are no signal handler that can be inherited from other tests,
> signal handler of parent process need to be set.

Yes, this is clear. It is the child signal handler that was confusing to me.

Reinette
  

Patch

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 6a8306b0a109..1f8f5cf94e95 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -100,10 +100,10 @@  void cat_test_cleanup(void)
 
 int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
 {
+	struct sigaction sigact;
 	unsigned long l_mask, l_mask_1;
 	int ret, pipefd[2], sibling_cpu_no;
 	char pipe_message;
-	pid_t bm_pid;
 
 	cache_size = 0;
 
@@ -181,17 +181,25 @@  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
 		strcpy(param.filename, RESULT_FILE_NAME1);
 		param.num_of_runs = 0;
 		param.cpu_no = sibling_cpu_no;
+	} else {
+		/*
+		 * Register CTRL-C handler for parent, as it has to kill
+		 * child process before exiting
+		 */
+		sigact.sa_sigaction = ctrlc_handler;
+		sigemptyset(&sigact.sa_mask);
+		sigact.sa_flags = SA_SIGINFO;
+		if (sigaction(SIGINT, &sigact, NULL) ||
+		    sigaction(SIGTERM, &sigact, NULL) ||
+		    sigaction(SIGHUP, &sigact, NULL))
+			perror("# sigaction");
 	}
 
 	remove(param.filename);
 
 	ret = cat_val(&param);
-	if (ret)
-		return ret;
-
-	ret = check_results(&param);
-	if (ret)
-		return ret;
+	if (ret == 0)
+		ret = check_results(&param);
 
 	if (bm_pid == 0) {
 		/* Tell parent that child is ready */
@@ -199,9 +207,11 @@  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
 		pipe_message = 1;
 		if (write(pipefd[1], &pipe_message, sizeof(pipe_message)) <
 		    sizeof(pipe_message)) {
-			close(pipefd[1]);
+			/*
+			 * Just print the error message.
+			 * Let while(1) run and wait for itself to be killed.
+			 */
 			perror("# failed signaling parent process");
-			return errno;
 		}
 
 		close(pipefd[1]);
@@ -226,5 +236,5 @@  int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
 	if (bm_pid)
 		umount_resctrlfs();
 
-	return 0;
+	return ret;
 }