libgcc: fix Win32 CV abnormal spurious wakeups in timed wait [PR113850]

Message ID 20240209141858.75189-1-matteo@mitalia.net
State Accepted
Headers
Series libgcc: fix Win32 CV abnormal spurious wakeups in timed wait [PR113850] |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Matteo Italia Feb. 9, 2024, 2:18 p.m. UTC
  The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
the timespec used in gthreads to specify the absolute time for end of
the condition variables timed wait to a milliseconds value relative to
"now" to pass to the Win32 SleepConditionVariableCS function.

Unfortunately, the conversion is incorrect, as, due to a typo, it
returns the relative time _in seconds_, so SleepConditionVariableCS
receives a timeout value 1000 times shorter than it should be, resulting
in a huge amount of spurious wakeups in calls such as
std::condition_variable::wait_for or wait_until.

This can be demonstrated easily by this sample program:

```

int main() {
    std::condition_variable cv;
    std::mutex mx;
    bool pass = false;

    auto thread_fn = [&](bool timed) {
        int wakeups = 0;
        using sc = std::chrono::system_clock;
        auto before = sc::now();
        std::unique_lock<std::mutex> ml(mx);
        if (timed) {
            cv.wait_for(ml, std::chrono::seconds(2), [&]{
                ++wakeups;
                return pass;
            });
        } else {
            cv.wait(ml, [&]{
                ++wakeups;
                return pass;
            });
        }
        printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
                int((sc::now() - before) / std::chrono::milliseconds(1)));
        pass = false;
    };

    {
        // timed wait, let expire
        std::thread t(thread_fn, true);
        t.join();
    }

    {
        // timed wait, wake up explicitly after 1 second
        std::thread t(thread_fn, true);
        std::this_thread::sleep_for(std::chrono::seconds(1));
        {
            std::unique_lock<std::mutex> ml(mx);
            pass = true;
        }
        cv.notify_all();
        t.join();
    }

    {
        // non-timed wait, wake up explicitly after 1 second
        std::thread t(thread_fn, false);
        std::this_thread::sleep_for(std::chrono::seconds(1));
        {
            std::unique_lock<std::mutex> ml(mx);
            pass = true;
        }
        cv.notify_all();
        t.join();
    }
    return 0;
}
```

On builds that other threading models (e.g. POSIX on Linux, or
winpthreads or MCF on Win32) the output is something like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```

while on the affected builds it's more like
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```
(notice the huge number of wakeups).

This commit fixes the conversion, adjusting the final division by
NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
file and not used in any other place, so the problem here was probably a
typo or some rebase/refactoring mishap).

libgcc/ChangeLog:

	* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
          fix absolute timespec to relative milliseconds count
          conversion (it incorrectly returned seconds instead of
          milliseconds); this avoids spurious wakeups in
          __gthr_win32_cond_timedwait
---
 libgcc/config/i386/gthr-win32-cond.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Matteo Italia Feb. 10, 2024, 10:10 a.m. UTC | #1
Il 09/02/24 15:18, Matteo Italia ha scritto:
> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
> the timespec used in gthreads to specify the absolute time for end of
> the condition variables timed wait to a milliseconds value relative to
> "now" to pass to the Win32 SleepConditionVariableCS function.
>
> Unfortunately, the conversion is incorrect, as, due to a typo, it
> returns the relative time _in seconds_, so SleepConditionVariableCS
> receives a timeout value 1000 times shorter than it should be, resulting
> in a huge amount of spurious wakeups in calls such as
> std::condition_variable::wait_for or wait_until.
>
> This can be demonstrated easily by this sample program:
>
> ```
>
> int main() {
>      std::condition_variable cv;
>      std::mutex mx;
>      bool pass = false;
>
>      auto thread_fn = [&](bool timed) {
>          int wakeups = 0;
>          using sc = std::chrono::system_clock;
>          auto before = sc::now();
>          std::unique_lock<std::mutex> ml(mx);
>          if (timed) {
>              cv.wait_for(ml, std::chrono::seconds(2), [&]{
>                  ++wakeups;
>                  return pass;
>              });
>          } else {
>              cv.wait(ml, [&]{
>                  ++wakeups;
>                  return pass;
>              });
>          }
>          printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
>                  int((sc::now() - before) / std::chrono::milliseconds(1)));
>          pass = false;
>      };
>
>      {
>          // timed wait, let expire
>          std::thread t(thread_fn, true);
>          t.join();
>      }
>
>      {
>          // timed wait, wake up explicitly after 1 second
>          std::thread t(thread_fn, true);
>          std::this_thread::sleep_for(std::chrono::seconds(1));
>          {
>              std::unique_lock<std::mutex> ml(mx);
>              pass = true;
>          }
>          cv.notify_all();
>          t.join();
>      }
>
>      {
>          // non-timed wait, wake up explicitly after 1 second
>          std::thread t(thread_fn, false);
>          std::this_thread::sleep_for(std::chrono::seconds(1));
>          {
>              std::unique_lock<std::mutex> ml(mx);
>              pass = true;
>          }
>          cv.notify_all();
>          t.join();
>      }
>      return 0;
> }
> ```
>
> On builds that other threading models (e.g. POSIX on Linux, or
> winpthreads or MCF on Win32) the output is something like
> ```
> pass: 0; wakeups: 2; elapsed: 2000 ms
> pass: 1; wakeups: 2; elapsed: 991 ms
> pass: 1; wakeups: 2; elapsed: 996 ms
> ```
>
> while on the affected builds it's more like
> ```
> pass: 0; wakeups: 1418; elapsed: 2000 ms
> pass: 1; wakeups: 479; elapsed: 988 ms
> pass: 1; wakeups: 2; elapsed: 992 ms
> ```
> (notice the huge number of wakeups).
>
> This commit fixes the conversion, adjusting the final division by
> NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
> file and not used in any other place, so the problem here was probably a
> typo or some rebase/refactoring mishap).
>
> libgcc/ChangeLog:
>
> 	* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
>            fix absolute timespec to relative milliseconds count
>            conversion (it incorrectly returned seconds instead of
>            milliseconds); this avoids spurious wakeups in
>            __gthr_win32_cond_timedwait
> ---
>   libgcc/config/i386/gthr-win32-cond.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libgcc/config/i386/gthr-win32-cond.c b/libgcc/config/i386/gthr-win32-cond.c
> index ecb007a54b2..650c448f286 100644
> --- a/libgcc/config/i386/gthr-win32-cond.c
> +++ b/libgcc/config/i386/gthr-win32-cond.c
> @@ -78,7 +78,7 @@ __gthr_win32_abs_to_rel_time (const __gthread_time_t *abs_time)
>     if (abs_time_nsec100 < now.nsec100)
>       return 0;
>   
> -  return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_SEC);
> +  return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_MSEC);
>   }
>   
>   /* Check the sizes of the local version of the Win32 types.  */
Re-reading the commit message I found a few typos, and it was generally 
a bit more obscure than I like; reworded it now, hope it's better.
  
Jonathan Yong Feb. 17, 2024, 12:24 a.m. UTC | #2
On 2/10/24 10:10, Matteo Italia wrote:
> Il 09/02/24 15:18, Matteo Italia ha scritto:
>> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
>> the timespec used in gthreads to specify the absolute time for end of
>> the condition variables timed wait to a milliseconds value relative to
>> "now" to pass to the Win32 SleepConditionVariableCS function.
>>
>> Unfortunately, the conversion is incorrect, as, due to a typo, it
>> returns the relative time _in seconds_, so SleepConditionVariableCS
>> receives a timeout value 1000 times shorter than it should be, resulting
>> in a huge amount of spurious wakeups in calls such as
>> std::condition_variable::wait_for or wait_until.
>>
> Re-reading the commit message I found a few typos, and it was generally 
> a bit more obscure than I like; reworded it now, hope it's better.

Thanks, pushed to master and 13.x branches.
  
Matteo Italia Feb. 19, 2024, 1:48 p.m. UTC | #3
Il 17/02/24 01:24, Jonathan Yong ha scritto:
> On 2/10/24 10:10, Matteo Italia wrote:
>> Il 09/02/24 15:18, Matteo Italia ha scritto:
>>> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
>>> the timespec used in gthreads to specify the absolute time for end of
>>> the condition variables timed wait to a milliseconds value relative to
>>> "now" to pass to the Win32 SleepConditionVariableCS function.
>>>
>>> Unfortunately, the conversion is incorrect, as, due to a typo, it
>>> returns the relative time _in seconds_, so SleepConditionVariableCS
>>> receives a timeout value 1000 times shorter than it should be, 
>>> resulting
>>> in a huge amount of spurious wakeups in calls such as
>>> std::condition_variable::wait_for or wait_until.
>>>
>> Re-reading the commit message I found a few typos, and it was 
>> generally a bit more obscure than I like; reworded it now, hope it's 
>> better.
>
> Thanks, pushed to master and 13.x branches.
Great, thank you! Do I need to change the status of the Bugzilla entry 
to RESOLVED, or it's going to be closed automatically at the next 
releases, or something else?
  
Jonathan Yong Feb. 19, 2024, 1:55 p.m. UTC | #4
On 2/19/24 13:48, Matteo Italia wrote:
> Il 17/02/24 01:24, Jonathan Yong ha scritto:
>> On 2/10/24 10:10, Matteo Italia wrote:
>>> Il 09/02/24 15:18, Matteo Italia ha scritto:
>>>> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
>>>> the timespec used in gthreads to specify the absolute time for end of
>>>> the condition variables timed wait to a milliseconds value relative to
>>>> "now" to pass to the Win32 SleepConditionVariableCS function.
>>>>
>>>> Unfortunately, the conversion is incorrect, as, due to a typo, it
>>>> returns the relative time _in seconds_, so SleepConditionVariableCS
>>>> receives a timeout value 1000 times shorter than it should be, 
>>>> resulting
>>>> in a huge amount of spurious wakeups in calls such as
>>>> std::condition_variable::wait_for or wait_until.
>>>>
>>> Re-reading the commit message I found a few typos, and it was 
>>> generally a bit more obscure than I like; reworded it now, hope it's 
>>> better.
>>
>> Thanks, pushed to master and 13.x branches.
> Great, thank you! Do I need to change the status of the Bugzilla entry 
> to RESOLVED, or it's going to be closed automatically at the next 
> releases, or something else?

Closed as resolved, thanks.
  

Patch

diff --git a/libgcc/config/i386/gthr-win32-cond.c b/libgcc/config/i386/gthr-win32-cond.c
index ecb007a54b2..650c448f286 100644
--- a/libgcc/config/i386/gthr-win32-cond.c
+++ b/libgcc/config/i386/gthr-win32-cond.c
@@ -78,7 +78,7 @@  __gthr_win32_abs_to_rel_time (const __gthread_time_t *abs_time)
   if (abs_time_nsec100 < now.nsec100)
     return 0;
 
-  return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_SEC);
+  return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_MSEC);
 }
 
 /* Check the sizes of the local version of the Win32 types.  */