libgcc: fix Win32 CV abnormal spurious wakeups in timed wait [PR113850]
Checks
Commit Message
The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
the timespec used in gthreads to specify the absolute time for end of
the condition variables timed wait to a milliseconds value relative to
"now" to pass to the Win32 SleepConditionVariableCS function.
Unfortunately, the conversion is incorrect, as, due to a typo, it
returns the relative time _in seconds_, so SleepConditionVariableCS
receives a timeout value 1000 times shorter than it should be, resulting
in a huge amount of spurious wakeups in calls such as
std::condition_variable::wait_for or wait_until.
This can be demonstrated easily by this sample program:
```
int main() {
std::condition_variable cv;
std::mutex mx;
bool pass = false;
auto thread_fn = [&](bool timed) {
int wakeups = 0;
using sc = std::chrono::system_clock;
auto before = sc::now();
std::unique_lock<std::mutex> ml(mx);
if (timed) {
cv.wait_for(ml, std::chrono::seconds(2), [&]{
++wakeups;
return pass;
});
} else {
cv.wait(ml, [&]{
++wakeups;
return pass;
});
}
printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
int((sc::now() - before) / std::chrono::milliseconds(1)));
pass = false;
};
{
// timed wait, let expire
std::thread t(thread_fn, true);
t.join();
}
{
// timed wait, wake up explicitly after 1 second
std::thread t(thread_fn, true);
std::this_thread::sleep_for(std::chrono::seconds(1));
{
std::unique_lock<std::mutex> ml(mx);
pass = true;
}
cv.notify_all();
t.join();
}
{
// non-timed wait, wake up explicitly after 1 second
std::thread t(thread_fn, false);
std::this_thread::sleep_for(std::chrono::seconds(1));
{
std::unique_lock<std::mutex> ml(mx);
pass = true;
}
cv.notify_all();
t.join();
}
return 0;
}
```
On builds that other threading models (e.g. POSIX on Linux, or
winpthreads or MCF on Win32) the output is something like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```
while on the affected builds it's more like
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```
(notice the huge number of wakeups).
This commit fixes the conversion, adjusting the final division by
NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
file and not used in any other place, so the problem here was probably a
typo or some rebase/refactoring mishap).
libgcc/ChangeLog:
* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
fix absolute timespec to relative milliseconds count
conversion (it incorrectly returned seconds instead of
milliseconds); this avoids spurious wakeups in
__gthr_win32_cond_timedwait
---
libgcc/config/i386/gthr-win32-cond.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Il 09/02/24 15:18, Matteo Italia ha scritto:
> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
> the timespec used in gthreads to specify the absolute time for end of
> the condition variables timed wait to a milliseconds value relative to
> "now" to pass to the Win32 SleepConditionVariableCS function.
>
> Unfortunately, the conversion is incorrect, as, due to a typo, it
> returns the relative time _in seconds_, so SleepConditionVariableCS
> receives a timeout value 1000 times shorter than it should be, resulting
> in a huge amount of spurious wakeups in calls such as
> std::condition_variable::wait_for or wait_until.
>
> This can be demonstrated easily by this sample program:
>
> ```
>
> int main() {
> std::condition_variable cv;
> std::mutex mx;
> bool pass = false;
>
> auto thread_fn = [&](bool timed) {
> int wakeups = 0;
> using sc = std::chrono::system_clock;
> auto before = sc::now();
> std::unique_lock<std::mutex> ml(mx);
> if (timed) {
> cv.wait_for(ml, std::chrono::seconds(2), [&]{
> ++wakeups;
> return pass;
> });
> } else {
> cv.wait(ml, [&]{
> ++wakeups;
> return pass;
> });
> }
> printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
> int((sc::now() - before) / std::chrono::milliseconds(1)));
> pass = false;
> };
>
> {
> // timed wait, let expire
> std::thread t(thread_fn, true);
> t.join();
> }
>
> {
> // timed wait, wake up explicitly after 1 second
> std::thread t(thread_fn, true);
> std::this_thread::sleep_for(std::chrono::seconds(1));
> {
> std::unique_lock<std::mutex> ml(mx);
> pass = true;
> }
> cv.notify_all();
> t.join();
> }
>
> {
> // non-timed wait, wake up explicitly after 1 second
> std::thread t(thread_fn, false);
> std::this_thread::sleep_for(std::chrono::seconds(1));
> {
> std::unique_lock<std::mutex> ml(mx);
> pass = true;
> }
> cv.notify_all();
> t.join();
> }
> return 0;
> }
> ```
>
> On builds that other threading models (e.g. POSIX on Linux, or
> winpthreads or MCF on Win32) the output is something like
> ```
> pass: 0; wakeups: 2; elapsed: 2000 ms
> pass: 1; wakeups: 2; elapsed: 991 ms
> pass: 1; wakeups: 2; elapsed: 996 ms
> ```
>
> while on the affected builds it's more like
> ```
> pass: 0; wakeups: 1418; elapsed: 2000 ms
> pass: 1; wakeups: 479; elapsed: 988 ms
> pass: 1; wakeups: 2; elapsed: 992 ms
> ```
> (notice the huge number of wakeups).
>
> This commit fixes the conversion, adjusting the final division by
> NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
> file and not used in any other place, so the problem here was probably a
> typo or some rebase/refactoring mishap).
>
> libgcc/ChangeLog:
>
> * config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
> fix absolute timespec to relative milliseconds count
> conversion (it incorrectly returned seconds instead of
> milliseconds); this avoids spurious wakeups in
> __gthr_win32_cond_timedwait
> ---
> libgcc/config/i386/gthr-win32-cond.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libgcc/config/i386/gthr-win32-cond.c b/libgcc/config/i386/gthr-win32-cond.c
> index ecb007a54b2..650c448f286 100644
> --- a/libgcc/config/i386/gthr-win32-cond.c
> +++ b/libgcc/config/i386/gthr-win32-cond.c
> @@ -78,7 +78,7 @@ __gthr_win32_abs_to_rel_time (const __gthread_time_t *abs_time)
> if (abs_time_nsec100 < now.nsec100)
> return 0;
>
> - return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_SEC);
> + return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_MSEC);
> }
>
> /* Check the sizes of the local version of the Win32 types. */
Re-reading the commit message I found a few typos, and it was generally
a bit more obscure than I like; reworded it now, hope it's better.
On 2/10/24 10:10, Matteo Italia wrote:
> Il 09/02/24 15:18, Matteo Italia ha scritto:
>> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
>> the timespec used in gthreads to specify the absolute time for end of
>> the condition variables timed wait to a milliseconds value relative to
>> "now" to pass to the Win32 SleepConditionVariableCS function.
>>
>> Unfortunately, the conversion is incorrect, as, due to a typo, it
>> returns the relative time _in seconds_, so SleepConditionVariableCS
>> receives a timeout value 1000 times shorter than it should be, resulting
>> in a huge amount of spurious wakeups in calls such as
>> std::condition_variable::wait_for or wait_until.
>>
> Re-reading the commit message I found a few typos, and it was generally
> a bit more obscure than I like; reworded it now, hope it's better.
Thanks, pushed to master and 13.x branches.
Il 17/02/24 01:24, Jonathan Yong ha scritto:
> On 2/10/24 10:10, Matteo Italia wrote:
>> Il 09/02/24 15:18, Matteo Italia ha scritto:
>>> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
>>> the timespec used in gthreads to specify the absolute time for end of
>>> the condition variables timed wait to a milliseconds value relative to
>>> "now" to pass to the Win32 SleepConditionVariableCS function.
>>>
>>> Unfortunately, the conversion is incorrect, as, due to a typo, it
>>> returns the relative time _in seconds_, so SleepConditionVariableCS
>>> receives a timeout value 1000 times shorter than it should be,
>>> resulting
>>> in a huge amount of spurious wakeups in calls such as
>>> std::condition_variable::wait_for or wait_until.
>>>
>> Re-reading the commit message I found a few typos, and it was
>> generally a bit more obscure than I like; reworded it now, hope it's
>> better.
>
> Thanks, pushed to master and 13.x branches.
Great, thank you! Do I need to change the status of the Bugzilla entry
to RESOLVED, or it's going to be closed automatically at the next
releases, or something else?
On 2/19/24 13:48, Matteo Italia wrote:
> Il 17/02/24 01:24, Jonathan Yong ha scritto:
>> On 2/10/24 10:10, Matteo Italia wrote:
>>> Il 09/02/24 15:18, Matteo Italia ha scritto:
>>>> The Win32 threading model uses __gthr_win32_abs_to_rel_time to convert
>>>> the timespec used in gthreads to specify the absolute time for end of
>>>> the condition variables timed wait to a milliseconds value relative to
>>>> "now" to pass to the Win32 SleepConditionVariableCS function.
>>>>
>>>> Unfortunately, the conversion is incorrect, as, due to a typo, it
>>>> returns the relative time _in seconds_, so SleepConditionVariableCS
>>>> receives a timeout value 1000 times shorter than it should be,
>>>> resulting
>>>> in a huge amount of spurious wakeups in calls such as
>>>> std::condition_variable::wait_for or wait_until.
>>>>
>>> Re-reading the commit message I found a few typos, and it was
>>> generally a bit more obscure than I like; reworded it now, hope it's
>>> better.
>>
>> Thanks, pushed to master and 13.x branches.
> Great, thank you! Do I need to change the status of the Bugzilla entry
> to RESOLVED, or it's going to be closed automatically at the next
> releases, or something else?
Closed as resolved, thanks.
@@ -78,7 +78,7 @@ __gthr_win32_abs_to_rel_time (const __gthread_time_t *abs_time)
if (abs_time_nsec100 < now.nsec100)
return 0;
- return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_SEC);
+ return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_MSEC);
}
/* Check the sizes of the local version of the Win32 types. */