[PATCHSET v2 0/2] Split iowait into two states

Message ID 20240227211152.1099534-1-axboe@kernel.dk


Jens Axboe Feb. 27, 2024, 9:06 p.m. UTC
  Hi,

This is v2 of the patch posted yesterday, where the current in_iowait
state is split into two parts:

1) The "task is sleeping waiting on IO", and would like cpufreq goodness
   in terms of sleep and wakeup latencies.
2) The above, and also accounted as such in the iowait stats.

The current ->in_iowait covers both; with this series, ->in_iowait covers
case 1 above and ->in_iowait_acct covers case 2. A task cannot have
->in_iowait_acct set without also having ->in_iowait set.

Patch 1 is a prep patch that turns rq->nr_iowait into an unsigned int
rather than an atomic_t; the reasons are given in that patch.

Patch 2 adds the ->in_iowait_acct state inside the current ->in_iowait
setting.

I haven't been able to properly benchmark patch 1, as the atomics are
noise in any workload that approximates normality. I can certainly
concoct a synthetic test case if folks are interested. My gut says that
we're trading three fast-path atomics for none, with the fourth case,
where we grab the rq lock, _probably_ being far less likely.

Comments welcome! Peter, CC'ing you since I did on the previous posting;
feel free to ignore.

Since v1:
- Add prep patch 1, switching nr_iowait to an unsigned int
- Modify patch 2 to not use atomic_t as well, no changes otherwise

 arch/s390/appldata/appldata_base.c |  2 +-
 arch/s390/appldata/appldata_os.c   |  2 +-
 fs/proc/stat.c                     |  2 +-
 include/linux/sched.h              |  6 ++++
 include/linux/sched/stat.h         | 10 ++++--
 kernel/sched/core.c                | 55 +++++++++++++++++++++++-------
 kernel/sched/cputime.c             |  2 +-
 kernel/sched/sched.h               |  3 +-
 kernel/time/tick-sched.c           |  6 ++--
 9 files changed, 66 insertions(+), 22 deletions(-)
  

Comments

Jens Axboe Feb. 28, 2024, 2:21 a.m. UTC | #1
On 2/27/24 2:06 PM, Jens Axboe wrote:
> I haven't been able to properly benchmark patch 1, as the atomics are
> noise in any workloads that approximate normality. I can certainly
> concoct a synthetic test case if folks are interested. My gut says that
> we're trading 3 fast path atomics for none, and with the 4th case
> _probably_ being way less likely. There we grab the rq lock.

OK, so on Chris's suggestion, I tried his schbench to exercise the
scheduling side. It's very futex intensive, so I hacked up futex to set
iowait state when sleeping. I also added simple accounting to that path
so I knew how many times it ran. A run of:

./schbench -m 60 -t 10 -p 8

on a 2-socket Intel(R) Xeon(R) Platinum 8458P with 176 threads shows no
regression in performance, and try_to_wake_up() locking the rq of the
task being scheduled in from another CPU doesn't seem to register much.
On the previous run I saw 2.21% there, and now it's 2.36%. But it was
also a better performing run, which may have led to the increase.

Each run takes 30 seconds, and during that time I see around 290-310M
hits of that path, or roughly 10M/sec. Without modifying futex to use
iowait, we obviously rarely hit it: about 200 times per run, which
makes sense as we're not really doing IO.

Anyway, just some data on this. If I leave the futex/pipe iowait in and
run the same test, I see no discernible difference in profiles. In fact,
the highest cost across the tests is bringing in the task->in_iowait
cacheline.