[v3,0/2] x86/resctrl: fix task CLOSID update race

Message ID 20221115141953.816851-1-peternewman@google.com
Headers
Series x86/resctrl: fix task CLOSID update race |

Message

Peter Newman Nov. 15, 2022, 2:19 p.m. UTC
  Hi Reinette, Fenghua,

I've reorganized the patches for clarity, following James's guidance.

The patch series addresses the IPI race we discussed in the container
move RFD thread[1].

The first patch changes group-wide CLOSID/RMID updates to IPI all CPUs.
Now that the synchronization cost of correctly updating a single task is
more than originally thought, we believe that it's cheaper to IPI all
CPUs than forming a more precise CPU mask by synchronizing with all
tasks in an rdtgroup, especially when there is a large number of tasks
in the group. It's possible that this update could upset users who
frequently delete groups with few tasks. If anyone is aware of a use
case that frequently deletes groups, we can consider mitigations.

The second one uses the new task_call_func() interface to serialize
updating closid and rmid with any context switch of the task. AFAICT,
the implementation of this function acts like a mutex with context
switch, but I'm not certain whether it is intended to be one. If this is
not how task_call_func() is meant to be used, I will instead move the
code performing the update under sched/ where it can be done holding the
task_rq_lock() explicitly, as Reinette has suggested before[2].

Updates in v3:
 - Split the handling of multi-task and single-task operations into
   separate patches, now that they're handled differently.
 - Clarify justification in the commit message, including moving some of
   it out of inline code comment.
Updates in v2:
 - Following Reinette's suggestion: use task_call_func() for single
   task, IPI broadcast for group movements.
 - Rebased to v6.1-rc4

v1: https://lore.kernel.org/lkml/20221103141641.3055981-1-peternewman@google.com/
v2: https://lore.kernel.org/lkml/20221110135346.2209839-1-peternewman@google.com/

Thanks!
-Peter

[1] https://lore.kernel.org/all/CALPaoCg2-9ARbK+MEgdvdcjJtSy_2H6YeRkLrT97zgy8Aro3Vg@mail.gmail.com/
[2] https://lore.kernel.org/lkml/d3c06fa3-83a4-7ade-6b08-3a7259aa6c4b@intel.com/

Peter Newman (2):
  x86/resctrl: IPI all CPUs for group updates
  x86/resctrl: update task closid/rmid with task_call_func()

 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 128 +++++++++++--------------
 1 file changed, 58 insertions(+), 70 deletions(-)

--
2.38.1.493.g58b659f92b-goog