summaryrefslogtreecommitdiff
path: root/tools/lib/python
diff options
context:
space:
mode:
authorTejun Heo <tj@kernel.org>2025-10-28 20:19:17 -1000
committerTejun Heo <tj@kernel.org>2025-11-03 11:46:18 -1000
commitd245698d727ab8f5420b3e28d1243f96a5234851 (patch)
treedf5b6a4cdf01bd423c9c42b871b323a2b0105884 /tools/lib/python
parent260fbcb92bbeacfcd050410fdc2d24ab15044400 (diff)
cgroup: Defer task cgroup unlink until after the task is done switching out
When a task exits, css_set_move_task(tsk, cset, NULL, false) unlinks the task from its cgroup. From the cgroup's perspective, the task is now gone. If this makes the cgroup empty, it can be removed, triggering ->css_offline() callbacks that notify controllers the cgroup is going offline resource-wise. However, the exiting task can still run, perform memory operations, and schedule until the final context switch in finish_task_switch(). This creates a confusing situation where controllers are told a cgroup is offline while resource activities are still happening in it. While this hasn't broken existing controllers, it has caused direct confusion for sched_ext schedulers. Split cgroup_task_exit() into two functions. cgroup_task_exit() now only calls the subsystem exit callbacks and continues to be called from do_exit(). The css_set cleanup is moved to the new cgroup_task_dead() which is called from finish_task_switch() after the final context switch, so that the cgroup only appears empty after the task is truly done running. This also reorders operations so that subsys->exit() is now called before unlinking from the cgroup, which shouldn't break anything. Cc: Dan Schatzberg <dschatzberg@meta.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Tejun Heo <tj@kernel.org>
Diffstat (limited to 'tools/lib/python')
0 files changed, 0 insertions, 0 deletions