author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-01-08 19:49:17 -0800
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-01-08 19:49:17 -0800
commit | bfe8eb3b85c571f7e94e1039f59b462505b8e0fc (patch)
tree | 2084624e1d6e2c7f570239aad1bbdd9741cfe5e5 /kernel/sched/idle.c
parent | aac4de465af08ccec90ef47bdcc13435e48a7223 (diff)
parent | cdb3033e191fd03da2d7da23b9cd448dfa180a8e (diff)
Merge tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Energy scheduling:
- Consolidate how the max compute capacity is used in the scheduler
and how we calculate the frequency for a level of utilization.
- Rework the interface between the scheduler and the schedutil governor
- Simplify the util_est logic
Deadline scheduler:
- Work more towards reducing SCHED_DEADLINE starvation of low-priority
tasks (e.g., SCHED_OTHER) when higher-priority tasks monopolize CPU
cycles, via the introduction of 'deadline servers' (nested/2-level
scheduling).
"Fair servers" to make use of this facility are not introduced yet.
EEVDF:
- Introduce O(1) fastpath for EEVDF task selection
NUMA balancing:
- Tune the NUMA-balancing vma scanning logic some more, to better
distribute the probability of a particular vma getting scanned.
Plus misc fixes, cleanups and updates"
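The 'deadline servers' item in the summary above describes a nested (two-level) scheme: a per-CPU bandwidth reservation that the deadline scheduler treats like any other SCHED_DEADLINE entity, except that when it is selected it hands the CPU to a task from the class it serves. The fragment below is only a conceptual sketch of that idea; the names (dl_server_sketch, pick_from_served_class, dl_server_pick_sketch) are made up for illustration and are not the kernel's actual data structures or API.

#include <stdint.h>

struct task;    /* opaque: a runnable task of the served (lower) class */

/*
 * Conceptual sketch of a "deadline server": a bandwidth reservation
 * (runtime every period) scheduled by EDF like any SCHED_DEADLINE
 * entity. When the server is picked it does not run code of its own;
 * it picks a task from the class it serves (e.g. SCHED_OTHER), so that
 * class keeps making progress even while higher-priority tasks would
 * otherwise monopolize the CPU.
 */
struct dl_server_sketch {
        uint64_t runtime_ns;    /* budget granted per period            */
        uint64_t period_ns;     /* replenishment period                 */
        uint64_t deadline_ns;   /* current absolute deadline (EDF key)  */
        /* second scheduling level: which task the server actually runs */
        struct task *(*pick_from_served_class)(void);
};

/* Called when the deadline scheduler selects the server as "next". */
static inline struct task *dl_server_pick_sketch(struct dl_server_sketch *s)
{
        /* budget accounting and throttling are omitted in this sketch */
        return s->pick_from_served_class();
}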
* tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
sched/fair: Fix tg->load when offlining a CPU
sched/fair: Remove unused 'next_buddy_marked' local variable in check_preempt_wakeup_fair()
sched/fair: Use all little CPUs for CPU-bound workloads
sched/fair: Simplify util_est
sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true)
arm64/amu: Use capacity_ref_freq() to set AMU ratio
cpufreq/cppc: Set the frequency used for computing the capacity
cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}()
energy_model: Use a fixed reference frequency
cpufreq/schedutil: Use a fixed reference frequency
cpufreq: Use the fixed and coherent frequency for scaling capacity
sched/topology: Add a new arch_scale_freq_ref() method
freezer,sched: Clean saved_state when restoring it during thaw
sched/fair: Update min_vruntime for reweight_entity() correctly
sched/doc: Update documentation after renames and synchronize Chinese version
sched/cpufreq: Rework iowait boost
sched/cpufreq: Rework schedutil governor performance estimation
sched/pelt: Avoid underestimation of task utilization
sched/timers: Explain why idle task schedules out on remote timer enqueue
sched/cpuidle: Comment about timers requirements VS idle handler
...
Diffstat (limited to 'kernel/sched/idle.c')
-rw-r--r-- | kernel/sched/idle.c | 30
1 file changed, 30 insertions, 0 deletions
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 565f8374ddbb..31231925f1ec 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -258,6 +258,36 @@ static void do_idle(void)
         while (!need_resched()) {
                 rmb();

+                /*
+                 * Interrupts shouldn't be re-enabled from that point on until
+                 * the CPU sleeping instruction is reached. Otherwise an interrupt
+                 * may fire and queue a timer that would be ignored until the CPU
+                 * wakes from the sleeping instruction. And testing need_resched()
+                 * doesn't tell about pending needed timer reprogram.
+                 *
+                 * Several cases to consider:
+                 *
+                 * - SLEEP-UNTIL-PENDING-INTERRUPT based instructions such as
+                 *   "wfi" or "mwait" are fine because they can be entered with
+                 *   interrupt disabled.
+                 *
+                 * - sti;mwait() couple is fine because the interrupts are
+                 *   re-enabled only upon the execution of mwait, leaving no gap
+                 *   in-between.
+                 *
+                 * - ROLLBACK based idle handlers with the sleeping instruction
+                 *   called with interrupts enabled are NOT fine. In this scheme
+                 *   when the interrupt detects it has interrupted an idle handler,
+                 *   it rolls back to its beginning which performs the
+                 *   need_resched() check before re-executing the sleeping
+                 *   instruction. This can leak a pending needed timer reprogram.
+                 *   If such a scheme is really mandatory due to the lack of an
+                 *   appropriate CPU sleeping instruction, then a FAST-FORWARD
+                 *   must instead be applied: when the interrupt detects it has
+                 *   interrupted an idle handler, it must resume to the end of
+                 *   this idle handler so that the generic idle loop is iterated
+                 *   again to reprogram the tick.
+                 */
                 local_irq_disable();

                 if (cpu_is_offline(cpu)) {
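To make the constraint concrete, here is a minimal, illustrative sketch of one idle-loop iteration. It is not the kernel's do_idle(): cpu_sleep_until_interrupt() is a made-up stand-in for "wfi", "sti; mwait" and similar instructions, while local_irq_disable()/local_irq_enable() and need_resched() are the usual kernel helpers. The point is that the final need_resched() check and the sleeping instruction sit inside a single interrupts-disabled region.

/*
 * Illustrative sketch only, not kernel code. It mirrors the contract
 * documented in the comment above: nothing may re-enable interrupts
 * between the last need_resched() check and the sleeping instruction.
 */
static void idle_iteration_sketch(void)
{
        local_irq_disable();            /* close the race window */

        if (need_resched()) {
                local_irq_enable();     /* work arrived: do not sleep */
                return;
        }

        /*
         * Hypothetical stand-in for a SLEEP-UNTIL-PENDING-INTERRUPT
         * instruction ("wfi", "sti; mwait", ...): it can be entered
         * with interrupts disabled and returns at once if an interrupt
         * is already pending, so a timer queued after the check above
         * cannot be slept through.
         */
        cpu_sleep_until_interrupt();

        local_irq_enable();
}

Re-enabling interrupts anywhere between the check and the sleep (the ROLLBACK scheme in the comment) reopens the window: an interrupt could queue a timer, and restarting the handler from its need_resched() check would not reprogram the tick, so the timer's expiry could be missed until the next wakeup.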