diff options
author | Raghavendra Rao Ananta <rananta@google.com> | 2025-08-20 16:22:42 +0000 |
---|---|---|
committer | Oliver Upton <oliver.upton@linux.dev> | 2025-08-21 16:27:03 -0700 |
commit | e9abe311f35631a999fe38c86f26f0e48ffe46d5 (patch) | |
tree | e3e6a10ff770ffb949979a329b685f9f2ccf96b9 /rust/helpers/mutex.c | |
parent | 0e89ca13ee5ff41b437bb2a003c0eaf34ea43555 (diff) |
KVM: arm64: Reschedule as needed when destroying the stage-2 page-tables
When a large VM, specifically one that holds a significant number of PTEs,
gets abruptly destroyed, the following warning is seen during the
page-table walk:
sched: CPU 0 need_resched set for > 100018840 ns (100 ticks) without schedule
CPU: 0 UID: 0 PID: 9617 Comm: kvm_page_table_ Tainted: G O 6.16.0-smp-DEV #3 NONE
Tainted: [O]=OOT_MODULE
Call trace:
show_stack+0x20/0x38 (C)
dump_stack_lvl+0x3c/0xb8
dump_stack+0x18/0x30
resched_latency_warn+0x7c/0x88
sched_tick+0x1c4/0x268
update_process_times+0xa8/0xd8
tick_nohz_handler+0xc8/0x168
__hrtimer_run_queues+0x11c/0x338
hrtimer_interrupt+0x104/0x308
arch_timer_handler_phys+0x40/0x58
handle_percpu_devid_irq+0x8c/0x1b0
generic_handle_domain_irq+0x48/0x78
gic_handle_irq+0x1b8/0x408
call_on_irq_stack+0x24/0x30
do_interrupt_handler+0x54/0x78
el1_interrupt+0x44/0x88
el1h_64_irq_handler+0x18/0x28
el1h_64_irq+0x84/0x88
stage2_free_walker+0x30/0xa0 (P)
__kvm_pgtable_walk+0x11c/0x258
__kvm_pgtable_walk+0x180/0x258
__kvm_pgtable_walk+0x180/0x258
__kvm_pgtable_walk+0x180/0x258
kvm_pgtable_walk+0xc4/0x140
kvm_pgtable_stage2_destroy+0x5c/0xf0
kvm_free_stage2_pgd+0x6c/0xe8
kvm_uninit_stage2_mmu+0x24/0x48
kvm_arch_flush_shadow_all+0x80/0xa0
kvm_mmu_notifier_release+0x38/0x78
__mmu_notifier_release+0x15c/0x250
exit_mmap+0x68/0x400
__mmput+0x38/0x1c8
mmput+0x30/0x68
exit_mm+0xd4/0x198
do_exit+0x1a4/0xb00
do_group_exit+0x8c/0x120
get_signal+0x6d4/0x778
do_signal+0x90/0x718
do_notify_resume+0x70/0x170
el0_svc+0x74/0xd8
el0t_64_sync_handler+0x60/0xc8
el0t_64_sync+0x1b0/0x1b8
The warning is seen majorly on the host kernels that are configured
not to force-preempt, such as CONFIG_PREEMPT_NONE=y. To avoid this,
instead of walking the entire page-table in one go, split it into
smaller ranges, by checking for cond_resched() between each range.
Since the path is executed during VM destruction, after the
page-table structure is unlinked from the KVM MMU, relying on
cond_resched_rwlock_write() isn't necessary.
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250820162242.2624752-3-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Diffstat (limited to 'rust/helpers/mutex.c')
0 files changed, 0 insertions, 0 deletions