sched_ext: Pass locked CPU parameter to scx_hardlockup() and add docs

With the buddy lockup detector, smp_processor_id() returns the detecting CPU, not the locked CPU, making scx_hardlockup()'s printouts confusing. Pass the locked CPU number from watchdog_hardlockup_check() as a parameter instead. Also add kerneldoc comments to handle_lockup(), scx_hardlockup(), and scx_rcu_cpu_stall() documenting their return value semantics. Suggested-by: Doug Anderson <dianders@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Tejun Heo <tj@kernel.org>
author: Tejun Heo <tj@kernel.org> 2025-11-13 15:33:41 -1000
committer: Tejun Heo <tj@kernel.org> 2025-11-14 11:11:08 -1000
commit: 1dcb98bbb7538d4b9015d47c934acdf5ea86045c (patch)
tree: cd4da847a8bd58897113dcfcd6335030e0c338a1 /kernel/sched/ext.c
parent: 67932f691895294a95861571b0ca69a38e0a4894 (diff)
1 files changed, 22 insertions, 3 deletions
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 8a3b8f64a06b..918573f3f088 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3687,6 +3687,17 @@ bool scx_allow_ttwu_queue(const struct task_struct *p)
 	return false;
 }
 
+/**
+ * handle_lockup - sched_ext common lockup handler
+ * @fmt: format string
+ *
+ * Called on system stall or lockup condition and initiates abort of sched_ext
+ * if enabled, which may resolve the reported lockup.
+ *
+ * Returns %true if sched_ext is enabled and abort was initiated, which may
+ * resolve the lockup. %false if sched_ext is not enabled or abort was already
+ * initiated by someone else.
+ */
 static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
 {
 	struct scx_sched *sch;
@@ -3718,6 +3729,10 @@ static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
  * that may not be caused by the current BPF scheduler, try kicking out the
  * current scheduler in an attempt to recover the system to a good state before
  * issuing panics.
+ *
+ * Returns %true if sched_ext is enabled and abort was initiated, which may
+ * resolve the reported RCU stall. %false if sched_ext is not enabled or someone
+ * else already initiated abort.
  */
 bool scx_rcu_cpu_stall(void)
 {
@@ -3750,14 +3765,18 @@ void scx_softlockup(u32 dur_s)
  * numerous affinitized tasks in a single queue and directing all CPUs at it.
  * Try kicking out the current scheduler in an attempt to recover the system to
  * a good state before taking more drastic actions.
+ *
+ * Returns %true if sched_ext is enabled and abort was initiated, which may
+ * resolve the reported hardlockdup. %false if sched_ext is not enabled or
+ * someone else already initiated abort.
  */
-bool scx_hardlockup(void)
+bool scx_hardlockup(int cpu)
 {
-	if (!handle_lockup("hard lockup - CPU %d", smp_processor_id()))
+	if (!handle_lockup("hard lockup - CPU %d", cpu))
 		return false;
 
 	printk_deferred(KERN_ERR "sched_ext: Hard lockup - CPU %d, disabling BPF scheduler\n",
-			smp_processor_id());
+			cpu);
 	return true;
 }
author	Tejun Heo <tj@kernel.org>	2025-11-13 15:33:41 -1000
committer	Tejun Heo <tj@kernel.org>	2025-11-14 11:11:08 -1000
commit	1dcb98bbb7538d4b9015d47c934acdf5ea86045c (patch)
tree	cd4da847a8bd58897113dcfcd6335030e0c338a1 /kernel/sched/ext.c
parent	67932f691895294a95861571b0ca69a38e0a4894 (diff)