Age | Commit message (Collapse) | Author |
|
With commit 08699d20467b6 ("sched_ext: idle: Consolidate default idle
CPU selection kfuncs") allowing scx_bpf_select_cpu_dfl() to be invoked
from multiple contexts, update the test to validate that the kfunc
behaves correctly when used from ops.enqueue() and via BPF test_run.
Additionally, rename the test to enq_select_cpu, dropping "fails" from
the name, as the logic has now been inverted.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Update the allowed_cpus selftest to include a check to validate the
behavior of scx_bpf_select_cpu_and() when invoked via a BPF test_run
call.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add a selftest to validate the behavior of the built-in idle CPU
selection policy applied to a subset of allowed CPUs, using
scx_bpf_select_cpu_and().
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core & fair scheduler changes:
- Cancel the slice protection of the idle entity (Zihan Zhou)
- Reduce the default slice to avoid tasks getting an extra tick
(Zihan Zhou)
- Force propagating min_slice of cfs_rq when {en,de}queue tasks
(Tianchen Ding)
- Refactor can_migrate_task() to elimate looping (I Hsin Cheng)
- Add unlikey branch hints to several system calls (Colin Ian King)
- Optimize current_clr_polling() on certain architectures (Yujun
Dong)
Deadline scheduler: (Juri Lelli)
- Remove redundant dl_clear_root_domain call
- Move dl_rebuild_rd_accounting to cpuset.h
Uclamp:
- Use the uclamp_is_used() helper instead of open-coding it (Xuewen
Yan)
- Optimize sched_uclamp_used static key enabling (Xuewen Yan)
Scheduler topology support: (Juri Lelli)
- Ignore special tasks when rebuilding domains
- Add wrappers for sched_domains_mutex
- Generalize unique visiting of root domains
- Rebuild root domain accounting after every update
- Remove partition_and_rebuild_sched_domains
- Stop exposing partition_sched_domains_locked
RSEQ: (Michael Jeanson)
- Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y
- Fix segfault on registration when rseq_cs is non-zero
- selftests: Add rseq syscall errors test
- selftests: Ensure the rseq ABI TLS is actually 1024 bytes
Membarriers:
- Fix redundant load of membarrier_state (Nysal Jan K.A.)
Scheduler debugging:
- Introduce and use preempt_model_str() (Sebastian Andrzej Siewior)
- Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar)
Fixes and cleanups:
- Always save/restore x86 TSC sched_clock() on suspend/resume
(Guilherme G. Piccoli)
- Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian
Andrzej Siewior)"
* tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()
sched/debug: Remove CONFIG_SCHED_DEBUG
sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files
sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from documentation
sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional
sched/debug: Make 'const_debug' tunables unconditional __read_mostly
sched/debug: Change SCHED_WARN_ON() to WARN_ON_ONCE()
rseq/selftests: Fix namespace collision with rseq UAPI header
include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h
sched/topology: Stop exposing partition_sched_domains_locked
cgroup/cpuset: Remove partition_and_rebuild_sched_domains
sched/topology: Remove redundant dl_clear_root_domain call
sched/deadline: Rebuild root domain accounting after every update
sched/deadline: Generalize unique visiting of root domains
sched/topology: Wrappers for sched_domains_mutex
sched/deadline: Ignore special tasks when rebuilding domains
tracing: Use preempt_model_str()
xtensa: Rely on generic printing of preemption model
x86: Rely on generic printing of preemption model
s390: Rely on generic printing of preemption model
...
|
|
We leave most of the defconfigs alone (there's over 70 of them),
but let's remove CONFIG_SCHED_DEBUG from the scheduler self-test
Kconfig files.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/Z9szt3MpQmQ56TRd@gmail.com
|
|
Pull for-6.14-fixes to receive:
9360dfe4cbd6 ("sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()")
which conflicts with:
337d1b354a29 ("sched_ext: Move built-in idle CPU selection policy to a separate file")
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add a selftest to validate the behavior of the NUMA-aware scheduler
functionalities, including idle CPU selection within nodes, per-node
DSQs and CPU to node mapping.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Fixed grammar for a few tests of sched_ext.
Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
In UP systems p->migration_disabled is not available. Fix this by using
the portable helper is_migration_disabled(p).
Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
dsp_local_on has several incorrect assumptions, one of which is that
p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task
is scheduled out while migration is disabled - p->cpus_ptr is temporarily
overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.
This led to sporadic test faliures when dsp_local_on_dispatch() tries to put
a migration disabled task to a different CPU. Fix it by keeping the previous
CPU when migration is disabled.
There are SCX schedulers that make use of p->nr_cpus_allowed. They should
also implement explicit handling for p->migration_disabled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Changwoo Min <changwoo@igalia.com>
|
|
All scx enums are now automatically generated from vmlinux.h and they
must be initialized using the SCX_ENUM_INIT() macro.
Fix the scx selftests to use this macro to properly initialize these
values.
Fixes: 8da7bf2cee27 ("tools/sched_ext: Receive updates from SCX repo")
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Closes: https://lore.kernel.org/all/Z2tNK2oFDX1OPp8C@slm.duckdns.org/
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext updates from Tejun Heo:
- scx_bpf_now() added so that BPF scheduler can access the cached
timestamp in struct rq to avoid reading TSC multiple times within a
locked scheduling operation.
- Minor updates to the built-in idle CPU selection logic.
- tool/sched_ext updates and other misc changes.
* tag 'sched_ext-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: fix kernel-doc warnings
sched_ext: Use time helpers in BPF schedulers
sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now()
sched_ext: Add time helpers for BPF schedulers
sched_ext: Add scx_bpf_now() for BPF scheduler
sched_ext: Implement scx_bpf_now()
sched_ext: Relocate scx_enabled() related code
sched_ext: Add option -l in selftest runner to list all available tests
sched_ext: Include remaining task time slice in error state dump
sched_ext: update scx_bpf_dsq_insert() doc for SCX_DSQ_LOCAL_ON
sched_ext: idle: small CPU iteration refactoring
sched_ext: idle: introduce check_builtin_idle_enabled() helper
sched_ext: idle: clarify comments
sched_ext: idle: use assign_cpu() to update the idle cpumask
sched_ext: Use str_enabled_disabled() helper in update_selcpu_topology()
sched_ext: Use sizeof_field for key_len in dsq_hash_params
tools/sched_ext: Receive updates from SCX repo
sched_ext: Use the NUMA scheduling domain for NUMA optimizations
|
|
The selftest runner currently allows selecting tests via the -t
option. This patch adds a new -l option that lists all available tests,
providing users with an overview of the tests they can choose from. This
enhancement is especially useful for scripting and automation purposes,
making it easier to discover and run tests.
Signed-off-by: Shizhao Chen <shichen@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The dsp_local_on selftest expects the scheduler to fail by trying to
schedule an e.g. CPU-affine task to the wrong CPU. However, this isn't
guaranteed to happen in the 1 second window that the test is running.
Besides, it's odd to have this particular exception path tested when there
are no other tests that verify that the interface is working at all - e.g.
the test would pass if dsp_local_on interface is completely broken and fails
on any attempt.
Flip the test so that it verifies that the feature works. While at it, fix a
typo in the info message.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Link: http://lkml.kernel.org/r/Z1n9v7Z6iNJ-wKmq@slm.duckdns.org
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
maximal.bpf.c is still dispatching to and consuming from SCX_DSQ_GLOBAL.
Let's have it use its own DSQ to avoid any runtime errors.
Signed-off-by: David Vernet <void@manifault.com>
Tested-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The selftests are falining to build on current tip of bpf-next and
sched_ext [1]. This has broken BPF CI [2] after merge from upstream.
Use appropriate function names in the selftests according to the
recent changes in the sched_ext API [3].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=fc39fb56917bb3cb53e99560ca3612a84456ada2
[2] https://github.com/kernel-patches/bpf/actions/runs/11959327258/job/33340923745
[3] https://lore.kernel.org/all/20241109194853.580310-1-tj@kernel.org/
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Acked-by: Andrea Righi <arighi@nvidia.com>
Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
cc9877fb7677 ("sched_ext: Improve error reporting during loading") changed
how load failures are reported so that more error context can be
communicated. This breaks the enq_last_no_enq_fails test as attach no longer
fails. The scheduler is guaranteed to be ejected on attach completion with
full error information. Update enq_last_no_enq_fails so that it checks that
the scheduler is ejected using ops.exit().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Vishal Chourasia <vishalc@linux.ibm.com>
Link: http://lkml.kernel.org/r/Zxknp7RAVNjmdJSc@linux.ibm.com
Fixes: cc9877fb7677 ("sched_ext: Improve error reporting during loading")
|
|
In commit 63fb3ec80516 ("sched_ext: Allow only user DSQs for
scx_bpf_consume(), scx_bpf_dsq_nr_queued() and bpf_iter_scx_dsq_new()"), we
updated the consume path to only accept user DSQs, thus making it invalid
to consume SCX_DSQ_GLOBAL. This selftest was doing that, so let's create a
custom DSQ and use that instead. The test now passes:
[root@virtme-ng sched_ext]# ./runner -t exit
===== START =====
TEST: exit
DESCRIPTION: Verify we can cleanly exit a scheduler in multiple places
OUTPUT:
[ 12.387229] sched_ext: BPF scheduler "exit" enabled
[ 12.406064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF)
[ 12.453325] sched_ext: BPF scheduler "exit" enabled
[ 12.474064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF)
[ 12.515241] sched_ext: BPF scheduler "exit" enabled
[ 12.532064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF)
[ 12.592063] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF)
[ 12.654063] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF)
[ 12.715062] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF)
ok 1 exit #
===== END =====
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Fix incompatible function pointer type warnings in sched_ext BPF selftests by
explicitly casting the function pointers when initializing struct_ops.
This addresses multiple -Wincompatible-function-pointer-types warnings from the
clang compiler where function signatures didn't match exactly.
The void * cast ensures the compiler accepts the function pointer
assignment despite minor type differences in the parameters.
Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The runner.o may start building before libbpf headers are installed,
and as a result build fails. This happened a couple of times on
libbpf/ci test jobs:
* https://github.com/libbpf/ci/actions/runs/11447667257/job/31849533100
* https://github.com/theihor/libbpf-ci/actions/runs/11445162764/job/31841649552
Headers are installed in a recipe for $(BPFOBJ) target, and adding an
order-only dependency should ensure this doesn't happen.
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The sched_ext selftests is missing proper cross-compilation support, a
proper target entry, and out-of-tree build support.
When building the kselftest suite, e.g.:
make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- \
TARGETS=sched_ext SKIP_TARGETS="" O=/output/foo \
-C tools/testing/selftests install
or:
make ARCH=arm64 LLVM=1 TARGETS=sched_ext SKIP_TARGETS="" \
O=/output/foo -C tools/testing/selftests install
The expectation is that the sched_ext is included, cross-built, the
correct toolchain is picked up, and placed into /output/foo.
In contrast to the BPF selftests, the sched_ext suite does not use
bpftool at test run-time, so it is sufficient to build bpftool for the
build host only.
Add ARCH, CROSS_COMPILE, OUTPUT, and TARGETS support to the sched_ext
selftest. Also, remove some variables that were unused by the
Makefile.
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: David Vernet <void@manifault.com>
Tested-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add sched_ext_ops operations to init/exit cgroups, and track task migrations
and config changes. A BPF scheduler may not implement or implement only
subset of cgroup features. The implemented features can be indicated using
%SCX_OPS_HAS_CGOUP_* flags. If cgroup configuration makes use of features
that are not implemented, a warning is triggered.
While a BPF scheduler is being enabled and disabled, relevant cgroup
operations are locked out using scx_cgroup_rwsem. This avoids situations
like task prep taking place while the task is being moved across cgroups,
making things easier for BPF schedulers.
v7: - cgroup interface file visibility toggling is dropped in favor just
warning messages. Dynamically changing interface visiblity caused more
confusion than helping.
v6: - Updated to reflect the removal of SCX_KF_SLEEPABLE.
- Updated to use CONFIG_GROUP_SCHED_WEIGHT and fixes for
!CONFIG_FAIR_GROUP_SCHED && CONFIG_EXT_GROUP_SCHED.
v5: - Flipped the locking order between scx_cgroup_rwsem and
cpus_read_lock() to avoid locking order conflict w/ cpuset. Better
documentation around locking.
- sched_move_task() takes an early exit if the source and destination
are identical. This triggered the warning in scx_cgroup_can_attach()
as it left p->scx.cgrp_moving_from uncleared. Updated the cgroup
migration path so that ops.cgroup_prep_move() is skipped for identity
migrations so that its invocations always match ops.cgroup_move()
one-to-one.
v4: - Example schedulers moved into their own patches.
- Fix build failure when !CONFIG_CGROUP_SCHED, reported by Andrea Righi.
v3: - Make scx_example_pair switch all tasks by default.
- Convert to BPF inline iterators.
- scx_bpf_task_cgroup() is added to determine the current cgroup from
CPU controller's POV. This allows BPF schedulers to accurately track
CPU cgroup membership.
- scx_example_flatcg added. This demonstrates flattened hierarchy
implementation of CPU cgroup control and shows significant performance
improvement when cgroups which are nested multiple levels are under
competition.
v2: - Build fixes for different CONFIG combinations.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Vernet <dvernet@meta.com>
Acked-by: Josh Don <joshdon@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Barret Rhoden <brho@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrea Righi <andrea.righi@canonical.com>
|
|
We already have some testcases verifying that we can call
BPF_PROG_TYPE_SYSCALL progs and invoke scx_bpf_exit(). Let's extend that to
also call scx_bpf_create_dsq() so we get coverage for that as well.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add basic selftests.
Signed-off-by: David Vernet <dvernet@meta.com>
Acked-by: Tejun Heo <tj@kernel.org>
|