summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2025-04-10bpf: Convert queue_stack map to rqspinlockKumar Kartikeya Dwivedi
Replace all usage of raw_spinlock_t in queue_stack_maps.c with rqspinlock. This is a map type with a set of open syzbot reports reproducing possible deadlocks. Prior attempt to fix the issues was at [0], but was dropped in favor of this approach. Make sure we return the -EBUSY error in case of possible deadlocks or timeouts, just to make sure user space or BPF programs relying on the error code to detect problems do not break. With these changes, the map should be safe to access in any context, including NMIs. [0]: https://lore.kernel.org/all/20240429165658.1305969-1-sidchintamaneni@gmail.com Reported-by: syzbot+8bdfc2c53fb2b63e1871@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/0000000000004c3fc90615f37756@google.com Reported-by: syzbot+252bc5c744d0bba917e1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000c80abd0616517df9@google.com Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250410153142.2064340-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-10bpf: Use architecture provided res_smp_cond_load_acquireKumar Kartikeya Dwivedi
In v2 of rqspinlock [0], we fixed potential problems with WFE usage in arm64 to fallback to a version copied from Ankur's series [1]. This logic was moved into arch-specific headers in v3 [2]. However, we missed using the arch-provided res_smp_cond_load_acquire in commit ebababcd0372 ("rqspinlock: Hardcode cond_acquire loops for arm64") due to a rebasing mistake between v2 and v3 of the rqspinlock series. Fix the typo to fallback to the arm64 definition as we did in v2. [0]: https://lore.kernel.org/bpf/20250206105435.2159977-18-memxor@gmail.com [1]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com [2]: https://lore.kernel.org/bpf/20250303152305.3195648-9-memxor@gmail.com Fixes: ebababcd0372 ("rqspinlock: Hardcode cond_acquire loops for arm64") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250410145512.1876745-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-09timekeeping: Add a lockdep override in tick_freeze()Sebastian Andrzej Siewior
tick_freeze() acquires a raw spinlock (tick_freeze_lock). Later in the callchain (timekeeping_suspend() -> mc146818_avoid_UIP()) the RTC driver acquires a spinlock which becomes a sleeping lock on PREEMPT_RT. Lockdep complains about this lock nesting. Add a lockdep override for this special case and a comment explaining why it is okay. Reported-by: Borislav Petkov <bp@alien8.de> Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250404133429.pnAzf-eF@linutronix.de Closes: https://lore.kernel.org/all/20250330113202.GAZ-krsjAnurOlTcp-@fat_crate.local/ Closes: https://lore.kernel.org/all/CAP-bSRZ0CWyZZsMtx046YV8L28LhY0fson2g4EqcwRAVN1Jk+Q@mail.gmail.com/
2025-04-09hrtimer: Add missing ACCESS_PRIVATE() for hrtimer::functionNam Cao
The "function" field of struct hrtimer has been changed to private, but two instances have not been converted to use ACCESS_PRIVATE(). Convert them to use ACCESS_PRIVATE(). Fixes: 04257da0c99c ("hrtimers: Make callback function pointer private") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250408103854.1851093-1-namcao@linutronix.de Closes: https://lore.kernel.org/oe-kbuild-all/202504071931.vOVl13tt-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202504072155.5UAZjYGU-lkp@intel.com/
2025-04-09tracing: Do not add length to print format in synthetic eventsSteven Rostedt
The following causes a vsnprintf fault: # echo 's:wake_lat char[] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events # echo 'hist:keys=pid:ts=common_timestamp.usecs if !(common_flags & 0x18)' > /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,next_comm,$delta)' > /sys/kernel/tracing/events/sched/sched_switch/trigger Because the synthetic event's "wakee" field is created as a dynamic string (even though the string copied is not). The print format to print the dynamic string changed from "%*s" to "%s" because another location (__set_synth_event_print_fmt()) exported this to user space, and user space did not need that. But it is still used in print_synth_event(), and the output looks like: <idle>-0 [001] d..5. 193.428167: wake_lat: wakee=(efault)sshd-sessiondelta=155 sshd-session-879 [001] d..5. 193.811080: wake_lat: wakee=(efault)kworker/u34:5delta=58 <idle>-0 [002] d..5. 193.811198: wake_lat: wakee=(efault)bashdelta=91 bash-880 [002] d..5. 193.811371: wake_lat: wakee=(efault)kworker/u35:2delta=21 <idle>-0 [001] d..5. 193.811516: wake_lat: wakee=(efault)sshd-sessiondelta=129 sshd-session-879 [001] d..5. 193.967576: wake_lat: wakee=(efault)kworker/u34:5delta=50 The length isn't needed as the string is always nul terminated. Just print the string and not add the length (which was hard coded to the max string length anyway). Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Cc: Douglas Raillard <douglas.raillard@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/20250407154139.69955768@gandalf.local.home Fixes: 4d38328eb442d ("tracing: Fix synth event printk format for str fields"); Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-09dma/contiguous: avoid warning about unused size_bytesArnd Bergmann
When building with W=1, this variable is unused for configs with CONFIG_CMA_SIZE_SEL_PERCENTAGE=y: kernel/dma/contiguous.c:67:26: error: 'size_bytes' defined but not used [-Werror=unused-const-variable=] Change this to a macro to avoid the warning. Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20250409151557.3890443-1-arnd@kernel.org
2025-04-08Merge tag 'probes-fixes-v6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - fprobe: remove fprobe_hlist_node when module unloading When a fprobe target module is removed, the fprobe_hlist_node should be removed from the fprobe's hash table to prevent reusing accidentally if another module is loaded at the same address. - fprobe: lock module while registering fprobe The module containing the function to be probeed is locked using a reference counter until the fprobe registration is complete, which prevents use after free. - fprobe-events: fix possible UAF on modules Basically as same as above, but in the fprobe-events layer we also need to get module reference counter when we find the tracepoint in the module. * tag 'probes-fixes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: fprobe: Cleanup fprobe hash when module unloading tracing: fprobe events: Fix possible UAF on modules tracing: fprobe: Fix to lock module while registering fprobe
2025-04-08Merge tag 'cgroup-for-6.15-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - A number of cpuset remote partition related fixes and cleanups along with selftest updates. - A change from this merge window made cgroup_rstat_updated_list() called outside cgroup_rstat_lock leading to list corruptions. Fix it by relocating the call inside the lock. * tag 'cgroup-for-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: Fix race between newly created partition and dying one cgroup: rstat: call cgroup_rstat_updated_list with cgroup_rstat_lock selftest/cgroup: Add a remote partition transition test to test_cpuset_prs.sh selftest/cgroup: Clean up and restructure test_cpuset_prs.sh selftest/cgroup: Update test_cpuset_prs.sh to use | as effective CPUs and state separator cgroup/cpuset: Remove unneeded goto in sched_partition_write() and rename it cgroup/cpuset: Code cleanup and comment update cgroup/cpuset: Don't allow creation of local partition over a remote one cgroup/cpuset: Remove remote_partition_check() & make update_cpumasks_hier() handle remote partition cgroup/cpuset: Fix error handling in remote_partition_disable() cgroup/cpuset: Fix incorrect isolated_cpus update in update_parent_effective_cpumask()
2025-04-08perf: Fix hang while freeing sigtrap eventFrederic Weisbecker
Perf can hang while freeing a sigtrap event if a related deferred signal hadn't managed to be sent before the file got closed: perf_event_overflow() task_work_add(perf_pending_task) fput() task_work_add(____fput()) task_work_run() ____fput() perf_release() perf_event_release_kernel() _free_event() perf_pending_task_sync() task_work_cancel() -> FAILED rcuwait_wait_event() Once task_work_run() is running, the list of pending callbacks is removed from the task_struct and from this point on task_work_cancel() can't remove any pending and not yet started work items, hence the task_work_cancel() failure and the hang on rcuwait_wait_event(). Task work could be changed to remove one work at a time, so a work running on the current task can always cancel a pending one, however the wait / wake design is still subject to inverted dependencies when remote targets are involved, as pictured by Oleg: T1 T2 fd = perf_event_open(pid => T2->pid); fd = perf_event_open(pid => T1->pid); close(fd) close(fd) <IRQ> <IRQ> perf_event_overflow() perf_event_overflow() task_work_add(perf_pending_task) task_work_add(perf_pending_task) </IRQ> </IRQ> fput() fput() task_work_add(____fput()) task_work_add(____fput()) task_work_run() task_work_run() ____fput() ____fput() perf_release() perf_release() perf_event_release_kernel() perf_event_release_kernel() _free_event() _free_event() perf_pending_task_sync() perf_pending_task_sync() rcuwait_wait_event() rcuwait_wait_event() Therefore the only option left is to acquire the event reference count upon queueing the perf task work and release it from the task work, just like it was done before 3a5465418f5f ("perf: Fix event leak upon exec and file release") but without the leaks it fixed. Some adjustments are necessary to make it work: * A child event might dereference its parent upon freeing. Care must be taken to release the parent last. * Some places assuming the event doesn't have any reference held and therefore can be freed right away must instead put the reference and let the reference counting to its job. Reported-by: "Yi Lai" <yi1.lai@linux.intel.com> Closes: https://lore.kernel.org/all/Zx9Losv4YcJowaP%2F@ly-workstation/ Reported-by: syzbot+3c4321e10eea460eb606@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/673adf75.050a0220.87769.0024.GAE@google.com/ Fixes: 3a5465418f5f ("perf: Fix event leak upon exec and file release") Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250304135446.18905-1-frederic@kernel.org
2025-04-08sched_ext: Mark SCX_OPS_HAS_CGROUP_WEIGHT for deprecationTejun Heo
SCX_OPS_HAS_CGROUP_WEIGHT was only used to suppress the missing cgroup weight support warnings. Now that the warnings are removed, the flag doesn't do anything. Mark it for deprecation and remove its usage from scx_flatcg. v2: Actually include the scx_flatcg update. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-and-reviewed-by: Andrea Righi <arighi@nvidia.com>
2025-04-08sched_ext: Remove cpu.weight / cpu.idle unimplemented warningsTejun Heo
sched_ext generates warnings when cpu.weight / cpu.idle are set to non-default values if the BPF scheduler doesn't implement weight support. These warnings don't provide much value while adding constant annoyance. A BPF scheduler may not implement any particular behavior and there's nothing particularly special about missing cgroup weight support. Drop the warnings. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-08sched_ext: Use kvzalloc for large exit_dump allocationBreno Leitao
Replace kzalloc with kvzalloc for the exit_dump buffer allocation, which can require large contiguous memory depending on the implementation. This change prevents allocation failures by allowing the system to fall back to vmalloc when contiguous memory allocation fails. Since this buffer is only used for debugging purposes, physical memory contiguity is not required, making vmalloc a suitable alternative. Cc: stable@vger.kernel.org Fixes: 07814a9439a3b0 ("sched_ext: Print debug dump after an error exit") Suggested-by: Rik van Riel <riel@surriel.com> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-08tracing: fprobe: Cleanup fprobe hash when module unloadingMasami Hiramatsu (Google)
Cleanup fprobe address hash table on module unloading because the target symbols will be disappeared when unloading module and not sure the same symbol is mapped on the same address. Note that this is at least disables the fprobes if a part of target symbols on the unloaded modules. Unlike kprobes, fprobe does not re-enable the probe point by itself. To do that, the caller should take care register/unregister fprobe when loading/unloading modules. This simplifies the fprobe state managememt related to the module loading/unloading. Link: https://lore.kernel.org/all/174343534473.843280.13988101014957210732.stgit@devnote2/ Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-04-07uprobes: Avoid false-positive lockdep splat on CONFIG_PREEMPT_RT=y in the ↵Andrii Nakryiko
ri_timer() uprobe timer callback, use raw_write_seqcount_*() Avoid a false-positive lockdep warning in the CONFIG_PREEMPT_RT=y configuration when using write_seqcount_begin() in the uprobe timer callback by using raw_write_* APIs. Uprobe's use of timer callback is guaranteed to not race with itself for a given uprobe_task, and as such seqcount's insistence on having preemption disabled on the writer side is irrelevant. So switch to raw_ variants of seqcount API instead of disabling preemption unnecessarily. Also, point out in the comments more explicitly why we use seqcount despite our reader side being rather simple and never retrying. We favor well-maintained kernel primitive in favor of open-coding our own memory barriers. Fixes: 8622e45b5da1 ("uprobes: Reuse return_instances between multiple uretprobes within task") Reported-by: Alexei Starovoitov <ast@kernel.org> Suggested-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20250404194848.2109539-1-andrii@kernel.org
2025-04-07tracing: Hide get_vm_area() from MMUless buildsSteven Rostedt
The function get_vm_area() is not defined for non-MMU builds and causes a build error if it is used. Hide the map_pages() function around a: #ifdef CONFIG_MMU to keep it from being compiled when CONFIG_MMU is not set. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250407120111.2ccc9319@gandalf.local.home Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/4f8ece8b-8862-4f7c-8ede-febd28f8a9fe@roeck-us.net/ Fixes: 394f3f02de531 ("tracing: Use vmap_page_range() to map memmap ring buffer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-06perf/core: Fix WARN_ON(!ctx) in __free_event() for partial initGabriel Shahrouzi
Move the get_ctx(child_ctx) call and the child_event->ctx assignment to occur immediately after the child event is allocated. Ensure that child_event->ctx is non-NULL before any subsequent error path within inherit_event calls free_event(), satisfying the assumptions of the cleanup code. Details: There's no clear Fixes tag, because this bug is a side-effect of multiple interacting commits over time (up to 15 years old), not a single regression. The code initially incremented refcount then assigned context immediately after the child_event was created. Later, an early validity check for child_event was added before the refcount/assignment. Even later, a WARN_ON_ONCE() cleanup check was added, assuming event->ctx is valid if the pmu_ctx is valid. The problem is that the WARN_ON_ONCE() could trigger after the initial check passed but before child_event->ctx was assigned, violating its precondition. The solution is to assign child_event->ctx right after its initial validation. This ensures the context exists for any subsequent checks or cleanup routines, resolving the WARN_ON_ONCE(). To resolve it, defer the refcount update and child_event->ctx assignment directly after child_event->pmu_ctx is set but before checking if the parent event is orphaned. The cleanup routine depends on event->pmu_ctx being non-NULL before it verifies event->ctx is non-NULL. This also maintains the author's original intent of passing in child_ctx to find_get_pmu_context before its refcount/assignment. [ mingo: Expanded the changelog from another email by Gabriel Shahrouzi. ] Reported-by: syzbot+ff3aa851d46ab82953a3@syzkaller.appspotmail.com Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lore.kernel.org/r/20250405203036.582721-1-gshahrouzi@gmail.com Closes: https://syzkaller.appspot.com/bug?extid=ff3aa851d46ab82953a3
2025-04-06Merge tag 'perf-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix a perf events time accounting bug" * tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix child_total_time_enabled accounting bug at task exit
2025-04-06Merge tag 'sched-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix a nonsensical Kconfig combination - Remove an unnecessary rseq-notification * tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq: Eliminate useless task_work on execve sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP
2025-04-06Merge tag 'timers-cleanups-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "A set of final cleanups for the timer subsystem: - Convert all del_timer[_sync]() instances over to the new timer_delete[_sync]() API and remove the legacy wrappers. Conversion was done with coccinelle plus some manual fixups as coccinelle chokes on scoped_guard(). - The final cleanup of the hrtimer_init() to hrtimer_setup() conversion. This has been delayed to the end of the merge window, so that all patches which have been merged through other trees are in mainline and all new users are catched. Doing this right before rc1 ensures that new code which is merged post rc1 is not introducing new instances of the original functionality" * tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing/timers: Rename the hrtimer_init event to hrtimer_setup hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack() hrtimers: Rename debug_init() to debug_setup() hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper() hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns() hrtimers: Make callback function pointer private hrtimers: Merge __hrtimer_init() into __hrtimer_setup() hrtimers: Switch to use __htimer_setup() hrtimers: Delete hrtimer_init() treewide: Convert new and leftover hrtimer_init() users treewide: Switch/rename to timer_delete[_sync]()
2025-04-06Merge tag 'irq-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more irq updates from Thomas Gleixner: "A set of updates for the interrupt subsystem: - A treewide cleanup for the irq_domain code, which makes the naming consistent and gets rid of the original oddity of naming domains 'host'. This is a trivial mechanical change and is done late to ensure that all instances have been catched and new code merged post rc1 wont reintroduce new instances. - A trivial consistency fix in the migration code The recent introduction of irq_force_complete_move() in the core code, causes a problem for the nostalgia crowd who maintains ia64 out of tree. The code assumes that hierarchical interrupt domains are enabled and dereferences irq_data::parent_data unconditionally. That works in mainline because both architectures which enable that code have hierarchical domains enabled. Though it breaks the ia64 build, which enables the functionality, but does not have hierarchical domains. While it's not really a problem for mainline today, this unconditional dereference is inconsistent and trivially fixable by using the existing helper function irqd_get_parent_data(), which has the appropriate #ifdeffery in place" * tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move() irqdomain: Stop using 'host' for domain irqdomain: Rename irq_get_default_host() to irq_get_default_domain() irqdomain: Rename irq_set_default_host() to irq_set_default_domain()
2025-04-06Merge tag 'timers-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A revert to fix a adjtimex() regression: The recent change to prevent that time goes backwards for the coarse time getters due to immediate multiplier adjustments via adjtimex(), changed the way how the timekeeping core treats that. That change result in a regression on the adjtimex() side, which is user space visible: 1) The forwarding of the base time moves the update out of the original period and establishes a new one. That's changing the behaviour of the [PF]LL control, which user space expects to be applied periodically. 2) The clearing of the accumulated NTP error due to #1, changes the behaviour as well. An attempt to delay the multiplier/frequency update to the next tick did not solve the problem as userspace expects that the multiplier or frequency updates are in effect, when the syscall returns. There is a different solution for the coarse time problem available, so revert the offending commit to restore the existing adjtimex() behaviour" * tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"
2025-04-05Merge tag 'kbuild-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Improve performance in gendwarfksyms - Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS - Support CONFIG_HEADERS_INSTALL for ARCH=um - Use more relative paths to sources files for better reproducibility - Support the loong64 Debian architecture - Add Kbuild bash completion - Introduce intermediate vmlinux.unstripped for architectures that need static relocations to be stripped from the final vmlinux - Fix versioning in Debian packages for -rc releases - Treat missing MODULE_DESCRIPTION() as an error - Convert Nios2 Makefiles to use the generic rule for built-in DTB - Add debuginfo support to the RPM package * tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits) kbuild: rpm-pkg: build a debuginfo RPM kconfig: merge_config: use an empty file as initfile nios2: migrate to the generic rule for built-in DTB rust: kbuild: skip `--remap-path-prefix` for `rustdoc` kbuild: pacman-pkg: hardcode module installation path kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally modpost: require a MODULE_DESCRIPTION() kbuild: make all file references relative to source root x86: drop unnecessary prefix map configuration kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS kbuild: Add a help message for "headers" kbuild: deb-pkg: remove "version" variable in mkdebian kbuild: deb-pkg: fix versioning for -rc releases Documentation/kbuild: Fix indentation in modules.rst example x86: Get rid of Makefile.postlink kbuild: Create intermediate vmlinux build with relocations preserved kbuild: Introduce Kconfig symbol for linking vmlinux with relocations kbuild: link-vmlinux.sh: Make output file name configurable kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y Revert "kheaders: Ignore silly-rename files" ...
2025-04-05tracing/timers: Rename the hrtimer_init event to hrtimer_setupNam Cao
The function hrtimer_init() doesn't exist anymore. It was replaced by hrtimer_setup(). Thus, rename the hrtimer_init trace event to hrtimer_setup to keep it consistent. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/cba84c3d853c5258aa3a262363a6eac08e2c7afc.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()Nam Cao
All the hrtimer_init*() functions have been renamed to hrtimer_setup*(). Rename debug_init_on_stack() to debug_setup_on_stack() as well, to keep the names consistent. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/073cf6162779a2f5b12624677d4c49ee7eccc1ed.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Rename debug_init() to debug_setup()Nam Cao
All the hrtimer_init*() functions have been renamed to hrtimer_setup*(). Rename debug_init() to debug_setup() as well, to keep the names consistent. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/4b730c1f79648b16a1c5413f928fdc2e138dfc43.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()Nam Cao
All the hrtimer_init*() functions have been renamed to hrtimer_setup*(). Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper() as well, to keep the names consistent. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/807694aedad9353421c4a7347629a30c5c31026f.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()Nam Cao
The struct hrtimer::function field can only be changed using hrtimer_setup*() or hrtimer_update_function(), and both already null-check 'function'. Therefore, null-checking 'function' in hrtimer_start_range_ns() is not necessary. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/4661c571ee87980c340ccc318fc1a473c0c8f6bc.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Make callback function pointer privateNam Cao
Make the struct hrtimer::function field private, to prevent users from changing this field in an unsafe way. hrtimer_update_function() should be used if the callback function needs to be changed. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/7d0e6e0c5c59a64a9bea940051aac05d750bc0c2.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Merge __hrtimer_init() into __hrtimer_setup()Nam Cao
__hrtimer_init() is only called by __hrtimer_setup(). Simplify by merging __hrtimer_init() into __hrtimer_setup(). Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/8a0a847a35f711f66b2d05b57255aa44e7e61279.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Switch to use __htimer_setup()Nam Cao
__hrtimer_init_sleeper() calls __hrtimer_init() and also sets up the callback function. But there is already __hrtimer_setup() which does both actions. Switch to use __hrtimer_setup() to simplify the code. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/d9a45a51b6a8aa0045310d63f73753bf6b33f385.1738746927.git.namcao@linutronix.de
2025-04-05hrtimers: Delete hrtimer_init()Nam Cao
hrtimer_init() is now unused. Delete it. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/003722f60c7a2a4f8d4ed24fb741aa313b7e5136.1738746927.git.namcao@linutronix.de
2025-04-05treewide: Switch/rename to timer_delete[_sync]()Thomas Gleixner
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines. Conversion was done with coccinelle plus manual fixups where necessary. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-04Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"Thomas Gleixner
This reverts commit 757b000f7b936edf79311ab0971fe465bbda75ea. Miroslav reported that the changes for handling the inconsistencies in the coarse time getters result in a regression on the adjtimex() side. There are two issues: 1) The forwarding of the base time moves the update out of the original period and establishes a new one. 2) The clearing of the accumulated NTP error is changing the behaviour as well. Userspace expects that multiplier/frequency updates are in effect, when the syscall returns, so delaying the update to the next tick is not solving the problem either. Revert the change, so that the established expectations of user space implementations (ntpd, chronyd) are restored. The re-introduced inconsistency of the coarse time getters will be addressed in a subsequent fix. Fixes: 757b000f7b93 ("timekeeping: Fix possible inconsistencies in _COARSE clockids") Reported-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/Z-qsg6iDGlcIJulJ@localhost
2025-04-04genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move()Thomas Gleixner
Frank reported, that the common irq_force_complete_move() breaks the out of tree build of ia64. The reason is that ia64 uses the migration code, but does not have hierarchical interrupt domains enabled. This went unnoticed in mainline as both x86 and RISC-V have hierarchical domains enabled. Not that it matters for mainline, but it's still inconsistent. Use irqd_get_parent_data() instead of accessing the parent_data field directly. The helper returns NULL when hierarchical domains are disabled otherwise it accesses the parent_data field of the domain. No functional change. Fixes: 751dc837dabd ("genirq: Introduce common irq_force_complete_move() implementation") Reported-by: Frank Scheiner <frank.scheiner@web.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Frank Scheiner <frank.scheiner@web.de> Link: https://lore.kernel.org/all/87h634ugig.ffs@tglx
2025-04-04irqdomain: Rename irq_get_default_host() to irq_get_default_domain()Jiri Slaby (SUSE)
Naming interrupt domains host is confusing at best and the irqdomain code uses both domain and host inconsistently. Therefore rename irq_get_default_host() to irq_get_default_domain(). Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250319092951.37667-4-jirislaby@kernel.org
2025-04-04irqdomain: Rename irq_set_default_host() to irq_set_default_domain()Jiri Slaby (SUSE)
Naming interrupt domains host is confusing at best and the irqdomain code uses both domain and host inconsistently. Therefore rename irq_set_default_host() to irq_set_default_domain(). Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250319092951.37667-3-jirislaby@kernel.org
2025-04-03Merge tag 'trace-ringbuffer-v6.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer updates from Steven Rostedt: "Persistent buffer cleanups and simplifications. It was mistaken that the physical memory returned from "reserve_mem" had to be vmap()'d to get to it from a virtual address. But reserve_mem already maps the memory to the virtual address of the kernel so a simple phys_to_virt() can be used to get to the virtual address from the physical memory returned by "reserve_mem". With this new found knowledge, the code can be cleaned up and simplified. - Enforce that the persistent memory is page aligned As the buffers using the persistent memory are all going to be mapped via pages, make sure that the memory given to the tracing infrastructure is page aligned. If it is not, it will print a warning and fail to map the buffer. - Use phys_to_virt() to get the virtual address from reserve_mem Instead of calling vmap() on the physical memory returned from "reserve_mem", use phys_to_virt() instead. As the memory returned by "memmap" or any other means where a physical address is given to the tracing infrastructure, it still needs to be vmap(). Since this memory can never be returned back to the buddy allocator nor should it ever be memmory mapped to user space, flag this buffer and up the ref count. The ref count will keep it from ever being freed, and the flag will prevent it from ever being memory mapped to user space. - Use vmap_page_range() for memmap virtual address mapping For the memmap buffer, instead of allocating an array of struct pages, assigning them to the contiguous phsycial memory and then passing that to vmap(), use vmap_page_range() instead - Replace flush_dcache_folio() with flush_kernel_vmap_range() Instead of calling virt_to_folio() and passing that to flush_dcache_folio(), just call flush_kernel_vmap_range() directly. This also fixes a bug where if a subbuffer was bigger than PAGE_SIZE only the PAGE_SIZE portion would be flushed" * tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio() tracing: Use vmap_page_range() to map memmap ring buffer tracing: Have reserve_mem use phys_to_virt() and separate from memmap buffer tracing: Enforce the persistent ring buffer to be page aligned
2025-04-03Merge tag 'mm-stable-2025-04-02-22-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - The series "mm: fixes for fallouts from mem_init() cleanup" from Mike Rapoport fixes a couple of issues with the just-merged "arch, mm: reduce code duplication in mem_init()" series - The series "MAINTAINERS: add my isub-entries to MM part." from Mike Rapoport does some maintenance on MAINTAINERS - The series "remove tlb_remove_page_ptdesc()" from Qi Zheng does some cleanup work to the page mapping code - The series "mseal system mappings" from Jeff Xu permits sealing of "system mappings", such as vdso, vvar, vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode) - Plus the usual shower of singleton patches * tag 'mm-stable-2025-04-02-22-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits) mseal sysmap: add arch-support txt mseal sysmap: enable s390 selftest: test system mappings are sealed mseal sysmap: update mseal.rst mseal sysmap: uprobe mapping mseal sysmap: enable arm64 mseal sysmap: enable x86-64 mseal sysmap: generic vdso vvar mapping selftests: x86: test_mremap_vdso: skip if vdso is msealed mseal sysmap: kernel config and header change mm: pgtable: remove tlb_remove_page_ptdesc() x86: pgtable: convert to use tlb_remove_ptdesc() riscv: pgtable: unconditionally use tlb_remove_ptdesc() mm: pgtable: convert some architectures to use tlb_remove_ptdesc() mm: pgtable: change pt parameter of tlb_remove_ptdesc() to struct ptdesc* mm: pgtable: make generic tlb_remove_table() use struct ptdesc microblaze/mm: put mm_cmdline_setup() in .init.text section mm/memory_hotplug: fix call folio_test_large with tail page in do_migrate_range MAINTAINERS: mm: add entry for secretmem MAINTAINERS: mm: add entry for numa memblocks and numa emulation ...
2025-04-03Merge tag 'sched_ext-for-6.15-rc0-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Calling scx_bpf_create_dsq() with the same ID would succeed creating duplicate DSQs. Fix it to return -EEXIST. - scx_select_cpu_dfl() fixes and cleanups. - Synchronize tool/sched_ext with external scheduler repo. While this isn't a fix. There's no risk to the kernel and it's better if they stay synced closer. * tag 'sched_ext-for-6.15-rc0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: tools/sched_ext: Sync with scx repo sched_ext: initialize built-in idle state before ops.init() sched_ext: create_dsq: Return -EEXIST on duplicate request sched_ext: Remove a meaningless conditional goto in scx_select_cpu_dfl() sched_ext: idle: Fix return code of scx_select_cpu_dfl()
2025-04-03Merge tag 'trace-v6.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix build error when CONFIG_PROBE_EVENTS_BTF_ARGS is not enabled The tracing of arguments in the function tracer depends on some functions that are only defined when PROBE_EVENTS_BTF_ARGS is enabled. In fact, PROBE_EVENTS_BTF_ARGS also depends on all the same configs as the function argument tracing requires. Just have the function argument tracing depend on PROBE_EVENTS_BTF_ARGS. - Free module_delta for persistent ring buffer instance When an instance holds the persistent ring buffer, it allocates a helper array to hold the deltas between where modules are loaded on the last boot and the current boot. This array needs to be freed when the instance is freed. - Add cond_resched() to loop in ftrace_graph_set_hash() The hash functions in ftrace loop over every function that can be enabled by ftrace. This can be 50,000 functions or more. This loop is known to trigger soft lockup warnings and requires a cond_resched(). The loop in ftrace_graph_set_hash() was missing it. - Fix the event format verifier to include "%*p.." arguments To prevent events from dereferencing stale pointers that can happen if a trace event uses a dereferece pointer to something that was not copied into the ring buffer and can be freed by the time the trace is read, a verifier is called. At boot or module load, the verifier scans the print format string for pointers that can be dereferenced and it checks the arguments to make sure they do not contain something that can be freed. The "%*p" was not handled, which would add another argument and cause the verifier to not only not verify this pointer, but it will look at the wrong argument for every pointer after that. - Fix mcount sorttable building for different endian type target When modifying the ELF file to sort the mcount_loc table in the sorttable.c code, the endianess of the file and the host is used to determine if the bytes need to be swapped when calculations are done. A change was made to the sorting of the mcount_loc that read the values from the ELF file into an array and the swap happened on the filling of the array. But one of the calculations of the array still did the swap when it did not need to. This caused building on a little endian machine for a big endian target to not find the mcount function in the 'nm' table and it zeroed it out, causing there to be no functions available to trace. - Add goto out_unlock jump to rv_register_monitor() on error path One of the error paths in rv_register_monitor() just returned the error when it should have jumped to the out_unlock label to release the mutex. * tag 'trace-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Fix missing unlock on double nested monitors return path scripts/sorttable: Fix endianness handling in build-time mcount sort tracing: Verify event formats that have "%*p.." ftrace: Add cond_resched() to ftrace_graph_set_hash() tracing: Free module_delta on freeing of persistent ring buffer ftrace: Have tracing function args depend on PROBE_EVENTS_BTF_ARGS
2025-04-03rseq: Eliminate useless task_work on execveMathieu Desnoyers
Eliminate a useless task_work on execve by moving the call to rseq_set_notify_resume() from sched_mm_cid_after_execve() to the error path of bprm_execve(). The call to rseq_set_notify_resume() from sched_mm_cid_after_execve() is pointless in the success case, because rseq_execve() will clear the rseq pointer before returning to userspace. sched_mm_cid_after_execve() is called from both the success and error paths of bprm_execve(). The call to rseq_set_notify_resume() is needed on error because the mm_cid may have changed. Also move the rseq_execve() to right after sched_mm_cid_after_execve() in bprm_execve(). [ mingo: Merged to a recent upstream kernel, extended the changelog. ] Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250327132945.1558783-1-mathieu.desnoyers@efficios.com
2025-04-02Merge tag 'tty-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the big set of serial and tty driver updates for 6.15-rc1. Include in here are the following: - more great tty layer cleanups from Jiri. Someday this will be done, but that's not going to be any year soon... - kdb debug driver reverts to fix a reported issue - lots of .dts binding updates for different devices with serial devices - lots of tiny updates and tweaks and a few bugfixes for different serial drivers. All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (79 commits) tty: serial: fsl_lpuart: Fix unused variable 'sport' build warning serial: stm32: do not deassert RS485 RTS GPIO prematurely serial: 8250: add driver for NI UARTs dt-bindings: serial: snps-dw-apb-uart: document RZ/N1 binding without DMA serial: icom: fix code format problems serial: sh-sci: Save and restore more registers tty: serial: pl011: remove incorrect of_match_ptr annotation dt-bindings: serial: snps-dw-apb-uart: Add support for rk3562 tty: serial: lpuart: only disable CTS instead of overwriting the whole UARTMODIR register tty: caif: removed unused function debugfs_tx() serial: 8250_dma: terminate correct DMA in tx_dma_flush() tty: serial: fsl_lpuart: rename register variables more specifically tty: serial: fsl_lpuart: use port struct directly to simply code tty: serial: fsl_lpuart: Use u32 and u8 for register variables tty: serial: fsl_lpuart: disable transmitter before changing RS485 related registers tty: serial: 8250: Add Brainboxes XC devices dt-bindings: serial: fsl-lpuart: support i.MX94 tty: serial: 8250: Add some more device IDs dt-bindings: serial: samsung: add exynos7870-uart compatible serial: 8250_dw: Comment possible corner cases in serial_out() implementation ...
2025-04-02Merge tag 'vfs-6.15-rc1.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Add a new maintainer for configfs - Fix exportfs module description - Place flexible array memeber at the end of an internal struct in the mount code - Add new maintainer for netfslib as Jeff Layton is stepping down as current co-maintainer - Fix error handling in cachefiles_get_directory() - Cleanup do_notify_pidfd() - Fix syscall number definitions in pidfd selftests - Fix racy usage of fs_struct->in exec during multi-threaded exec - Ensure correct exit code is reported when pidfs_exit() is called from release_task() for a delayed thread-group leader exit - Fix conflicting iomap flag definitions * tag 'vfs-6.15-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: Fix conflicting values of iomap flags fs: namespace: Avoid -Wflex-array-member-not-at-end warning MAINTAINERS: configfs: add Andreas Hindborg as maintainer exportfs: add module description exit: fix the usage of delay_group_leader->exit_code in do_notify_parent() and pidfs_exit() netfs: add Paulo as maintainer and remove myself as Reviewer cachefiles: Fix oops in vfs_mkdir from cachefiles_get_directory exec: fix the racy usage of fs_struct->in_exec selftests/pidfd: fixes syscall number defines pidfs: cleanup the usage of do_notify_pidfd()
2025-04-02Merge tag 'objtool-urgent-2025-04-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "These are objtool fixes and updates by Josh Poimboeuf, centered around the fallout from the new CONFIG_OBJTOOL_WERROR=y feature, which, despite its default-off nature, increased the profile/impact of objtool warnings: - Improve error handling and the presentation of warnings/errors - Revert the new summary warning line that some test-bot tools interpreted as new regressions - Fix a number of objtool warnings in various drivers, core kernel code and architecture code. About half of them are potential problems related to out-of-bounds accesses or potential undefined behavior, the other half are additional objtool annotations - Update objtool to latest (known) compiler quirks and objtool bugs triggered by compiler code generation - Misc fixes" * tag 'objtool-urgent-2025-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) objtool/loongarch: Add unwind hints in prepare_frametrace() rcu-tasks: Always inline rcu_irq_work_resched() context_tracking: Always inline ct_{nmi,irq}_{enter,exit}() sched/smt: Always inline sched_smt_active() objtool: Fix verbose disassembly if CROSS_COMPILE isn't set objtool: Change "warning:" to "error: " for fatal errors objtool: Always fail on fatal errors Revert "objtool: Increase per-function WARN_FUNC() rate limit" objtool: Append "()" to function name in "unexpected end of section" warning objtool: Ignore end-of-section jumps for KCOV/GCOV objtool: Silence more KCOV warnings, part 2 objtool, drm/vmwgfx: Don't ignore vmw_send_msg() for ORC objtool: Fix STACK_FRAME_NON_STANDARD for cold subfunctions objtool: Fix segfault in ignore_unreachable_insn() objtool: Fix NULL printf() '%s' argument in builtin-check.c:save_argv() objtool, lkdtm: Obfuscate the do_nothing() pointer objtool, regulator: rk808: Remove potential undefined behavior in rk806_set_mode_dcdc() objtool, ASoC: codecs: wcd934x: Remove potential undefined behavior in wcd934x_slim_irq_handler() objtool, Input: cyapa - Remove undefined behavior in cyapa_update_fw_store() objtool, panic: Disable SMAP in __stack_chk_fail() ...
2025-04-02Merge tag 'printk-for-6.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull more printk updates from Petr Mladek: - Silence warnings about candidates for ‘gnu_print’ format attribute * tag 'printk-for-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: vsnprintf: Silence false positive GCC warning for va_format() vsnprintf: Drop unused const char fmt * in va_format() vsnprintf: Mark binary printing functions with __printf() attribute tracing: Mark binary printing functions with __printf() attribute seq_file: Mark binary printing functions with __printf() attribute seq_buf: Mark binary printing functions with __printf() attribute
2025-04-02Merge tag 'rcu-fixes-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU fix from Boqun Feng: - srcu: Make FORCE_NEED_SRCU_NMI_SAFE depend on RCU_EXPERT * tag 'rcu-fixes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: srcu: Make FORCE_NEED_SRCU_NMI_SAFE depend on RCU_EXPERT
2025-04-02Merge tag 'kgdb-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux Pull kgdb updates from Daniel Thompson: "Two cleanups this cycle. The larger of which is the removal of a private allocator within kdb and replacing it with regular memory allocation. The other adopts the simplified version of strscpy() in a couple of places in kdb" * tag 'kgdb-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux: kdb: Remove optional size arguments from strscpy() calls kdb: remove usage of static environment buffer
2025-04-02ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio()Steven Rostedt
Some architectures do not have data cache coherency between user and kernel space. For these architectures, the cache needs to be flushed on both the kernel and user addresses so that user space can see the updates the kernel has made. Instead of using flush_dcache_folio() and playing with virt_to_folio() within the call to that function, use flush_kernel_vmap_range() which takes the virtual address and does the work for those architectures that need it. This also fixes a bug where the flush of the reader page only flushed one page. If the sub-buffer order is 1 or more, where the sub-buffer size would be greater than a page, it would miss the rest of the sub-buffer content, as the "reader page" is not just a page, but the size of a sub-buffer. Link: https://lore.kernel.org/all/CAG48ez3w0my4Rwttbc5tEbNsme6tc0mrSN95thjXUFaJ3aQ6SA@mail.gmail.com/ Cc: stable@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/20250402144953.920792197@goodmis.org Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions"); Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-02tracing: Use vmap_page_range() to map memmap ring bufferSteven Rostedt
The code to map the physical memory retrieved by memmap currently allocates an array of pages to cover the physical memory and then calls vmap() to map it to a virtual address. Instead of using this temporary array of struct page descriptors, simply use vmap_page_range() that can directly map the contiguous physical memory to a virtual address. Link: https://lore.kernel.org/all/CAHk-=whUOfVucfJRt7E0AH+GV41ELmS4wJqxHDnui6Giddfkzw@mail.gmail.com/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/20250402144953.754618481@goodmis.org Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-02tracing: Have reserve_mem use phys_to_virt() and separate from memmap bufferSteven Rostedt
The reserve_mem kernel command line option may pass back a physical address, but the memory is still part of the normal memory just like using memblock_alloc() would be. This means that the physical memory returned by the reserve_mem command line option can be converted directly to virtual memory by simply using phys_to_virt(). When freeing the buffer there's no need to call vunmap() anymore as the memory allocated by reserve_mem is freed by the call to reserve_mem_release_by_name(). Because the persistent ring buffer can also be allocated via the memmap option, which *is* different than normal memory as it cannot be added back to the buddy system, it must be treated differently. It still needs to be virtually mapped to have access to it. It also can not be freed nor can it ever be memory mapped to user space. Create a new trace_array flag called TRACE_ARRAY_FL_MEMMAP which gets set if the buffer is created by the memmap option, and this will prevent the buffer from being memory mapped by user space. Also increment the ref count for memmap'ed buffers so that they can never be freed. Link: https://lore.kernel.org/all/Z-wFszhJ_9o4dc8O@kernel.org/ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/20250402144953.583750106@goodmis.org Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>