summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
12 daysMerge tag 'mm-nonmm-stable-2025-10-02-15-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ida: Remove the ida_simple_xxx() API" from Christophe Jaillet completes the removal of this legacy IDR API - "panic: introduce panic status function family" from Jinchao Wang provides a number of cleanups to the panic code and its various helpers, which were rather ad-hoc and scattered all over the place - "tools/delaytop: implement real-time keyboard interaction support" from Fan Yu adds a few nice user-facing usability changes to the delaytop monitoring tool - "efi: Fix EFI boot with kexec handover (KHO)" from Evangelos Petrongonas fixes a panic which was happening with the combination of EFI and KHO - "Squashfs: performance improvement and a sanity check" from Phillip Lougher teaches squashfs's lseek() about SEEK_DATA/SEEK_HOLE. A mere 150x speedup was measured for a well-chosen microbenchmark - plus another 50-odd singleton patches all over the place * tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (75 commits) Squashfs: reject negative file sizes in squashfs_read_inode() kallsyms: use kmalloc_array() instead of kmalloc() MAINTAINERS: update Sibi Sankar's email address Squashfs: add SEEK_DATA/SEEK_HOLE support Squashfs: add additional inode sanity checking lib/genalloc: fix device leak in of_gen_pool_get() panic: remove CONFIG_PANIC_ON_OOPS_VALUE ocfs2: fix double free in user_cluster_connect() checkpatch: suppress strscpy warnings for userspace tools cramfs: fix incorrect physical page address calculation kernel: prevent prctl(PR_SET_PDEATHSIG) from racing with parent process exit Squashfs: fix uninit-value in squashfs_get_parent kho: only fill kimage if KHO is finalized ocfs2: avoid extra calls to strlen() after ocfs2_sprintf_system_inode_name() kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths sched/task.h: fix the wrong comment on task_lock() nesting with tasklist_lock coccinelle: platform_no_drv_owner: handle also built-in drivers coccinelle: of_table: handle SPI device ID tables lib/decompress: use designated initializers for struct compress_format efi: support booting with kexec handover (KHO) ...
12 daysMerge tag 'mm-stable-2025-10-01-19-00' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation - "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps - "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code - "mm: vm_normal_page*() improvements" from David Hildenbrand provides code cleanup in the pagemap code - "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code - "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc - "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path - "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code - "test that rmap behaves as expected" from Wei Yang adds some rmap selftests - "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues - "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks - "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem - "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/ - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c - "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation - "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc) - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code - "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages() - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver - "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code - "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature - "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code - "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON - "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling * tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits) mm: swap: check for stable address space before operating on the VMA mm: convert folio_page() back to a macro mm/khugepaged: use start_addr/addr for improved readability hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list alloc_tag: fix boot failure due to NULL pointer dereference mm: silence data-race in update_hiwater_rss mm/memory-failure: don't select MEMORY_ISOLATION mm/khugepaged: remove definition of struct khugepaged_mm_slot mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL hugetlb: increase number of reserving hugepages via cmdline selftests/mm: add fork inheritance test for ksm_merging_pages counter mm/ksm: fix incorrect KSM counter handling in mm_struct during fork drivers/base/node: fix double free in register_one_node() mm: remove PMD alignment constraint in execmem_vmalloc() mm/memory_hotplug: fix typo 'esecially' -> 'especially' mm/rmap: improve mlock tracking for large folios mm/filemap: map entire large folio faultaround mm/fault: try to map the entire file folio in finish_fault() mm/rmap: mlock large folios in try_to_unmap_one() mm/rmap: fix a mlock race condition in folio_referenced_one() ...
12 daysMerge tag 'slab-for-6.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - A new layer for caching objects for allocation and free via percpu arrays called sheaves. The aim is to combine the good parts of SLAB (lower-overhead and simpler percpu caching, compared to SLUB) without the past issues with arrays for freeing remote NUMA node objects and their flushing. It also allows more efficient kfree_rcu(), and cheaper object preallocations for cases where the exact number of objects is unknown, but an upper bound is. Currently VMAs and maple nodes are using this new caching, with a plan to enable it for all caches and remove the complex SLUB fastpath based on cpu (partial) slabs and this_cpu_cmpxchg_double(). (Vlastimil Babka, with Liam Howlett and Pedro Falcato for the maple tree changes) - Re-entrant kmalloc_nolock(), which allows opportunistic allocations from NMI and tracing/kprobe contexts. Building on prior page allocator and memcg changes, it will result in removing BPF-specific caches on top of slab (Alexei Starovoitov) - Various fixes and cleanups. (Kuan-Wei Chiu, Matthew Wilcox, Suren Baghdasaryan, Ye Liu) * tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (40 commits) slab: Introduce kmalloc_nolock() and kfree_nolock(). slab: Reuse first bit for OBJEXTS_ALLOC_FAIL slab: Make slub local_(try)lock more precise for LOCKDEP mm: Introduce alloc_frozen_pages_nolock() mm: Allow GFP_ACCOUNT to be used in alloc_pages_nolock(). locking/local_lock: Introduce local_lock_is_locked(). maple_tree: Convert forking to use the sheaf interface maple_tree: Add single node allocation support to maple state maple_tree: Prefilled sheaf conversion and testing tools/testing: Add support for prefilled slab sheafs maple_tree: Replace mt_free_one() with kfree() maple_tree: Use kfree_rcu in ma_free_rcu testing/radix-tree/maple: Hack around kfree_rcu not existing tools/testing: include maple-shim.c in maple.c maple_tree: use percpu sheaves for maple_node_cache mm, vma: use percpu sheaves for vm_area_struct cache tools/testing: Add support for changes to slab for sheaves slab: allow NUMA restricted allocations to use percpu sheaves tools/testing/vma: Implement vm_refcnt reset slab: skip percpu sheaves for remote object freeing ...
12 daysMerge tag 'net-next-6.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core & protocols: - Improve drop account scalability on NUMA hosts for RAW and UDP sockets and the backlog, almost doubling the Pps capacity under DoS - Optimize the UDP RX performance under stress, reducing contention, revisiting the binary layout of the involved data structs and implementing NUMA-aware locking. This improves UDP RX performance by an additional 50%, even more under extreme conditions - Add support for PSP encryption of TCP connections; this mechanism has some similarities with IPsec and TLS, but offers superior HW offloads capabilities - Ongoing work to support Accurate ECN for TCP. AccECN allows more than one congestion notification signal per RTT and is a building block for Low Latency, Low Loss, and Scalable Throughput (L4S) - Reorganize the TCP socket binary layout for data locality, reducing the number of touched cachelines in the fastpath - Refactor skb deferral free to better scale on large multi-NUMA hosts, this improves TCP and UDP RX performances significantly on such HW - Increase the default socket memory buffer limits from 256K to 4M to better fit modern link speeds - Improve handling of setups with a large number of nexthop, making dump operating scaling linearly and avoiding unneeded synchronize_rcu() on delete - Improve bridge handling of VLAN FDB, storing a single entry per bridge instead of one entry per port; this makes the dump order of magnitude faster on large switches - Restore IP ID correctly for encapsulated packets at GSO segmentation time, allowing GRO to merge packets in more scenarios - Improve netfilter matching performance on large sets - Improve MPTCP receive path performance by leveraging recently introduced core infrastructure (skb deferral free) and adopting recent TCP autotuning changes - Allow bridges to redirect to a backup port when the bridge port is administratively down - Introduce MPTCP 'laminar' endpoint that con be used only once per connection and simplify common MPTCP setups - Add RCU safety to dst->dev, closing a lot of possible races - A significant crypto library API for SCTP, MPTCP and IPv6 SR, reducing code duplication - Supports pulling data from an skb frag into the linear area of an XDP buffer Things we sprinkled into general kernel code: - Generate netlink documentation from YAML using an integrated YAML parser Driver API: - Support using IPv6 Flow Label in Rx hash computation and RSS queue selection - Introduce API for fetching the DMA device for a given queue, allowing TCP zerocopy RX on more H/W setups - Make XDP helpers compatible with unreadable memory, allowing more easily building DevMem-enabled drivers with a unified XDP/skbs datapath - Add a new dedicated ethtool callback enabling drivers to provide the number of RX rings directly, improving efficiency and clarity in RX ring queries and RSS configuration - Introduce a burst period for the health reporter, allowing better handling of multiple errors due to the same root cause - Support for DPLL phase offset exponential moving average, controlling the average smoothing factor Device drivers: - Add a new Huawei driver for 3rd gen NIC (hinic3) - Add a new SpacemiT driver for K1 ethernet MAC - Add a generic abstraction for shared memory communication devices (dibps) - Ethernet high-speed NICs: - nVidia/Mellanox: - Use multiple per-queue doorbell, to avoid MMIO contention issues - support adjacent functions, allowing them to delegate their SR-IOV VFs to sibling PFs - support RSS for IPSec offload - support exposing raw cycle counters in PTP and mlx5 - support for disabling host PFs. - Intel (100G, ice, idpf): - ice: support for SRIOV VFs over an Active-Active link aggregate - ice: support for firmware logging via debugfs - ice: support for Earliest TxTime First (ETF) hardware offload - idpf: support basic XDP functionalities and XSk - Broadcom (bnxt): - support Hyper-V VF ID - dynamic SRIOV resource allocations for RoCE - Meta (fbnic): - support queue API, zero-copy Rx and Tx - support basic XDP functionalities - devlink health support for FW crashes and OTP mem corruptions - expand hardware stats coverage to FEC, PHY, and Pause - Wangxun: - support ethtool coalesce options - support for multiple RSS contexts - Ethernet virtual: - Macsec: - replace custom netlink attribute checks with policy-level checks - Bonding: - support aggregator selection based on port priority - Microsoft vNIC: - use page pool fragments for RX buffers instead of full pages to improve memory efficiency - Ethernet NICs consumer, and embedded: - Qualcomm: support Ethernet function for IPQ9574 SoC - Airoha: implement wlan offloading via NPU - Freescale - enetc: add NETC timer PTP driver and add PTP support - fec: enable the Jumbo frame support for i.MX8QM - Renesas (R-Car S4): - support HW offloading for layer 2 switching - support for RZ/{T2H, N2H} SoCs - Cadence (macb): support TAPRIO traffic scheduling - TI: - support for Gigabit ICSS ethernet SoC (icssm-prueth) - Synopsys (stmmac): a lot of cleanups - Ethernet PHYs: - Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS driver - Support bcm63268 GPHY power control - Support for Micrel lan8842 PHY and PTP - Support for Aquantia AQR412 and AQR115 - CAN: - a large CAN-XL preparation work - reorganize raw_sock and uniqframe struct to minimize memory usage - rcar_canfd: update the CAN-FD handling - WiFi: - extended Neighbor Awareness Networking (NAN) support - S1G channel representation cleanup - improve S1G support - WiFi drivers: - Intel (iwlwifi): - major refactor and cleanup - Broadcom (brcm80211): - support for AP isolation - RealTek (rtw88/89) rtw88/89: - preparation work for RTL8922DE support - MediaTek (mt76): - HW restart improvements - MLO support - Qualcomm/Atheros (ath10k): - GTK rekey fixes - Bluetooth drivers: - btusb: support for several new IDs for MT7925 - btintel: support for BlazarIW core - btintel_pcie: support for _suspend() / _resume() - btintel_pcie: support for Scorpious, Panther Lake-H484 IDs" * tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1536 commits) net: stmmac: Add support for Allwinner A523 GMAC200 dt-bindings: net: sun8i-emac: Add A523 GMAC200 compatible Revert "Documentation: net: add flow control guide and document ethtool API" octeontx2-pf: fix bitmap leak octeontx2-vf: fix bitmap leak net/mlx5e: Use extack in set rxfh callback net/mlx5e: Introduce mlx5e_rss_params for RSS configuration net/mlx5e: Introduce mlx5e_rss_init_params net/mlx5e: Remove unused mdev param from RSS indir init net/mlx5: Improve QoS error messages with actual depth values net/mlx5e: Prevent entering switchdev mode with inconsistent netns net/mlx5: HWS, Generalize complex matchers net/mlx5: Improve write-combining test reliability for ARM64 Grace CPUs selftests/net: add tcp_port_share to .gitignore Revert "net/mlx5e: Update and set Xon/Xoff upon MTU set" net: add NUMA awareness to skb_attempt_defer_free() net: use llist for sd->defer_list net: make softnet_data.defer_count an atomic selftests: drv-net: psp: add tests for destroying devices selftests: drv-net: psp: add test for auto-adjusting TCP MSS ...
12 daysMerge tag 'kcsan-20250929-v6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/melver/linux Pull Kernel Concurrency Sanitizer (KCSAN) update from Marco Elver: - Replace deprecated strcpy() with strscpy() * tag 'kcsan-20250929-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/melver/linux: kcsan: test: Replace deprecated strcpy() with strscpy()
13 daysMerge tag 'linux_kselftest-kunit-6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - New parameterized test features KUnit parameterized tests supported two primary methods for getting parameters: - Defining custom logic within a generate_params() function. - Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC() macros with a pre-defined static array and passing the created *_gen_params() to KUNIT_CASE_PARAM(). These methods present limitations when dealing with dynamically generated parameter arrays, or in scenarios where populating parameters sequentially via generate_params() is inefficient or overly complex. These limitations are fixed with a parameterized test method - Fix issues in kunit build artifacts cleanup - Fix parsing skipped test problem in kselftest framework - Enable PCI on UML without triggering WARN() - a few other fixes and adds support for new configs such as MIPS * tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Extend kconfig help text for KUNIT_UML_PCI rust: kunit: allow `cfg` on `test`s kunit: qemu_configs: Add MIPS configurations kunit: Enable PCI on UML without triggering WARN() Documentation: kunit: Document new parameterized test features kunit: Add example parameterized test with direct dynamic parameter array setup kunit: Add example parameterized test with shared resource management using the Resource API kunit: Enable direct registration of parameter arrays to a KUnit test kunit: Pass parameterized test context to generate_params() kunit: Introduce param_init/exit for parameterized test context management kunit: Add parent kunit for parameterized test context kunit: tool: Accept --raw_output=full as an alias of 'all' kunit: tool: Parse skipped tests from kselftest.h kunit: Always descend into kunit directory during build
13 daysMerge tag 'pm-6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The majority of these are cpufreq changes, which has been a recurring pattern for a few recent cycles. Those changes include new hardware support (AN7583 SoC support in the airoha cpufreq driver, ipq5424 support in the qcom-nvmem cpufreq driver, MT8196 support in the mediatek cpufreq driver, AM62D2 support in the ti cpufreq driver), DT bindings and Rust code updates, cleanups of the core and governors, and multiple driver fixes and cleanups. Beyond that, there are hibernation fixes (some remaining 6.16 cycle fallout and an issue related to hybrid suspend in the amdgpu driver), cleanups of the PM core code, runtime PM documentation update, cpuidle and power capping cleanups, and tooling updates. Specifics: - Rearrange variable declarations involving __free() in the cpufreq core and intel_pstate driver to follow common coding style (Rafael Wysocki) - Fix object lifecycle issue in update_qos_request(), rearrange freq QoS updates using __free(), and adjust frequency percentage computations in the intel_pstate driver (Rafael Wysocki) - Update intel_pstate to allow it to enable HWP without EPP if the new DEC (Dynamic Efficiency Control) HW feature is enabled (Rafael Wysocki) - Use on_each_cpu_mask() in drv_write() in the ACPI cpufreq driver to simplify the code (Rafael Wysocki) - Use likely() optimization in intel_pstate_sample() (Yaxiong Tian) - Remove dead EPB-related code from intel_pstate (Srinivas Pandruvada) - Use scope-based cleanup for cpufreq policy references in multiple cpufreq drivers (Zihuan Zhang) - Avoid calling get_governor() for the first policy in the cpufreq core to simplify the initial policy path (Zihuan Zhang) - Clean up the cpufreq core in multiple places (Zihuan Zhang) - Use int type to store negative error codes in the cpufreq core and update the speedstep-lib to use int for error codes (Qianfeng Rong) - Update the efficient idle check for Intel extended Families in the ondemand cpufreq governor (Sohil Mehta) - Replace sscanf() with kstrtouint() in the conservative cpufreq governor (Kaushlendra Kumar) - Rename CpumaskVar::as[_mut]_ref to from_raw[_mut] in the cpumask Rust code and mark CpumaskVar as transparent (Alice Ryhl, Baptiste Lepers) - Update ARef and AlwaysRefCounted imports from sync::aref in the OPP Rust code (Shankari Anand) - Add support for AN7583 SoC to the airoha cpufreq driver (Christian Marangi) - Enable cpufreq for ipq5424 in the qcom-nvmem cpufreq driver (Md Sadre Alam) - Add support for MT8196 to the mediatek-hw cpufreq driver, refactor that driver and add mediatek,mt8196-cpufreq-hw DT binding (Nicolas Frattaroli) - Avoid redundant conditions in the mediatek cpufreq driver (Liao Yuanhong) - Add support for AM62D2 to the ti cpufreq driver and blocklist ti,am62d2 SoC in dt-platdev (Paresh Bhagat) - Support more speed grades on AM62Px SoC in the ti cpufreq driver, allow all silicon revisions to support OPPs in it, and fix supported hardware for 1GHz OPP (Judith Mendez) - Add QCS615 compatible to DT bindings for cpufreq-qcom-hw (Taniya Das) - Minor assorted updates of the scmi, longhaul, CPPC, and armada-37xx cpufreq drivers (Akhilesh Patil, BowenYu, Dennis Beier, and Florian Fainelli) - Remove outdated cpufreq-dt.txt (Frank Li) - Fix python gnuplot package names in the amd_pstate_tracer utility (Kuan-Wei Chiu) - Saravana Kannan will maintain the virtual-cpufreq driver (Saravana Kannan) - Prevent CPU capacity updates after registering a perf domain from failing on a first CPU that is not present (Christian Loehle) - Add support for the cases in which frequency alone is not sufficient to uniquely identify an OPP (Krishna Chaitanya Chundru) - Use to_result() for OPP error handling in Rust (Onur Özkan) - Add support for LPDDR5 on Rockhip RK3588 SoC to rockchip-dfi devfreq driver (Nicolas Frattaroli) - Fix an issue where DDR cycle counts on RK3588/RK3528 with LPDDR4(X) are reported as half by adding a cycle multiplier to the DFI driver in rockchip-dfi devfreq-event driver (Nicolas Frattaroli) - Fix missing error pointer dereference check of regulator instance in the mtk-cci devfreq driver probe and remove a redundant condition from an if () statement in that driver (Dan Carpenter, Liao Yuanhong) - Fail cpuidle device registration if there is one already to avoid sysfs-related issues (Rafael Wysocki) - Use sysfs_emit()/sysfs_emit_at() instead of sprintf()/scnprintf() in cpuidle (Vivek Yadav) - Fix device and OF node leaks at probe in the qcom-spm cpuidle driver and drop unnecessary initialisations from it (Johan Hovold) - Remove unnecessary address-of operators from the intel_idle cpuidle driver (Kaushlendra Kumar) - Rearrange main loop in menu_select() to make the code in that funtion easier to follow (Rafael Wysocki) - Convert values in microseconds to ktime using us_to_ktime() where applicable in the intel_idle power capping driver (Xichao Zhao) - Annotate loops walking device links in the power management core code as _srcu and add macros for walking device links to reduce the likelihood of coding mistakes related to them (Rafael Wysocki) - Document time units for *_time functions in the runtime PM API (Brian Norris) - Clear power.must_resume in noirq suspend error path to avoid resuming a dependant device under a suspended parent or supplier (Rafael Wysocki) - Fix GFP mask handling during hybrid suspend and make the amdgpu driver handle hybrid suspend correctly (Mario Limonciello, Rafael Wysocki) - Fix GFP mask handling after aborted hibernation in platform mode and combine exit paths in power_down() to avoid code duplication (Rafael Wysocki) - Use vmalloc_array() and vcalloc() in the hibernation core to avoid open-coded size computations (Qianfeng Rong) - Fix typo in hibernation core code comment (Li Jun) - Call pm_wakeup_clear() in the same place where other functions that do bookkeeping prior to suspend_prepare() are called (Samuel Wu) - Fix and clean up the x86_energy_perf_policy utility and update its documentation (Len Brown, Kaushlendra Kumar) - Fix incorrect sorting of PMT telemetry in turbostat (Kaushlendra Kumar) - Fix incorrect size in cpuidle_state_disable() and the error return value of cpupower_write_sysfs() in cpupower (Kaushlendra Kumar)" * tag 'pm-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (86 commits) PM: hibernate: Combine return paths in power_down() PM: hibernate: Restrict GFP mask in power_down() PM: hibernate: Fix pm_hibernation_mode_is_suspend() build breakage PM: runtime: Documentation: ABI: Document time units for *_time tools/power x86_energy_perf_policy.8: Emphasize preference for SW interfaces tools/power x86_energy_perf_policy: Add make snapshot target tools/power x86_energy_perf_policy: Prefer driver HWP limits tools/power x86_energy_perf_policy: EPB access is only via sysfs tools/power x86_energy_perf_policy: Prepare for MSR/sysfs refactoring tools/power x86_energy_perf_policy: Enhance HWP enable tools/power x86_energy_perf_policy: Enhance HWP enabled check tools/power x86_energy_perf_policy: Fix incorrect fopen mode usage tools/power turbostat: Fix incorrect sorting of PMT telemetry drm/amd: Fix hybrid sleep PM: hibernate: Add pm_hibernation_mode_is_suspend() PM: hibernate: Fix hybrid-sleep tools/cpupower: Fix incorrect size in cpuidle_state_disable() tools/power/x86/amd_pstate_tracer: Fix python gnuplot package names cpufreq: Replace pointer subtraction with iteration macro cpuidle: Fail cpuidle device registration if there is one already ...
13 daysMerge tag 'driver-core-6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core updates from Danilo Krummrich: "Auxiliary: - Drop call to dev_pm_domain_detach() in auxiliary_bus_probe() - Optimize logic of auxiliary_match_id() Rust: - Auxiliary: - Use primitive C types from prelude - DebugFs: - Add debugfs support for simple read/write files and custom callbacks through a File-type-based and directory-scope-based API - Sample driver code for the File-type-based API - Sample module code for the directory-scope-based API - I/O: - Add io::poll module and implement Rust specific read_poll_timeout() helper - IRQ: - Implement support for threaded and non-threaded device IRQs based on (&Device<Bound>, IRQ number) tuples (IrqRequest) - Provide &Device<Bound> cookie in IRQ handlers - PCI: - Support IRQ requests from IRQ vectors for a specific pci::Device<Bound> - Implement accessors for subsystem IDs, revision, devid and resource start - Provide dedicated pci::Vendor and pci::Class types for vendor and class ID numbers - Implement Display to print actual vendor and class names; Debug to print the raw ID numbers - Add pci::DeviceId::from_class_and_vendor() helper - Use primitive C types from prelude - Various minor inline and (safety) comment improvements - Platform: - Support IRQ requests from IRQ vectors for a specific platform::Device<Bound> - Nova: - Use pci::DeviceId::from_class_and_vendor() to avoid probing non-display/compute PCI functions - Misc: - Add helper for cpu_relax() - Update ARef import from sync::aref sysfs: - Remove bin_attrs_new field from struct attribute_group - Remove read_new() and write_new() from struct bin_attribute Misc: - Document potential race condition in get_dev_from_fwnode() - Constify node_group argument in software node registration functions - Fix order of kernel-doc parameters in various functions - Set power.no_pm flag for faux devices - Set power.no_callbacks flag along with the power.no_pm flag - Constify the pmu_bus bus type - Minor spelling fixes" * tag 'driver-core-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (43 commits) rust: pci: display symbolic PCI vendor names rust: pci: display symbolic PCI class names rust: pci: fix incorrect platform reference in PCI driver probe doc comment rust: pci: fix incorrect platform reference in PCI driver unbind doc comment perf: make pmu_bus const samples: rust: Add scoped debugfs sample driver rust: debugfs: Add support for scoped directories samples: rust: Add debugfs sample driver rust: debugfs: Add support for callback-based files rust: debugfs: Add support for writable files rust: debugfs: Add support for read-only files rust: debugfs: Add initial support for directories driver core: auxiliary bus: Optimize logic of auxiliary_match_id() driver core: auxiliary bus: Drop dev_pm_domain_detach() call driver core: Fix order of the kernel-doc parameters driver core: get_dev_from_fwnode(): document potential race drivers: base: fix "publically"->"publicly" driver core/PM: Set power.no_callbacks along with power.no_pm driver core: faux: Set power.no_pm for faux devices rust: pci: inline several tiny functions ...
14 daysMerge tag 'bpf-next-6.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support pulling non-linear xdp data with bpf_xdp_pull_data() kfunc (Amery Hung) Applied as a stable branch in bpf-next and net-next trees. - Support reading skb metadata via bpf_dynptr (Jakub Sitnicki) Also a stable branch in bpf-next and net-next trees. - Enforce expected_attach_type for tailcall compatibility (Daniel Borkmann) - Replace path-sensitive with path-insensitive live stack analysis in the verifier (Eduard Zingerman) This is a significant change in the verification logic. More details, motivation, long term plans are in the cover letter/merge commit. - Support signed BPF programs (KP Singh) This is another major feature that took years to materialize. Algorithm details are in the cover letter/marge commit - Add support for may_goto instruction to s390 JIT (Ilya Leoshkevich) - Add support for may_goto instruction to arm64 JIT (Puranjay Mohan) - Fix USDT SIB argument handling in libbpf (Jiawei Zhao) - Allow uprobe-bpf program to change context registers (Jiri Olsa) - Support signed loads from BPF arena (Kumar Kartikeya Dwivedi and Puranjay Mohan) - Allow access to union arguments in tracing programs (Leon Hwang) - Optimize rcu_read_lock() + migrate_disable() combination where it's used in BPF subsystem (Menglong Dong) - Introduce bpf_task_work_schedule*() kfuncs to schedule deferred execution of BPF callback in the context of a specific task using the kernel’s task_work infrastructure (Mykyta Yatsenko) - Enforce RCU protection for KF_RCU_PROTECTED kfuncs (Kumar Kartikeya Dwivedi) - Add stress test for rqspinlock in NMI (Kumar Kartikeya Dwivedi) - Improve the precision of tnum multiplier verifier operation (Nandakumar Edamana) - Use tnums to improve is_branch_taken() logic (Paul Chaignon) - Add support for atomic operations in arena in riscv JIT (Pu Lehui) - Report arena faults to BPF error stream (Puranjay Mohan) - Search for tracefs at /sys/kernel/tracing first in bpftool (Quentin Monnet) - Add bpf_strcasecmp() kfunc (Rong Tao) - Support lookup_and_delete_elem command in BPF_MAP_STACK_TRACE (Tao Chen) * tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (197 commits) libbpf: Replace AF_ALG with open coded SHA-256 selftests/bpf: Add stress test for rqspinlock in NMI selftests/bpf: Add test case for different expected_attach_type bpf: Enforce expected_attach_type for tailcall compatibility bpftool: Remove duplicate string.h header bpf: Remove duplicate crypto/sha2.h header libbpf: Fix error when st-prefix_ops and ops from differ btf selftests/bpf: Test changing packet data from kfunc selftests/bpf: Add stacktrace map lookup_and_delete_elem test case selftests/bpf: Refactor stacktrace_map case with skeleton bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACE selftests/bpf: Fix flaky bpf_cookie selftest selftests/bpf: Test changing packet data from global functions with a kfunc bpf: Emit struct bpf_xdp_sock type in vmlinux BTF selftests/bpf: Task_work selftest cleanup fixes MAINTAINERS: Delete inactive maintainers from AF_XDP bpf: Mark kfuncs as __noclone selftests/bpf: Add kprobe multi write ctx attach test selftests/bpf: Add kprobe write ctx attach test selftests/bpf: Add uprobe context ip register change test ...
14 daysMerge tag 'timers-vdso-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull VDSO updates from Thomas Gleixner: - Further consolidation of the VDSO infrastructure and the common data store - Simplification of the related Kconfig logic - Improve the VDSO selftest suite * tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests: vDSO: Drop vdso_test_clock_getres selftests: vDSO: vdso_test_abi: Add tests for clock_gettime64() selftests: vDSO: vdso_test_abi: Test CPUTIME clocks selftests: vDSO: vdso_test_abi: Use explicit indices for name array selftests: vDSO: vdso_test_abi: Drop clock availability tests selftests: vDSO: vdso_test_abi: Use ksft_finished() selftests: vDSO: vdso_test_abi: Correctly skip whole test with missing vDSO selftests: vDSO: Fix -Wunitialized in powerpc VDSO_CALL() wrapper vdso: Add struct __kernel_old_timeval forward declaration to gettime.h vdso: Gate VDSO_GETRANDOM behind HAVE_GENERIC_VDSO vdso: Drop Kconfig GENERIC_VDSO_TIME_NS vdso: Drop Kconfig GENERIC_VDSO_DATA_STORE vdso: Drop kconfig GENERIC_COMPAT_VDSO vdso: Drop kconfig GENERIC_VDSO_32 riscv: vdso: Untangle Kconfig logic time: Build generic update_vsyscall() only with generic time vDSO vdso/gettimeofday: Remove !CONFIG_TIME_NS stubs vdso: Move ENABLE_COMPAT_VDSO from core to arm64 ARM: VDSO: Remove cntvct_ok global variable vdso/datastore: Gate time data behind CONFIG_GENERIC_GETTIMEOFDAY
14 daysMerge tag 'timers-clocksource-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull clocksource updates from Thomas Gleixner: - Further preparations for modular clocksource/event drivers - The usual device tree updates to support new chip variants and the related changes to thise drivers - Avoid a 64-bit division in the TEGRA186 driver, which caused a build fail on 32-bit machines. - Small fixes, improvements and cleanups all over the place * tag 'timers-clocksource-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) dt-bindings: timer: exynos4210-mct: Add compatible for ARTPEC-9 SoC clocksource/drivers/sh_cmt: Split start/stop of clock source and events clocksource/drivers/clps711x: Fix resource leaks in error paths clocksource/drivers/arm_global_timer: Add auto-detection for initial prescaler values clocksource/drivers/ingenic-sysost: Convert from round_rate() to determine_rate() clocksource/drivers/timer-tegra186: Don't print superfluous errors clocksource/drivers/timer-rtl-otto: Simplify documentation clocksource/drivers/timer-rtl-otto: Do not interfere with interrupts clocksource/drivers/timer-rtl-otto: Drop set_counter function clocksource/drivers/timer-rtl-otto: Work around dying timers clocksource/drivers/timer-ti-dm : Capture functionality for OMAP DM timer clocksource/drivers/arm_arch_timer_mmio: Add MMIO clocksource clocksource/drivers/arm_arch_timer_mmio: Switch over to standalone driver clocksource/drivers/arm_arch_timer: Add standalone MMIO driver ACPI: GTDT: Generate platform devices for MMIO timers clocksource/drivers/nxp-pit: Add NXP Automotive s32g2 / s32g3 support dt: bindings: fsl,vf610-pit: Add compatible for s32g2 and s32g3 clocksource/drivers/vf-pit: Rename the VF PIT to NXP PIT clocksource/drivers/vf-pit: Unify the function name for irq ack clocksource/drivers/vf-pit: Consolidate calls to pit_*_disable/enable ...
14 daysMerge tag 'timers-core-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: - Address the inconsistent shutdown sequence of per CPU clockevents on CPU hotplug, which only removed it from the core but failed to invoke the actual device driver shutdown callback. This kept the timer active, which prevented power savings and caused pointless noise in virtualization. - Encapsulate the open coded access to the hrtimer clock base, which is a private implementation detail, so that the implementation can be changed without breaking a lot of usage sites. - Enhance the debug output of the clocksource watchdog to provide better information for analysis. - The usual set of cleanups and enhancements all over the place * tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Fix spelling mistakes in comments clocksource: Print durations for sync check unconditionally LoongArch: Remove clockevents shutdown call on offlining tick: Do not set device to detached state in tick_shutdown() hrtimer: Reorder branches in hrtimer_clockid_to_base() hrtimer: Remove hrtimer_clock_base:: Get_time hrtimer: Use hrtimer_cb_get_time() helper media: pwm-ir-tx: Avoid direct access to hrtimer clockbase ALSA: hrtimer: Avoid direct access to hrtimer clockbase lib: test_objpool: Avoid direct access to hrtimer clockbase sched/core: Avoid direct access to hrtimer clockbase timers/itimer: Avoid direct access to hrtimer clockbase posix-timers: Avoid direct access to hrtimer clockbase jiffies: Remove obsolete SHIFTED_HZ comment
14 daysMerge tag 'locking-futex-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex updates from Thomas Gleixner: "A set of updates for futexes and related selftests: - Plug the ptrace_may_access() race against a concurrent exec() which allows to pass the check before the target's process transition in exec() by taking a read lock on signal->ext_update_lock. - A large set of cleanups and enhancement to the futex selftests. The bulk of changes is the conversion to the kselftest harness" * tag 'locking-futex-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) selftest/futex: Fix spelling mistake "boundarie" -> "boundary" selftests/futex: Remove logging.h file selftests/futex: Drop logging.h include from futex_numa selftests/futex: Refactor futex_numa_mpol with kselftest_harness.h selftests/futex: Refactor futex_priv_hash with kselftest_harness.h selftests/futex: Refactor futex_waitv with kselftest_harness.h selftests/futex: Refactor futex_requeue with kselftest_harness.h selftests/futex: Refactor futex_wait with kselftest_harness.h selftests/futex: Refactor futex_wait_private_mapped_file with kselftest_harness.h selftests/futex: Refactor futex_wait_unitialized_heap with kselftest_harness.h selftests/futex: Refactor futex_wait_wouldblock with kselftest_harness.h selftests/futex: Refactor futex_wait_timeout with kselftest_harness.h selftests/futex: Refactor futex_requeue_pi_signal_restart with kselftest_harness.h selftests/futex: Refactor futex_requeue_pi_mismatched_ops with kselftest_harness.h selftests/futex: Refactor futex_requeue_pi with kselftest_harness.h selftests: kselftest: Create ksft_print_dbg_msg() futex: Don't leak robust_list pointer on exec race selftest/futex: Compile also with libnuma < 2.0.16 selftest/futex: Reintroduce "Memory out of range" numa_mpol's subtest selftest/futex: Make the error check more precise for futex_numa_mpol ...
14 daysMerge tag 'smp-core-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp doc fixlet from Thomas Gleixner: "An update of the stale smp_call_function_many() documentation to bring it back in sync with the actual implementation" * tag 'smp-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Fix up and expand the smp_call_function_many() kerneldoc
14 daysMerge tag 'irq-core-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq core updates from Thomas Gleixner: "A set of updates for the interrupt core subsystem: - Introduce irq_chip_[startup|shutdown]_parent() to prepare for addressing a few short comings in the PCI/MSI interrupt subsystem. It allows to utilize the interrupt chip startup/shutdown callbacks for initializing the interrupt chip hierarchy properly on certain RISCV implementations and provides a mechanism to reduce the overhead of masking and unmasking PCI/MSI interrupts during operation when the underlying MSI provider can mask the interrupt. The actual usage comes with the interrupt driver pull request. - Add generic error handling for devm_request_*_irq() This allows to remove the zoo of random error printk's all over the usage sites. - Add a mechanism to warn about long-running interrupt handlers Long running interrupt handlers can introduce latencies and tracking them down is a tedious task. The tracking has to be enabled with a threshold on the kernel command line and utilizes a static branch to remove the overhead when disabled. - Update and extend the selftests which validate the CPU hotplug interrupt migration logic - Allow dropping the per CPU softirq lock on PREEMPT_RT kernels, which causes contention and latencies all over the place. The serialization requirements have been pushed down into the actual affected usage sites already. - The usual small cleanups and improvements" * tag 'irq-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT softirq: Provide a handshake for canceling tasklets via polling genirq/test: Ensure CPU 1 is online for hotplug test genirq/test: Drop CONFIG_GENERIC_IRQ_MIGRATION assumptions genirq/test: Depend on SPARSE_IRQ genirq/test: Fail early if interrupt request fails genirq/test: Factor out fake-virq setup genirq/test: Select IRQ_DOMAIN genirq/test: Fix depth tests on architectures with NOREQUEST by default. genirq: Add support for warning on long-running interrupt handlers genirq/devres: Add error handling in devm_request_*_irq() genirq: Add irq_chip_(startup/shutdown)_parent() genirq: Remove GENERIC_IRQ_LEGACY
14 daysMerge tag 'core-rseq-2025-09-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull rseq updates from Thomas Gleixner: "Two fixes for RSEQ: - Protect the event mask modification against the membarrier() IPI as otherwise the RmW operation is unprotected and events might be lost - Fix the weak symbol reference in rseq selftests The current weak RSEQ symbols definitions which were added to allow static linkage are not working correctly as they effectively re-define the glibc symbols leading to multiple versions of the symbols when compiled with -fno-common. Mark them as 'extern' to convert them from weak symbol definitions to weak symbol references. That works with static and dynamic linkage independent of -fcommon and -fno-common" * tag 'core-rseq-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq/selftests: Use weak symbol reference, not definition, to link with glibc rseq: Protect event mask against membarrier IPI
2025-09-30Merge tag 'perf-core-2025-09-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Core perf code updates: - Convert mmap() related reference counts to refcount_t. This is in reaction to the recently fixed refcount bugs, which could have been detected earlier and could have mitigated the bug somewhat (Thomas Gleixner, Peter Zijlstra) - Clean up and simplify the callchain code, in preparation for sframes (Steven Rostedt, Josh Poimboeuf) Uprobes updates: - Add support to optimize usdt probes on x86-64, which gives a substantial speedup (Jiri Olsa) - Cleanups and fixes on x86 (Peter Zijlstra) PMU driver updates: - Various optimizations and fixes to the Intel PMU driver (Dapeng Mi) Misc cleanups and fixes: - Remove redundant __GFP_NOWARN (Qianfeng Rong)" * tag 'perf-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) selftests/bpf: Fix uprobe_sigill test for uprobe syscall error value uprobes/x86: Return error from uprobe syscall when not called from trampoline perf: Skip user unwind if the task is a kernel thread perf: Simplify get_perf_callchain() user logic perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL perf: Have get_perf_callchain() return NULL if crosstask and user are set perf: Remove get_perf_callchain() init_nr argument perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error perf/x86/intel: Use early_initcall() to hook bts_init() uprobes: Remove redundant __GFP_NOWARN selftests/seccomp: validate uprobe syscall passes through seccomp seccomp: passthrough uprobe systemcall without filtering selftests/bpf: Fix uprobe syscall shadow stack test selftests/bpf: Change test_uretprobe_regs_change for uprobe and uretprobe selftests/bpf: Add uprobe_regs_equal test selftests/bpf: Add optimized usdt variant for basic usdt test ...
2025-09-30Merge tag 'sched-core-2025-09-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core scheduler changes: - Make migrate_{en,dis}able() inline, to improve performance (Menglong Dong) - Move STDL_INIT() functions out-of-line (Peter Zijlstra) - Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra) Fair scheduling: - Defer throttling to when tasks exit to user-space, to reduce the chance & impact of throttle-preemption with held locks and other resources (Aaron Lu, Valentin Schneider) - Get rid of sched_domains_curr_level hack for tl->cpumask(), as the warning was getting triggered on certain topologies (Peter Zijlstra) Misc cleanups & fixes: - Header cleanups (Menglong Dong) - Fix race in push_dl_task() (Harshit Agarwal)" * tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix some typos in include/linux/preempt.h sched: Make migrate_{en,dis}able() inline rcu: Replace preempt.h with sched.h in include/linux/rcupdate.h arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c sched/fair: Do not balance task to a throttled cfs_rq sched/fair: Do not special case tasks in throttled hierarchy sched/fair: update_cfs_group() for throttled cfs_rqs sched/fair: Propagate load for throttled cfs_rq sched/fair: Get rid of throttled_lb_pair() sched/fair: Task based throttle time accounting sched/fair: Switch to task based throttle model sched/fair: Implement throttle task work and related helpers sched/fair: Add related data structure for task based throttle sched: Unify the SCHED_{SMT,CLUSTER,MC} Kconfig sched: Move STDL_INIT() functions out-of-line sched/fair: Get rid of sched_domains_curr_level hack for tl->cpumask() sched/deadline: Fix race in push_dl_task()
2025-09-30Merge tag 'cgroup-for-6.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Extensive cpuset code cleanup and refactoring work with no functional changes: CPU mask computation logic refactoring, introducing new helpers, removing redundant code paths, and improving error handling for better maintainability. - A few bug fixes to cpuset including fixes for partition creation failures when isolcpus is in use, missing error returns, and null pointer access prevention in free_tmpmasks(). - Core cgroup changes include replacing the global percpu_rwsem with per-threadgroup rwsem when writing to cgroup.procs for better scalability, workqueue conversions to use WQ_PERCPU and system_percpu_wq to prepare for workqueue default switching from percpu to unbound, and removal of unused code including the post_attach callback. - New cgroup.stat.local time accounting feature that tracks frozen time duration. - Misc changes including selftests updates (new freezer time tests and backward compatibility fixes), documentation sync, string function safety improvements, and 64-bit division fixes. * tag 'cgroup-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (39 commits) cpuset: remove is_prs_invalid helper cpuset: remove impossible warning in update_parent_effective_cpumask cpuset: remove redundant special case for null input in node mask update cpuset: fix missing error return in update_cpumask cpuset: Use new excpus for nocpu error check when enabling root partition cpuset: fix failure to enable isolated partition when containing isolcpus Documentation: cgroup-v2: Sync manual toctree cpuset: use partition_cpus_change for setting exclusive cpus cpuset: use parse_cpulist for setting cpus.exclusive cpuset: introduce partition_cpus_change cpuset: refactor cpus_allowed_validate_change cpuset: refactor out validate_partition cpuset: introduce cpus_excl_conflict and mems_excl_conflict helpers cpuset: refactor CPU mask buffer parsing logic cpuset: Refactor exclusive CPU mask computation logic cpuset: change return type of is_partition_[in]valid to bool cpuset: remove unused assignment to trialcs->partition_root_state cpuset: move the root cpuset write check earlier cgroup/cpuset: Remove redundant rcu_read_lock/unlock() in spin_lock cgroup: Remove redundant rcu_read_lock/unlock() in spin_lock ...
2025-09-30Merge tag 'wq-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: - WQ_PERCPU was added to remaining alloc_workqueue() users and system_wq usage was replaced with system_percpu_wq and system_unbound_wq with system_dfl_wq. These are equivalent conversions with no functional changes, preparing for switching default to unbound workqueues from percpu. - A handshake mechanism was added for canceling BH workers to avoid live lock scenarios under PREEMPT_RT. - Unnecessary rcu_read_lock/unlock() calls were dropped in wq_watchdog_timer_fn() and workqueue_congested(). - Documentation was fixed to resolve texinfodocs warnings. * tag 'wq-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix texinfodocs warning for WQ_* flags reference workqueue: WQ_PERCPU added to alloc_workqueue users workqueue: replace use of system_wq with system_percpu_wq workqueue: replace use of system_unbound_wq with system_dfl_wq workqueue: Provide a handshake for canceling BH workers workqueue: Remove rcu_read_lock/unlock() in wq_watchdog_timer_fn() workqueue: Remove redundant rcu_read_lock/unlock() in workqueue_congested()
2025-09-30Merge tag 'sched_ext-for-6.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Code organization cleanup. Separate internal types and accessors to ext_internal.h to reduce the size of ext.c and improve maintainability. - Prepare for cgroup sub-scheduler support by adding @sch parameter to various functions and helpers, reorganizing scheduler instance handling, and dropping obsolete helpers like scx_kf_exit() and kf_cpu_valid(). - Add new scx_bpf_cpu_curr() and scx_bpf_locked_rq() BPF helpers to provide safer access patterns with proper RCU protection. scx_bpf_cpu_rq() is deprecated with warnings due to potential race conditions. - Improve debugging with migration-disabled counter in error state dumps, SCX_EFLAG_INITIALIZED flag, bitfields for warning flags, and other enhancements to help diagnose issues. - Use cgroup_lock/unlock() for cgroup synchronization instead of scx_cgroup_rwsem based synchronization. This is simpler and allows enable/disable paths to synchronize against cgroup changes independent of the CPU controller. - rhashtable_lookup() replacement to avoid redundant RCU locking was reverted due to RCU usage warnings. Will be redone once rhashtable is updated to use rcu_dereference_all(). - Other misc updates and fixes including bypass handling improvements, scx_task_iter_relock() improvements, tools/sched_ext updates, and compatibility helpers. * tag 'sched_ext-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (28 commits) Revert "sched_ext: Use rhashtable_lookup() instead of rhashtable_lookup_fast()" sched_ext: Misc updates around scx_sched instance pointer sched_ext: Drop scx_kf_exit() and scx_kf_error() sched_ext: Add the @sch parameter to scx_dsq_insert_preamble/commit() sched_ext: Drop kf_cpu_valid() sched_ext: Add the @sch parameter to ext_idle helpers sched_ext: Add the @sch parameter to __bstr_format() sched_ext: Separate out scx_kick_cpu() and add @sch to it tools/sched_ext: scx_qmap: Make debug output quieter by default sched_ext: Make qmap dump operation non-destructive sched_ext: Add SCX_EFLAG_INITIALIZED to indicate successful ops.init() sched_ext: Use bitfields for boolean warning flags sched_ext: Fix stray scx_root usage in task_can_run_on_remote_rq() sched_ext: Improve SCX_KF_DISPATCH comment sched_ext: Use rhashtable_lookup() instead of rhashtable_lookup_fast() sched_ext: Verify RCU protection in scx_bpf_cpu_curr() sched_ext: Add migration-disabled counter to error state dump sched_ext: Fix NULL dereference in scx_bpf_cpu_rq() warning tools/sched_ext: Add compat helper for scx_bpf_cpu_curr() sched_ext: deprecation warn for scx_bpf_cpu_rq() ...
2025-09-30Merge tag 'audit-pr-20250926' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: - Proper audit support for multiple LSMs As the audit subsystem predated the work to enable multiple LSMs, some additional work was needed to support logging the different LSM labels for the subjects/tasks and objects on the system. Casey's patches add new auxillary records for subjects and objects that convey the additional labels. - Ensure fanotify audit events are always generated Generally speaking security relevant subsystems always generate audit events, unless explicitly ignored. However, up to this point fanotify events had been ignored by default, but starting with this pull request fanotify follows convention and generates audit events by default. - Replace an instance of strcpy() with strscpy() - Minor indentation, style, and comment fixes * tag 'audit-pr-20250926' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: fix skb leak when audit rate limit is exceeded audit: init ab->skb_list earlier in audit_buffer_alloc() audit: add record for multiple object contexts audit: add record for multiple task security contexts lsm: security_lsmblob_to_secctx module selection audit: create audit_stamp structure audit: add a missing tab audit: record fanotify event regardless of presence of rules audit: fix typo in auditfilter.c comment audit: Replace deprecated strcpy() with strscpy() audit: fix indentation in audit_log_exit()
2025-09-29Merge tag 'powerpc-6.18-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Madhavan Srinivasan: - powerpc support for BPF arena and arena atomics - Patches to switch to msi parent domain (per-device MSI domains) - Add a lock contention tracepoint in the queued spinlock slowpath - Fixes for underflow in pseries/powernv msi and pci paths - Switch from legacy-of-mm-gpiochip dependency to platform driver - Fixes for handling TLB misses - Introduce support for powerpc papr-hvpipe - Add vpa-dtl PMU driver for pseries platform - Misc fixes and cleanups Thanks to Aboorva Devarajan, Aditya Bodkhe, Andrew Donnellan, Athira Rajeev, Cédric Le Goater, Christophe Leroy, Erhard Furtner, Gautam Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joe Lawrence, Kajol Jain, Kienan Stewart, Linus Walleij, Mahesh Salgaonkar, Nam Cao, Nicolas Schier, Nysal Jan K.A., Ritesh Harjani (IBM), Ruben Wauters, Saket Kumar Bhaskar, Shashank MS, Shrikanth Hegde, Tejas Manhas, Thomas Gleixner, Thomas Huth, Thorsten Blum, Tyrel Datwyler, and Venkat Rao Bagalkote. * tag 'powerpc-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (49 commits) powerpc/pseries: Define __u{8,32} types in papr_hvpipe_hdr struct genirq/msi: Remove msi_post_free() powerpc/perf/vpa-dtl: Add documentation for VPA dispatch trace log PMU powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake up is needed powerpc/perf/vpa-dtl: Add support to capture DTL data in aux buffer powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for capturing DTL data docs: ABI: sysfs-bus-event_source-devices-vpa-dtl: Document sysfs event format entries for vpa_dtl pmu powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perf powerpc/time: Expose boot_tb via accessor powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup failure powerpc/fprobe: fix updated fprobe for function-graph tracer powerpc/ftrace: support CONFIG_FUNCTION_GRAPH_RETVAL powerpc64/modules: replace stub allocation sentinel with an explicit counter powerpc64/modules: correctly iterate over stubs in setup_ftrace_ool_stubs powerpc/ftrace: ensure ftrace record ops are always set for NOPs powerpc/603: Really copy kernel PGD entries into all PGDIRs powerpc/8xx: Remove left-over instruction and comments in DataStoreTLBMiss handler powerpc/pseries: HVPIPE changes to support migration powerpc/pseries: Enable hvpipe with ibm,set-system-parameter RTAS powerpc/pseries: Enable HVPIPE event message interrupt ...
2025-09-29Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's good stuff across the board, including some nice mm improvements for CPUs with the 'noabort' BBML2 feature and a clever patch to allow ptdump to play nicely with block mappings in the vmalloc area. Confidential computing: - Add support for accepting secrets from firmware (e.g. ACPI CCEL) and mapping them with appropriate attributes. CPU features: - Advertise atomic floating-point instructions to userspace - Extend Spectre workarounds to cover additional Arm CPU variants - Extend list of CPUs that support break-before-make level 2 and guarantee not to generate TLB conflict aborts for changes of mapping granularity (BBML2_NOABORT) - Add GCS support to our uprobes implementation. Documentation: - Remove bogus SME documentation concerning register state when entering/exiting streaming mode. Entry code: - Switch over to the generic IRQ entry code (GENERIC_IRQ_ENTRY) - Micro-optimise syscall entry path with a compiler branch hint. Memory management: - Enable huge mappings in vmalloc space even when kernel page-table dumping is enabled - Tidy up the types used in our early MMU setup code - Rework rodata= for closer parity with the behaviour on x86 - For CPUs implementing BBML2_NOABORT, utilise block mappings in the linear map even when rodata= applies to virtual aliases - Don't re-allocate the virtual region between '_text' and '_stext', as doing so confused tools parsing /proc/vmcore. Miscellaneous: - Clean-up Kconfig menuconfig text for architecture features - Avoid redundant bitmap_empty() during determination of supported SME vector lengths - Re-enable warnings when building the 32-bit vDSO object - Avoid breaking our eggs at the wrong end. Perf and PMUs: - Support for v3 of the Hisilicon L3C PMU - Support for Hisilicon's MN and NoC PMUs - Support for Fujitsu's Uncore PMU - Support for SPE's extended event filtering feature - Preparatory work to enable data source filtering in SPE - Support for multiple lanes in the DWC PCIe PMU - Support for i.MX94 in the IMX DDR PMU driver - MAINTAINERS update (Thank you, Yicong) - Minor driver fixes (PERF_IDX2OFF() overflow, CMN register offsets). Selftests: - Add basic LSFE check to the existing hwcaps test - Support nolibc in GCS tests - Extend SVE ptrace test to pass unsupported regsets and invalid vector lengths - Minor cleanups (typos, cosmetic changes). System registers: - Fix ID_PFR1_EL1 definition - Fix incorrect signedness of some fields in ID_AA64MMFR4_EL1 - Sync TCR_EL1 definition with the latest Arm ARM (L.b) - Be stricter about the input fed into our AWK sysreg generator script - Typo fixes and removal of redundant definitions. ACPI, EFI and PSCI: - Decouple Arm's "Software Delegated Exception Interface" (SDEI) support from the ACPI GHES code so that it can be used by platforms booted with device-tree - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI runtime calls - Fix a node refcount imbalance in the PSCI device-tree code. CPU Features: - Ensure register sanitisation is applied to fields in ID_AA64MMFR4 - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM guests can reliably query the underlying CPU types from the VMM - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes to our context-switching, signal handling and ptrace code" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits) arm64: cpufeature: Remove duplicate asm/mmu.h header arm64: Kconfig: Make CPU_BIG_ENDIAN depend on BROKEN perf/dwc_pcie: Fix use of uninitialized variable arm/syscalls: mark syscall invocation as likely in invoke_syscall Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU Documentation: hisi-pmu: Fix of minor format error drivers/perf: hisi: Add support for L3C PMU v3 drivers/perf: hisi: Refactor the event configuration of L3C PMU drivers/perf: hisi: Extend the field of tt_core drivers/perf: hisi: Extract the event filter check of L3C PMU drivers/perf: hisi: Simplify the probe process of each L3C PMU version drivers/perf: hisi: Export hisi_uncore_pmu_isr() drivers/perf: hisi: Relax the event ID check in the framework perf: Fujitsu: Add the Uncore PMU driver arm64: map [_text, _stext) virtual address range non-executable+read-only arm64/sysreg: Update TCR_EL1 register arm64: Enable vmalloc-huge with ptdump arm64: cpufeature: add Neoverse-V3AE to BBML2 allow list arm64: errata: Apply workarounds for Neoverse-V3AE arm64: cputype: Add Neoverse-V3AE definitions ...
2025-09-29Merge tag 'hardening-v6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "One notable addition is the creation of the 'transitional' keyword for kconfig so CONFIG renaming can go more smoothly. This has been a long-standing deficiency, and with the renaming of CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI support), this came up again. The breadth of the diffstat is mainly this renaming. - Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva) - lkdtm: fortify: Fix potential NULL dereference on kmalloc failure (Junjie Cao) - Add str_assert_deassert() helper (Lad Prabhakar) - gcc-plugins: Remove TODO_verify_il for GCC >= 16 - kconfig: Fix BrokenPipeError warnings in selftests - kconfig: Add transitional symbol attribute for migration support - kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI" * tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lib/string_choices: Add str_assert_deassert() helper kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI kconfig: Add transitional symbol attribute for migration support kconfig: Fix BrokenPipeError warnings in selftests gcc-plugins: Remove TODO_verify_il for GCC >= 16 stddef: Introduce __TRAILING_OVERLAP() stddef: Remove token-pasting in TRAILING_OVERLAP() lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
2025-09-29Merge tag 'seccomp-v6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp update from Kees Cook: - Fix race with WAIT_KILLABLE_RECV (Johannes Nixdorf) * tag 'seccomp-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Add a test for the WAIT_KILLABLE_RECV fast reply race seccomp: Fix a race with WAIT_KILLABLE_RECV if the tracer replies too fast
2025-09-29Merge tag 'vfs-6.18-rc1.async' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs async directory updates from Christian Brauner: "This contains further preparatory changes for the asynchronous directory locking scheme: - Add lookup_one_positive_killable() which allows overlayfs to perform lookup that won't block on a fatal signal - Unify the mount idmap handling in struct renamedata as a rename can only happen within a single mount - Introduce kern_path_parent() for audit which sets the path to the parent and returns a dentry for the target without holding any locks on return - Rename kern_path_locked() as it is only used to prepare for the removal of an object from the filesystem: kern_path_locked() => start_removing_path() kern_path_create() => start_creating_path() user_path_create() => start_creating_user_path() user_path_locked_at() => start_removing_user_path_at() done_path_create() => end_creating_path() NA => end_removing_path()" * tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: debugfs: rename start_creating() to debugfs_start_creating() VFS: rename kern_path_locked() and related functions. VFS/audit: introduce kern_path_parent() for audit VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata VFS: discard err2 in filename_create() VFS/ovl: add lookup_one_positive_killable()
2025-09-29Merge tag 'namespace-6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace updates from Christian Brauner: "This contains a larger set of changes around the generic namespace infrastructure of the kernel. Each specific namespace type (net, cgroup, mnt, ...) embedds a struct ns_common which carries the reference count of the namespace and so on. We open-coded and cargo-culted so many quirks for each namespace type that it just wasn't scalable anymore. So given there's a bunch of new changes coming in that area I've started cleaning all of this up. The core change is to make it possible to correctly initialize every namespace uniformly and derive the correct initialization settings from the type of the namespace such as namespace operations, namespace type and so on. This leaves the new ns_common_init() function with a single parameter which is the specific namespace type which derives the correct parameters statically. This also means the compiler will yell as soon as someone does something remotely fishy. The ns_common_init() addition also allows us to remove ns_alloc_inum() and drops any special-casing of the initial network namespace in the network namespace initialization code that Linus complained about. Another part is reworking the reference counting. The reference counting was open-coded and copy-pasted for each namespace type even though they all followed the same rules. This also removes all open accesses to the reference count and makes it private and only uses a very small set of dedicated helpers to manipulate them just like we do for e.g., files. In addition this generalizes the mount namespace iteration infrastructure introduced a few cycles ago. As reminder, the vfs makes it possible to iterate sequentially and bidirectionally through all mount namespaces on the system or all mount namespaces that the caller holds privilege over. This allow userspace to iterate over all mounts in all mount namespaces using the listmount() and statmount() system call. Each mount namespace has a unique identifier for the lifetime of the systems that is exposed to userspace. The network namespace also has a unique identifier working exactly the same way. This extends the concept to all other namespace types. The new nstree type makes it possible to lookup namespaces purely by their identifier and to walk the namespace list sequentially and bidirectionally for all namespace types, allowing userspace to iterate through all namespaces. Looking up namespaces in the namespace tree works completely locklessly. This also means we can move the mount namespace onto the generic infrastructure and remove a bunch of code and members from struct mnt_namespace itself. There's a bunch of stuff coming on top of this in the future but for now this uses the generic namespace tree to extend a concept introduced first for pidfs a few cycles ago. For a while now we have supported pidfs file handles for pidfds. This has proven to be very useful. This extends the concept to cover namespaces as well. It is possible to encode and decode namespace file handles using the common name_to_handle_at() and open_by_handle_at() apis. As with pidfs file handles, namespace file handles are exhaustive, meaning it is not required to actually hold a reference to nsfs in able to decode aka open_by_handle_at() a namespace file handle. Instead the FD_NSFS_ROOT constant can be passed which will let the kernel grab a reference to the root of nsfs internally and thus decode the file handle. Namespaces file descriptors can already be derived from pidfds which means they aren't subject to overmount protection bugs. IOW, it's irrelevant if the caller would not have access to an appropriate /proc/<pid>/ns/ directory as they could always just derive the namespace based on a pidfd already. It has the same advantage as pidfds. It's possible to reliably and for the lifetime of the system refer to a namespace without pinning any resources and to compare them trivially. Permission checking is kept simple. If the caller is located in the namespace the file handle refers to they are able to open it otherwise they must hold privilege over the owning namespace of the relevant namespace. The namespace file handle layout is exposed as uapi and has a stable and extensible format. For now it simply contains the namespace identifier, the namespace type, and the inode number. The stable format means that userspace may construct its own namespace file handles without going through name_to_handle_at() as they are already allowed for pidfs and cgroup file handles" * tag 'namespace-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (65 commits) ns: drop assert ns: move ns type into struct ns_common nstree: make struct ns_tree private ns: add ns_debug() ns: simplify ns_common_init() further cgroup: add missing ns_common include ns: use inode initializer for initial namespaces selftests/namespaces: verify initial namespace inode numbers ns: rename to __ns_ref nsfs: port to ns_ref_*() helpers net: port to ns_ref_*() helpers uts: port to ns_ref_*() helpers ipv4: use check_net() net: use check_net() net-sysfs: use check_net() user: port to ns_ref_*() helpers time: port to ns_ref_*() helpers pid: port to ns_ref_*() helpers ipc: port to ns_ref_*() helpers cgroup: port to ns_ref_*() helpers ...
2025-09-29Merge tag 'kernel-6.18-rc1.clone3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull copy_process updates from Christian Brauner: "This contains the changes to enable support for clone3() on nios2 which apparently is still a thing. The more exciting part of this is that it cleans up the inconsistency in how the 64-bit flag argument is passed from copy_process() into the various other copy_*() helpers" [ Fixed up rv ltl_monitor 32-bit support as per Sasha Levin in the merge ] * tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nios2: implement architecture-specific portion of sys_clone3 arch: copy_thread: pass clone_flags as u64 copy_process: pass clone_flags as u64 across calltree copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
2025-09-29Merge tag 'vfs-6.18-rc1.pidfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfs updates from Christian Brauner: "This just contains a few changes to pid_nr_ns() to make it more robust and cleans up or improves a few users that ab- or misuse it currently" * tag 'vfs-6.18-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: pid: change task_state() to use task_ppid_nr_ns() pid: change bacct_add_tsk() to use task_ppid_nr_ns() pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers pid: Add a judgment for ns null in pid_nr_ns
2025-09-29Merge tag 'vfs-6.18-rc1.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual selections of misc updates for this cycle. Features: - Add "initramfs_options" parameter to set initramfs mount options. This allows to add specific mount options to the rootfs to e.g., limit the memory size - Add RWF_NOSIGNAL flag for pwritev2() Add RWF_NOSIGNAL flag for pwritev2. This flag prevents the SIGPIPE signal from being raised when writing on disconnected pipes or sockets. The flag is handled directly by the pipe filesystem and converted to the existing MSG_NOSIGNAL flag for sockets - Allow to pass pid namespace as procfs mount option Ever since the introduction of pid namespaces, procfs has had very implicit behaviour surrounding them (the pidns used by a procfs mount is auto-selected based on the mounting process's active pidns, and the pidns itself is basically hidden once the mount has been constructed) This implicit behaviour has historically meant that userspace was required to do some special dances in order to configure the pidns of a procfs mount as desired. Examples include: * In order to bypass the mnt_too_revealing() check, Kubernetes creates a procfs mount from an empty pidns so that user namespaced containers can be nested (without this, the nested containers would fail to mount procfs) But this requires forking off a helper process because you cannot just one-shot this using mount(2) * Container runtimes in general need to fork into a container before configuring its mounts, which can lead to security issues in the case of shared-pidns containers (a privileged process in the pidns can interact with your container runtime process) While SUID_DUMP_DISABLE and user namespaces make this less of an issue, the strict need for this due to a minor uAPI wart is kind of unfortunate Things would be much easier if there was a way for userspace to just specify the pidns they want. So this pull request contains changes to implement a new "pidns" argument which can be set using fsconfig(2): fsconfig(procfd, FSCONFIG_SET_FD, "pidns", NULL, nsfd); fsconfig(procfd, FSCONFIG_SET_STRING, "pidns", "/proc/self/ns/pid", 0); or classic mount(2) / mount(8): // mount -t proc -o pidns=/proc/self/ns/pid proc /tmp/proc mount("proc", "/tmp/proc", "proc", MS_..., "pidns=/proc/self/ns/pid"); Cleanups: - Remove the last references to EXPORT_OP_ASYNC_LOCK - Make file_remove_privs_flags() static - Remove redundant __GFP_NOWARN when GFP_NOWAIT is used - Use try_cmpxchg() in start_dir_add() - Use try_cmpxchg() in sb_init_done_wq() - Replace offsetof() with struct_size() in ioctl_file_dedupe_range() - Remove vfs_ioctl() export - Replace rwlock() with spinlock in epoll code as rwlock causes priority inversion on preempt rt kernels - Make ns_entries in fs/proc/namespaces const - Use a switch() statement() in init_special_inode() just like we do in may_open() - Use struct_size() in dir_add() in the initramfs code - Use str_plural() in rd_load_image() - Replace strcpy() with strscpy() in find_link() - Rename generic_delete_inode() to inode_just_drop() and generic_drop_inode() to inode_generic_drop() - Remove unused arguments from fcntl_{g,s}et_rw_hint() Fixes: - Document @name parameter for name_contains_dotdot() helper - Fix spelling mistake - Always return zero from replace_fd() instead of the file descriptor number - Limit the size for copy_file_range() in compat mode to prevent a signed overflow - Fix debugfs mount options not being applied - Verify the inode mode when loading it from disk in minixfs - Verify the inode mode when loading it from disk in cramfs - Don't trigger automounts with RESOLVE_NO_XDEV If openat2() was called with RESOLVE_NO_XDEV it didn't traverse through automounts, but could still trigger them - Add FL_RECLAIM flag to show_fl_flags() macro so it appears in tracepoints - Fix unused variable warning in rd_load_image() on s390 - Make INITRAMFS_PRESERVE_MTIME depend on BLK_DEV_INITRD - Use ns_capable_noaudit() when determining net sysctl permissions - Don't call path_put() under namespace semaphore in listmount() and statmount()" * tag 'vfs-6.18-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (38 commits) fcntl: trim arguments listmount: don't call path_put() under namespace semaphore statmount: don't call path_put() under namespace semaphore pid: use ns_capable_noaudit() when determining net sysctl permissions fs: rename generic_delete_inode() and generic_drop_inode() init: INITRAMFS_PRESERVE_MTIME should depend on BLK_DEV_INITRD initramfs: Replace strcpy() with strscpy() in find_link() initrd: Use str_plural() in rd_load_image() initramfs: Use struct_size() helper to improve dir_add() initrd: Fix unused variable warning in rd_load_image() on s390 fs: use the switch statement in init_special_inode() fs/proc/namespaces: make ns_entries const filelock: add FL_RECLAIM to show_fl_flags() macro eventpoll: Replace rwlock with spinlock selftests/proc: add tests for new pidns APIs procfs: add "pidns" mount option pidns: move is-ancestor logic to helper openat2: don't trigger automounts with RESOLVE_NO_XDEV namei: move cross-device check to __traverse_mounts namei: remove LOOKUP_NO_XDEV check from handle_mounts ...
2025-09-29Merge branches 'pm-core', 'pm-runtime' and 'pm-sleep'Rafael J. Wysocki
Merge changes related to system sleep and runtime PM framework for 6.18-rc1: - Annotate loops walking device links in the power management core code as _srcu and add macros for walking device links to reduce the likelihood of coding mistakes related to them (Rafael Wysocki) - Document time units for *_time functions in the runtime PM API (Brian Norris) - Clear power.must_resume in noirq suspend error path to avoid resuming a dependant device under a suspended parent or supplier (Rafael Wysocki) - Fix GFP mask handling during hybrid suspend and make the amdgpu driver handle hybrid suspend correctly (Mario Limonciello, Rafael Wysocki) - Fix GFP mask handling after aborted hibernation in platform mode and combine exit paths in power_down() to avoid code duplication (Rafael Wysocki) - Use vmalloc_array() and vcalloc() in the hibernation core to avoid open-coded size computations (Qianfeng Rong) - Fix typo in hibernation core code comment (Li Jun) - Call pm_wakeup_clear() in the same place where other functions that do bookkeeping prior to suspend_prepare() are called (Samuel Wu) * pm-core: PM: core: Add two macros for walking device links PM: core: Annotate loops walking device links as _srcu * pm-runtime: PM: runtime: Documentation: ABI: Document time units for *_time * pm-sleep: PM: hibernate: Combine return paths in power_down() PM: hibernate: Restrict GFP mask in power_down() PM: hibernate: Fix pm_hibernation_mode_is_suspend() build breakage drm/amd: Fix hybrid sleep PM: hibernate: Add pm_hibernation_mode_is_suspend() PM: hibernate: Fix hybrid-sleep PM: sleep: core: Clear power.must_resume in noirq suspend error path PM: sleep: Make pm_wakeup_clear() call more clear PM: hibernate: Fix typo in memory bitmaps description comment PM: hibernate: Use vmalloc_array() and vcalloc() to improve code
2025-09-29Merge branches 'pm-em', 'pm-opp' and 'pm-devfreq'Rafael J. Wysocki
Merge energy model management, OPP (operating performance points) and devfreq updates for 6.18-rc1: - Prevent CPU capacity updates after registering a perf domain from failing on a first CPU that is not present (Christian Loehle) - Add support for the cases in which frequency alone is not sufficient to uniquely identify an OPP (Krishna Chaitanya Chundru) - Use to_result() for OPP error handling in Rust (Onur Özkan) - Add support for LPDDR5 on Rockhip RK3588 SoC to rockchip-dfi devfreq driver (Nicolas Frattaroli) - Fix an issue where DDR cycle counts on RK3588/RK3528 with LPDDR4(X) are reported as half by adding a cycle multiplier to the DFI driver in rockchip-dfi devfreq-event driver (Nicolas Frattaroli) - Fix missing error pointer dereference check of regulator instance in the mtk-cci devfreq driver probe and remove a redundant condition from an if () statement in that driver (Dan Carpenter, Liao Yuanhong) * pm-em: PM: EM: Fix late boot with holes in CPU topology * pm-opp: OPP: Add support to find OPP for a set of keys rust: opp: use to_result for error handling * pm-devfreq: PM / devfreq: rockchip-dfi: add support for LPDDR5 PM / devfreq: rockchip-dfi: double count on RK3588 PM / devfreq: mtk-cci: avoid redundant conditions PM / devfreq: mtk-cci: Fix potential error pointer dereference in probe()
2025-09-29mm: Allow GFP_ACCOUNT to be used in alloc_pages_nolock().Alexei Starovoitov
Change alloc_pages_nolock() to default to __GFP_COMP when allocating pages, since upcoming reentrant alloc_slab_page() needs __GFP_COMP. Also allow __GFP_ACCOUNT flag to be specified, since most of BPF infra needs __GFP_ACCOUNT except BPF streams. Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29locking/local_lock: Introduce local_lock_is_locked().Alexei Starovoitov
Introduce local_lock_is_locked() that returns true when given local_lock is locked by current cpu (in !PREEMPT_RT) or by current task (in PREEMPT_RT). The goal is to detect a deadlock by the caller. Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-28kallsyms: use kmalloc_array() instead of kmalloc()Sahil Chandna
Replace kmalloc(sizeof(*stat) * 2, GFP_KERNEL) with kmalloc_array(2, sizeof(*stat), GFP_KERNEL) to prevent potential overflow, as recommended in Documentation/process/deprecated.rst. Link: https://lkml.kernel.org/r/20250926075053.25615-1-chandna.linuxkernel@gmail.com Signed-off-by: Sahil Chandna <chandna.linuxkernel@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: David Hunter <david.hunter.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-28panic: remove CONFIG_PANIC_ON_OOPS_VALUEJohannes Berg
There's really no need for this since it's 0 or 1 when CONFIG_PANIC_ON_OOPS is disabled/enabled, so just use IS_ENABLED() instead. The extra symbol goes back to the original code adding it in commit 2a01bb3885c9 ("panic: Make panic_on_oops configurable"). Link: https://lkml.kernel.org/r/20250924094303.18521-2-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-28kernel: prevent prctl(PR_SET_PDEATHSIG) from racing with parent process exitDemi Marie Obenour
If a process calls prctl(PR_SET_PDEATHSIG) at the same time that the parent process exits, the child will write to me->pdeath_sig at the same time the parent is reading it. Since there is no synchronization, this is a data race. Worse, it is possible that a subsequent call to getppid() can continue to return the previous parent process ID without the parent death signal being delivered. This happens in the following scenario: parent child forget_original_parent() prctl(PR_SET_PDEATHSIG, SIGKILL) sys_prctl() me->pdeath_sig = SIGKILL; getppid(); RCU_INIT_POINTER(t->real_parent, reaper); if (t->pdeath_signal) /* reads stale me->pdeath_sig */ group_send_sig_info(t->pdeath_signal, ...); And in the following: parent child forget_original_parent() RCU_INIT_POINTER(t->real_parent, reaper); /* also no barrier */ if (t->pdeath_signal) /* reads stale me->pdeath_sig */ group_send_sig_info(t->pdeath_signal, ...); prctl(PR_SET_PDEATHSIG, SIGKILL) sys_prctl() me->pdeath_sig = SIGKILL; getppid(); /* reads old ppid() */ As a result, the following pattern is racy: pid_t parent_pid = getpid(); pid_t child_pid = fork(); if (child_pid == -1) { /* handle error... */ return; } if (child_pid == 0) { if (prctl(PR_SET_PDEATHSIG, SIGKILL) != 0) { /* handle error */ _exit(126); } if (getppid() != parent_pid) { /* parent died already */ raise(SIGKILL); } /* keep going in child */ } /* keep going in parent */ If the parent is killed at exactly the wrong time, the child process can (wrongly) stay running. I didn't manage to reproduce this in my testing, but I'm pretty sure the race is real. KCSAN is probably the best way to spot the race. Fix the bug by holding tasklist_lock for reading whenever pdeath_signal is being written to. This prevents races on me->pdeath_sig, and the locking and unlocking of the rwlock provide the needed memory barriers. If prctl(PR_SET_PDEATHSIG) happens before the parent exits, the signal will be sent. If it happens afterwards, a subsequent getppid() will return the new value. Link: https://lkml.kernel.org/r/20250913-fix-prctl-pdeathsig-race-v1-1-44e2eb426fe9@gmail.com Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-28kho: only fill kimage if KHO is finalizedPratyush Yadav
kho_fill_kimage() only checks for KHO being enabled before filling in the FDT to the image. KHO being enabled does not mean that the kernel has data to hand over. That happens when KHO is finalized. When a kexec is done with KHO enabled but not finalized, the FDT page is allocated but not initialized. FDT initialization happens after finalize. This means the KHO segment is filled in but the FDT contains garbage data. This leads to the below error messages in the next kernel: [ 0.000000] KHO: setup: handover FDT (0x10116b000) is invalid: -9 [ 0.000000] KHO: disabling KHO revival: -22 There is no problem in practice, and the next kernel boots and works fine. But this still leads to misleading error messages and garbage being handed over. Only fill in KHO segment when KHO is finalized. When KHO is not enabled, the debugfs interface is not created and there is no way to finalize it anyway. So the check for kho_enable is not needed, and kho_out.finalize alone is enough. Link: https://lkml.kernel.org/r/20250918170617.91413-1-pratyush@kernel.org Fixes: 3bdecc3c93f9 ("kexec: add KHO support to kexec file loads") Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-28Merge tag 'trace-v6.17-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix buffer overflow in osnoise_cpu_write() The allocated buffer to read user space did not add a nul terminating byte after copying from user the string. It then reads the string, and if user space did not add a nul byte, the read will continue beyond the string. Add a nul terminating byte after reading the string. - Fix missing check for lockdown on tracing There's a path from kprobe events or uprobe events that can update the tracing system even if lockdown on tracing is activate. Add a check in the dynamic event path. - Add a recursion check for the function graph return path Now that fprobes can hook to the function graph tracer and call different code between the entry and the exit, the exit code may now call functions that are not called in entry. This means that the exit handler can possibly trigger recursion that is not caught and cause the system to crash. Add the same recursion checks in the function exit handler as exists in the entry handler path. * tag 'trace-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: fgraph: Protect return handler from recursion loop tracing: dynevent: Add a missing lockdown check on dynevent tracing/osnoise: Fix slab-out-of-bounds in _parse_integer_limit()
2025-09-27bpf: Enforce expected_attach_type for tailcall compatibilityDaniel Borkmann
Yinhao et al. recently reported: Our fuzzer tool discovered an uninitialized pointer issue in the bpf_prog_test_run_xdp() function within the Linux kernel's BPF subsystem. This leads to a NULL pointer dereference when a BPF program attempts to deference the txq member of struct xdp_buff object. The test initializes two programs of BPF_PROG_TYPE_XDP: progA acts as the entry point for bpf_prog_test_run_xdp() and its expected_attach_type can neither be of be BPF_XDP_DEVMAP nor BPF_XDP_CPUMAP. progA calls into a slot of a tailcall map it owns. progB's expected_attach_type must be BPF_XDP_DEVMAP to pass xdp_is_valid_access() validation. The program returns struct xdp_md's egress_ifindex, and the latter is only allowed to be accessed under mentioned expected_attach_type. progB is then inserted into the tailcall which progA calls. The underlying issue goes beyond XDP though. Another example are programs of type BPF_PROG_TYPE_CGROUP_SOCK_ADDR. sock_addr_is_valid_access() as well as sock_addr_func_proto() have different logic depending on the programs' expected_attach_type. Similarly, a program attached to BPF_CGROUP_INET4_GETPEERNAME should not be allowed doing a tailcall into a program which calls bpf_bind() out of BPF which is only enabled for BPF_CGROUP_INET4_CONNECT. In short, specifying expected_attach_type allows to open up additional functionality or restrictions beyond what the basic bpf_prog_type enables. The use of tailcalls must not violate these constraints. Fix it by enforcing expected_attach_type in __bpf_prog_map_compatible(). Note that we only enforce this for tailcall maps, but not for BPF devmaps or cpumaps: There, the programs are invoked through dev_map_bpf_prog_run*() and cpu_map_bpf_prog_run*() which set up a new environment / context and therefore these situations are not prone to this issue. Fixes: 5e43f899b03a ("bpf: Check attach type at prog load time") Reported-by: Yinhao Hu <dddddd@hust.edu.cn> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn> Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20250926171201.188490-1-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-27tracing: fgraph: Protect return handler from recursion loopMasami Hiramatsu (Google)
function_graph_enter_regs() prevents itself from recursion by ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), which is called at the exit, does not prevent such recursion. Therefore, while it can prevent recursive calls from fgraph_ops::entryfunc(), it is not able to prevent recursive calls to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. This can lead an unexpected recursion bug reported by Menglong. is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> kprobe_multi_link_exit_handler -> is_endbr. To fix this issue, acquire ftrace_test_recursion_trylock() in the __ftrace_return_to_handler() after unwind the shadow stack to mark this section must prevent recursive call of fgraph inside user-defined fgraph_ops::retfunc(). This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer"), because before that fgraph was only used from the function graph tracer. Fprobe allowed user to run any callbacks from fgraph after that commit. Reported-by: Menglong Dong <menglong8.dong@gmail.com> Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") Cc: stable@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/175852292275.307379.9040117316112640553.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Menglong Dong <menglong8.dong@gmail.com> Acked-by: Menglong Dong <menglong8.dong@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-09-26Merge tag 'sched-urgent-2025-09-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Fix two dl_server regressions: a race that can end up leaving the dl_server stuck, and a dl_server throttling bug causing lag to fair tasks" * tag 'sched-urgent-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix dl_server behaviour sched/deadline: Fix dl_server getting stuck
2025-09-26Merge tag 'locking-urgent-2025-09-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Fix a PI-futexes race, and fix a copy_process() futex cleanup bug" * tag 'locking-urgent-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Use correct exit on failure from futex_hash_allocate_default() futex: Prevent use-after-free during requeue-PI
2025-09-26PM: hibernate: Combine return paths in power_down()Rafael J. Wysocki
To avoid code duplication and improve clarity, combine the code paths in power_down() leading to a return from that function. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/3571055.QJadu78ljV@rafael.j.wysocki [ rjw: Changed the new label name to "exit" ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-26PM: hibernate: Restrict GFP mask in power_down()Rafael J. Wysocki
Commit 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence") caused hibernation_platform_enter() to call pm_restore_gfp_mask() via dpm_resume_end(), so when power_down() returns after aborting hibernation_platform_enter(), it needs to match the pm_restore_gfp_mask() call in hibernate() that will occur subsequently. Address this by adding a pm_restrict_gfp_mask() call to the relevant error path in power_down(). Fixes: 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence") Cc: 6.16+ <stable@vger.kernel.org> # 6.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-25bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACETao Chen
The stacktrace map can be easily full, which will lead to failure in obtaining the stack. In addition to increasing the size of the map, another solution is to delete the stack_id after looking it up from the user, so extend the existing bpf_map_lookup_and_delete_elem() functionality to stacktrace map types. Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250925175030.1615837-1-chen.dylane@linux.dev
2025-09-25PM: hibernate: Add pm_hibernation_mode_is_suspend()Mario Limonciello (AMD)
Some drivers have different flows for hibernation and suspend. If the driver opportunistically will skip thaw() then it needs a hint to know what is happening after the hibernate. Introduce a new symbol pm_hibernation_mode_is_suspend() that drivers can call to determine if suspending the system for this purpose. Tested-by: Ionut Nechita <ionut_n2001@yahoo.com> Tested-by: Kenneth Crudup <kenny@panix.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-25PM: hibernate: Fix hybrid-sleepMario Limonciello (AMD)
Hybrid sleep will hibernate the system followed by running through the suspend routine. Since both the hibernate and the suspend routine will call pm_restrict_gfp_mask(), pm_restore_gfp_mask() must be called before starting the suspend sequence. Add an explicit call to pm_restore_gfp_mask() to power_down() before the suspend sequence starts. Add an extra call for pm_restrict_gfp_mask() when exiting suspend so that the pm_restore_gfp_mask() call in hibernate() is balanced. Reported-by: Ionut Nechita <ionut_n2001@yahoo.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4573 Tested-by: Ionut Nechita <ionut_n2001@yahoo.com> Fixes: 12ffc3b1513eb ("PM: Restrict swap use to later in the suspend sequence") Tested-by: Kenneth Crudup <kenny@panix.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/20250925185108.2968494-2-superm1@kernel.org [ rjw: Add comment explainig the new pm_restrict_gfp_mask() call purpose ] Cc: 6.16+ <stable@vger.kernel.org> # 6.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc8). Conflicts: drivers/net/can/spi/hi311x.c 6b6968084721 ("can: hi311x: fix null pointer dereference when resuming from sleep before interface was enabled") 27ce71e1ce81 ("net: WQ_PERCPU added to alloc_workqueue users") https://lore.kernel.org/72ce7599-1b5b-464a-a5de-228ff9724701@kernel.org net/smc/smc_loopback.c drivers/dibs/dibs_loopback.c a35c04de2565 ("net/smc: fix warning in smc_rx_splice() when calling get_page()") cc21191b584c ("dibs: Move data path to dibs layer") https://lore.kernel.org/74368a5c-48ac-4f8e-a198-40ec1ed3cf5f@kernel.org Adjacent changes: drivers/net/dsa/lantiq/lantiq_gswip.c c0054b25e2f1 ("net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()") 7a1eaef0a791 ("net: dsa: lantiq_gswip: support model-specific mac_select_pcs()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>