summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2025-04-19Merge tag 'mm-hotfixes-stable-2025-04-19-21-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "16 hotfixes. 2 are cc:stable and the remainder address post-6.14 issues or aren't considered necessary for -stable kernels. All patches are basically for MM although five are alterations to MAINTAINERS" [ Basic counting skills are clearly not a strictly necessary requirement for kernel maintainers. - Linus ] * tag 'mm-hotfixes-stable-2025-04-19-21-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: add section for locking of mm's and VMAs mm: vmscan: fix kswapd exit condition in defrag_mode mm: vmscan: restore high-cpu watermark safety in kswapd MAINTAINERS: add Pedro as reviewer to the MEMORY MAPPING section mm/memory: move sanity checks in do_wp_page() after mapcount vs. refcount stabilization mm, hugetlb: increment the number of pages to be reset on HVO writeback: fix false warning in inode_to_wb() docs: ABI: replace mcroce@microsoft.com with new Meta address mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() MAINTAINERS: add memory advice section MAINTAINERS: add mmap trace events to MEMORY MAPPING mm: memcontrol: fix swap counter leak from offline cgroup MAINTAINERS: add MM subsection for the page allocator MAINTAINERS: update SLAB ALLOCATOR maintainers fs/dax: fix folio splitting issue by resetting old folio order + _nr_pages mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page()
2025-04-19Merge tag 'vfs-6.15-rc3.fixes.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Revert the hfs{plus} deprecation warning that's also included in this pull request. The commit introducing the deprecation warning resides rather early in this branch. So simply dropping it would've rebased all other commits which I decided to avoid. Hence the revert in the same branch [ Background - the deprecation warning discussion resulted in people stepping up, and so hfs{plus} will have a maintainer taking care of it after all.. - Linus ] - Switch CONFIG_SYSFS_SYCALL default to n and decouple from CONFIG_EXPERT - Fix an audit bug caused by changes to our kernel path lookup helpers this cycle. Audit needs the parent path even if the dentry it tried to look up is negative - Ensure that the kernel path lookup helpers leave the passed in path argument clean when they return an error. This is consistent with all our other helpers - Ensure that vfs_getattr_nosec() calls bdev_statx() so the relevant information is available to kernel consumers as well - Don't set a timer and call schedule() if the timer will expire immediately in epoll - Make netfs lookup tables with __nonstring * tag 'vfs-6.15-rc3.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: Revert "hfs{plus}: add deprecation warning" fs: move the bdex_statx call to vfs_getattr_nosec netfs: Mark __nonstring lookup tables eventpoll: Set epoll timeout if it's in the future fs: ensure that *path_locked*() helpers leave passed path pristine fs: add kern_path_locked_negative() hfs{plus}: add deprecation warning Kconfig: switch CONFIG_SYSFS_SYCALL default to n
2025-04-19Merge tag 'nfsd-6.15-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - v6.15 libcrc clean-up makes invalid configurations possible - Fix a potential deadlock introduced during the v6.15 merge window * tag 'nfsd-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: decrease sc_count directly if fail to queue dl_recall nfs: add missing selections of CONFIG_CRC32
2025-04-19Merge tag 'drm-fixes-2025-04-19' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Easter rc3 pull request, fixes in all the usuals, amdgpu, xe, msm, with some i915/ivpu/mgag200/v3d fixes, then a couple of bits in dma-buf/gem. Hopefully has no easter eggs in it. dma-buf: - Correctly decrement refcounter on errors gem: - Fix test for imported buffers amdgpu: - Cleaner shader sysfs fix - Suspend fix - Fix doorbell free ordering - Video caps fix - DML2 memory allocation optimization - HDP fix i915: - Fix DP DSC configurations that require 3 DSC engines per pipe xe: - Fix LRC address being written too late for GuC - Fix notifier vs folio deadlock - Fix race betwen dma_buf unmap and vram eviction - Fix debugfs handling PXP terminations unconditionally msm: - Display: - Fix to call dpu_plane_atomic_check_pipe() for both SSPPs in case of multi-rect - Fix to validate plane_state pointer before using it in dpu_plane_virtual_atomic_check() - Fix to make sure dereferencing dpu_encoder_phys happens after making sure it is valid in _dpu_encoder_trigger_start() - Remove the remaining intr_tear_rd_ptr which we initialized to -1 because NO_IRQ indices start from 0 now - GPU: - Fix IB_SIZE overflow ivpu: - Fix debugging - Fixes to frequency - Support firmware API 3.28.3 - Flush jobs upon reset mgag200: - Set vblank start to correct values v3d: - Fix Indirect Dispatch" * tag 'drm-fixes-2025-04-19' of https://gitlab.freedesktop.org/drm/kernel: (26 commits) drm/msm/a6xx+: Don't let IB_SIZE overflow drm/xe/pxp: do not queue unneeded terminations from debugfs drm/xe/dma_buf: stop relying on placement in unmap drm/xe/userptr: fix notifier vs folio deadlock drm/xe: Set LRC addresses before guc load drm/mgag200: Fix value in <VBLKSTR> register drm/gem: Internally test import_attach for imported objects drm/amdgpu: Use the right function for hdp flush drm/amd/display/dml2: use vzalloc rather than kzalloc drm/amdgpu: Add back JPEG to video caps for carrizo and newer drm/amdgpu: fix warning of drm_mm_clean drm/amd: Forbid suspending into non-default suspend states drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4 drm/i915/dp: Check for HAS_DSC_3ENGINES while configuring DSC slices drm/i915/display: Add macro for checking 3 DSC engines dma-buf/sw_sync: Decrement refcount on error in sw_sync_ioctl_get_deadline() accel/ivpu: Add cmdq_id to job related logs accel/ivpu: Show NPU frequency in sysfs accel/ivpu: Fix the NPU's DPU frequency calculation accel/ivpu: Update FW Boot API to version 3.28.3 ...
2025-04-18Merge tag 'irq-urgent-2025-04-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc irq fixes from Ingo Molnar: - Fix BCM2712 irqchip driver Kconfig dependencies required on the Raspberry PI5 - Fix spurious interrupts on RZ/G3E SMARC EVK systems - Fix crash regression on Sun/NIU hardware - Apply MSI driver quirk for Sun Neptune chips * tag 'irq-urgent-2025-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-bcm2712-mip: Enable driver when ARCH_BCM2835 is enabled irqchip/renesas-rzv2h: Prevent TINT spurious interrupt net/niu: Niu requires MSIX ENTRY_DATA fields touch before entry reads PCI/MSI: Add an option to write MSIX ENTRY_DATA before any reads
2025-04-18cxl: Fix devm host device for CXL fwctl initializationDave Jiang
Testing revealed the following error message for a CXL memdev that has Feature support: [ 56.690430] cxl mem0: Resources present before probing Attach the allocation of cxl_fwctl to the parent device of cxl_memdev. devm_add_* calls for cxl_memdev should not happen before the memdev probe function or outside the scope of the memdev driver. cxl_test missed this bug because cxl_test always arranges for the cxl_mem driver to be loaded before cxl_mock_mem runs. So the driver core always finds the devres list idle in that case. [DJ: Updated subject title and added commit log suggestion from djbw] Fixes: 858ce2f56b52 ("cxl: Add FWCTL support to CXL") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/linux-cxl/6801aea053466_71fe2944c@dwillia2-xfh.jf.intel.com.notmuch/ Link: https://patch.msgid.link/20250418002933.406439-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-18Merge tag 'io_uring-6.15-20250418' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Correctly cap iov_iter->nr_segs for imports of registered buffers, both kbuf and normal ones. Three cleanups to make it saner first, then two fixes for each of the buffer types. This fixes a performance regression where partial buffer usage doesn't trim the tail number of segments, leading the block layer to iterate the IOs to check if it needs splitting. - Two patches tweaking the newly introduced zero-copy rx API, mostly to keep the API consistent once we add multiple interface queues per ring support in the 6.16 release. - zc rx unmapping fix for a dead device * tag 'io_uring-6.15-20250418' of git://git.kernel.dk/linux: io_uring/zcrx: fix late dma unmap for a dead dev io_uring/rsrc: ensure segments counts are correct on kbuf buffers io_uring/rsrc: send exact nr_segs for fixed buffer io_uring/rsrc: refactor io_import_fixed io_uring/rsrc: separate kbuf offset adjustments io_uring/rsrc: don't skip offset calculation io_uring/zcrx: add pp to ifq conversion helper io_uring/zcrx: return ifq id to the user
2025-04-18virtgpu: don't reset on shutdownMichael S. Tsirkin
It looks like GPUs are used after shutdown is invoked. Thus, breaking virtio gpu in the shutdown callback is not a good idea - guest hangs attempting to finish console drawing, with these warnings: [ 20.504464] WARNING: CPU: 0 PID: 568 at drivers/gpu/drm/virtio/virtgpu_vq.c:358 virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu] [ 20.505685] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency_common nfit libnvdimm kvm_intel kvm rapl iTCO_wdt iTCO_vendor_support virtio_gpu virtio_dma_buf pcspkr drm_shmem_helper i2c_i801 drm_kms_helper lpc_ich i2c_smbus virtio_balloon joydev drm fuse xfs libcrc32c ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata virtio_net ghash_clmulni_intel net_failover virtio_blk failover serio_raw dm_mirror dm_region_hash dm_log dm_mod [ 20.511847] CPU: 0 PID: 568 Comm: kworker/0:3 Kdump: loaded Tainted: G W ------- --- 5.14.0-578.6675_1757216455.el9.x86_64 #1 [ 20.513157] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20241117-3.el9 11/17/2024 [ 20.513918] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] [ 20.514626] RIP: 0010:virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu] [ 20.515332] Code: 00 00 48 85 c0 74 0c 48 8b 78 08 48 89 ee e8 51 50 00 00 65 ff 0d 42 e3 74 3f 0f 85 69 ff ff ff 0f 1f 44 00 00 e9 5f ff ff ff <0f> 0b e9 3f ff ff ff 48 83 3c 24 00 74 0e 49 8b 7f 40 48 85 ff 74 [ 20.517272] RSP: 0018:ff34f0a8c0787ad8 EFLAGS: 00010282 [ 20.517820] RAX: 00000000fffffffb RBX: 0000000000000000 RCX: 0000000000000820 [ 20.518565] RDX: 0000000000000000 RSI: ff34f0a8c0787be0 RDI: ff218bef03a26300 [ 20.519308] RBP: ff218bef03a26300 R08: 0000000000000001 R09: ff218bef07224360 [ 20.520059] R10: 0000000000008dc0 R11: 0000000000000002 R12: ff218bef02630028 [ 20.520806] R13: ff218bef0263fb48 R14: ff218bef00cb8000 R15: ff218bef07224360 [ 20.521555] FS: 0000000000000000(0000) GS:ff218bef7ba00000(0000) knlGS:0000000000000000 [ 20.522397] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 20.522996] CR2: 000055ac4f7871c0 CR3: 000000010b9f2002 CR4: 0000000000771ef0 [ 20.523740] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 20.524477] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 20.525223] PKRU: 55555554 [ 20.525515] Call Trace: [ 20.525777] <TASK> [ 20.526003] ? show_trace_log_lvl+0x1c4/0x2df [ 20.526464] ? show_trace_log_lvl+0x1c4/0x2df [ 20.526925] ? virtio_gpu_queue_fenced_ctrl_buffer+0x82/0x2c0 [virtio_gpu] [ 20.527643] ? virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu] [ 20.528282] ? __warn+0x7e/0xd0 [ 20.528621] ? virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu] [ 20.529256] ? report_bug+0x100/0x140 [ 20.529643] ? handle_bug+0x3c/0x70 [ 20.530010] ? exc_invalid_op+0x14/0x70 [ 20.530421] ? asm_exc_invalid_op+0x16/0x20 [ 20.530862] ? virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu] [ 20.531506] ? virtio_gpu_queue_ctrl_sgs+0x174/0x290 [virtio_gpu] [ 20.532148] virtio_gpu_queue_fenced_ctrl_buffer+0x82/0x2c0 [virtio_gpu] [ 20.532843] virtio_gpu_primary_plane_update+0x3e2/0x460 [virtio_gpu] [ 20.533520] drm_atomic_helper_commit_planes+0x108/0x320 [drm_kms_helper] [ 20.534233] drm_atomic_helper_commit_tail+0x45/0x80 [drm_kms_helper] [ 20.534914] commit_tail+0xd2/0x130 [drm_kms_helper] [ 20.535446] drm_atomic_helper_commit+0x11b/0x140 [drm_kms_helper] [ 20.536097] drm_atomic_commit+0xa4/0xe0 [drm] [ 20.536588] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [ 20.537162] drm_atomic_helper_dirtyfb+0x192/0x270 [drm_kms_helper] [ 20.537823] drm_fbdev_shmem_helper_fb_dirty+0x43/0xa0 [drm_shmem_helper] [ 20.538536] drm_fb_helper_damage_work+0x87/0x160 [drm_kms_helper] [ 20.539188] process_one_work+0x194/0x380 [ 20.539612] worker_thread+0x2fe/0x410 [ 20.540007] ? __pfx_worker_thread+0x10/0x10 [ 20.540456] kthread+0xdd/0x100 [ 20.540791] ? __pfx_kthread+0x10/0x10 [ 20.541190] ret_from_fork+0x29/0x50 [ 20.541566] </TASK> [ 20.541802] ---[ end trace 0000000000000000 ]--- It looks like the shutdown is called in the middle of console drawing, so we should either wait for it to finish, or let drm handle the shutdown. This patch implements this second option: Add an option for drivers to bypass the common break+reset handling. As DRM is careful to flush/synchronize outstanding buffers, it looks like GPU can just have a NOP there. Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Fixes: 8bd2fa086a04 ("virtio: break and reset virtio devices on device_shutdown()") Cc: Eric Auger <eauger@redhat.com> Cc: Jocelyn Falempe <jfalempe@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <8490dbeb6f79ed039e6c11d121002618972538a3.1744293540.git.mst@redhat.com>
2025-04-17mm: vmscan: restore high-cpu watermark safety in kswapdJohannes Weiner
Vlastimil points out that commit a211c6550efc ("mm: page_alloc: defrag_mode kswapd/kcompactd watermarks") switched kswapd from zone_watermark_ok_safe() to the standard, percpu-cached version of reading free pages, thus dropping the watermark safety precautions for systems with high CPU counts (e.g. >212 cpus on 64G). Restore them. Since zone_watermark_ok_safe() is no longer the right interface, and this was the last caller of the function anyway, open-code the zone_page_state_snapshot() conditional and delete the function. Link: https://lkml.kernel.org/r/20250416135142.778933-2-hannes@cmpxchg.org Fixes: a211c6550efc ("mm: page_alloc: defrag_mode kswapd/kcompactd watermarks") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Brendan Jackman <jackmanb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-17writeback: fix false warning in inode_to_wb()Andreas Gruenbacher
inode_to_wb() is used also for filesystems that don't support cgroup writeback. For these filesystems inode->i_wb is stable during the lifetime of the inode (it points to bdi->wb) and there's no need to hold locks protecting the inode->i_wb dereference. Improve the warning in inode_to_wb() to not trigger for these filesystems. Link: https://lkml.kernel.org/r/20250412163914.3773459-3-agruenba@redhat.com Fixes: aaa2cacf8184 ("writeback: add lockdep annotation to inode_to_wb()") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-17fs/dax: fix folio splitting issue by resetting old folio order + _nr_pagesDavid Hildenbrand
Alison reports an issue with fsdax when large extends end up using large ZONE_DEVICE folios: [ 417.796271] BUG: kernel NULL pointer dereference, address: 0000000000000b00 [ 417.796982] #PF: supervisor read access in kernel mode [ 417.797540] #PF: error_code(0x0000) - not-present page [ 417.798123] PGD 2a5c5067 P4D 2a5c5067 PUD 2a5c6067 PMD 0 [ 417.798690] Oops: Oops: 0000 [#1] SMP NOPTI [ 417.799178] CPU: 5 UID: 0 PID: 1515 Comm: mmap Tainted: ... [ 417.800150] Tainted: [O]=OOT_MODULE [ 417.800583] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 [ 417.801358] RIP: 0010:__lruvec_stat_mod_folio+0x7e/0x250 [ 417.801948] Code: ... [ 417.803662] RSP: 0000:ffffc90002be3a08 EFLAGS: 00010206 [ 417.804234] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000002 [ 417.804984] RDX: ffffffff815652d7 RSI: 0000000000000000 RDI: ffffffff82a2beae [ 417.805689] RBP: ffffc90002be3a28 R08: 0000000000000000 R09: 0000000000000000 [ 417.806384] R10: ffffea0007000040 R11: ffff888376ffe000 R12: 0000000000000001 [ 417.807099] R13: 0000000000000012 R14: ffff88807fe4ab40 R15: ffff888029210580 [ 417.807801] FS: 00007f339fa7a740(0000) GS:ffff8881fa9b9000(0000) knlGS:0000000000000000 [ 417.808570] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 417.809193] CR2: 0000000000000b00 CR3: 000000002a4f0004 CR4: 0000000000370ef0 [ 417.809925] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 417.810622] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 417.811353] Call Trace: [ 417.811709] <TASK> [ 417.812038] folio_add_file_rmap_ptes+0x143/0x230 [ 417.812566] insert_page_into_pte_locked+0x1ee/0x3c0 [ 417.813132] insert_page+0x78/0xf0 [ 417.813558] vmf_insert_page_mkwrite+0x55/0xa0 [ 417.814088] dax_fault_iter+0x484/0x7b0 [ 417.814542] dax_iomap_pte_fault+0x1ca/0x620 [ 417.815055] dax_iomap_fault+0x39/0x40 [ 417.815499] __xfs_write_fault+0x139/0x380 [ 417.815995] ? __handle_mm_fault+0x5e5/0x1a60 [ 417.816483] xfs_write_fault+0x41/0x50 [ 417.816966] xfs_filemap_fault+0x3b/0xe0 [ 417.817424] __do_fault+0x31/0x180 [ 417.817859] __handle_mm_fault+0xee1/0x1a60 [ 417.818325] ? debug_smp_processor_id+0x17/0x20 [ 417.818844] handle_mm_fault+0xe1/0x2b0 [...] The issue is that when we split a large ZONE_DEVICE folio to order-0 ones, we don't reset the order/_nr_pages. As folio->_nr_pages overlays page[1]->memcg_data, once page[1] is a folio, it suddenly looks like it has folio->memcg_data set. And we never manually initialize folio->memcg_data in fsdax code, because we never expect it to be set at all. When __lruvec_stat_mod_folio() then stumbles over such a folio, it tries to use folio->memcg_data (because it's non-NULL) but it does not actually point at a memcg, resulting in the problem. Alison also observed that these folios sometimes have "locked" set, which is rather concerning (folios locked from the beginning ...). The reason is that the order for large folios is stored in page[1]->flags, which become the folio->flags of a new small folio. Let's fix it by adding a folio helper to clear order/_nr_pages for splitting purposes. Maybe we should reinitialize other large folio flags / folio members as well when splitting, because they might similarly cause harm once page[1] becomes a folio? At least other flags in PAGE_FLAGS_SECOND should not be set for fsdax, so at least page[1]->flags might be as expected with this fix. From a quick glimpse, initializing ->mapping, ->pgmap and ->share should re-initialize most things from a previous page[1] used by large folios that fsdax cares about. For example folio->private might not get reinitialized, but maybe that's not relevant -- no traces of it's use in fsdax code. Needs a closer look. Another thing that should be considered in the future is performing similar checks as we perform in free_tail_page_prepare() -- checking pincount etc. -- when freeing a large fsdax folio. Link: https://lkml.kernel.org/r/20250410091020.119116-1-david@redhat.com Fixes: 4996fc547f5b ("mm: let _folio_nr_pages overlay memcg_data in first tail page") Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Alison Schofield <alison.schofield@intel.com> Closes: https://lkml.kernel.org/r/Z_W9Oeg-D9FhImf3@aschofie-mobl2.lan Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: "Darrick J. Wong" <djwong@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-17mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page()Kirill A. Shutemov
When the last page in the zone is accepted, __accept_page() calls static_branch_dec(). This function takes cpu_hotplug_lock, which can lead to a deadlock if the allocation occurs during CPU bringup path as _cpu_up() also takes the lock. To prevent this deadlock, defer static_branch_dec() to a workqueue. Call static_branch_dec() only when the workqueue is not yet initialized. Workqueues are initialized before CPU bring up, so this will not conflict with the first scenario. Link: https://lkml.kernel.org/r/20250329171030.3942298-1-kirill.shutemov@linux.intel.com Fixes: 55ad43e8ba0f ("mm: add a helper to accept page") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Srikanth Aithal <sraithal@amd.com> Tested-by: Srikanth Aithal <sraithal@amd.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-18Merge tag 'drm-misc-fixes-2025-04-17' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dma-buf: - Correctly decrement refcounter on errors gem: - Fix test for imported buffers ivpu: - Fix debugging - Fixes to frequency - Support firmware API 3.28.3 - Flush jobs upon reset mgag200: - Set vblank start to correct values v3d: - Fix Indirect Dispatch Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250417084043.GA365738@linux.fritz.box
2025-04-17Merge tag 'net-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, CAN and Netfilter. Current release - regressions: - two fixes for the netdev per-instance locking - batman-adv: fix double-hold of meshif when getting enabled Current release - new code bugs: - Bluetooth: increment TX timestamping tskey always for stream sockets - wifi: static analysis and build fixes for the new Intel sub-driver Previous releases - regressions: - net: fib_rules: fix iif / oif matching on L3 master (VRF) device - ipv6: add exception routes to GC list in rt6_insert_exception() - netfilter: conntrack: fix erroneous removal of offload bit - Bluetooth: - fix sending MGMT_EV_DEVICE_FOUND for invalid address - l2cap: process valid commands in too long frame - btnxpuart: Revert baudrate change in nxp_shutdown Previous releases - always broken: - ethtool: fix memory corruption during SFP FW flashing - eth: - hibmcge: fixes for link and MTU handling, pause frames etc - igc: fixes for PTM (PCIe timestamping) - dsa: b53: enable BPDU reception for management port Misc: - fixes for Netlink protocol schemas" * tag 'net-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settings net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for 100Mbps net: ethernet: mtk_eth_soc: reapply mdc divider on reset net: ti: icss-iep: Fix possible NULL pointer dereference for perout request net: ti: icssg-prueth: Fix possible NULL pointer dereference inside emac_xmit_xdp_frame() net: ti: icssg-prueth: Fix kernel warning while bringing down network interface netfilter: conntrack: fix erronous removal of offload bit net: don't try to ops lock uninitialized devs ptp: ocp: fix start time alignment in ptp_ocp_signal_set net: dsa: avoid refcount warnings when ds->ops->tag_8021q_vlan_del() fails net: dsa: free routing table on probe failure net: dsa: clean up FDB, MDB, VLAN entries on unbind net: dsa: mv88e6xxx: fix -ENOENT when deleting VLANs and MST is unsupported net: dsa: mv88e6xxx: avoid unregistering devlink regions which were never registered net: txgbe: fix memory leak in txgbe_probe() error path net: bridge: switchdev: do not notify new brentries as changed net: b53: enable BPDU reception for management port netlink: specs: rt-neigh: prefix struct nfmsg members with ndm netlink: specs: rt-link: adjust mctp attribute naming netlink: specs: rtnetlink: attribute naming corrections ...
2025-04-17Merge tag 'for-linus-fwctl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull fwctl fixes from Jason Gunthorpe: "Three small changes from further build testing: - Don't rely on the userspace uuid.h for the uapi header - Fix sparse warnings in pds - Typo in log message" * tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: fwctl: Fix repeated device word in log message pds_fwctl: Fix type and endian complaints fwctl/cxl: Fix uuid_t usage in uapi
2025-04-17Merge tag 'sound-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. All are device-specific like quirks, new IDs, and other safe (or rather boring) changes" * tag 'sound-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: firmware: cs_dsp: test_bin_error: Fix uninitialized data used as fw version ASoC: codecs: Add of_match_table for aw888081 driver ASoC: fsl: fsl_qmc_audio: Reset audio data pointers on TRIGGER_START event mailmap: Add entry for Srinivas Kandagatla MAINTAINERS: use kernel.org alias ASoC: cs42l43: Reset clamp override on jack removal ALSA: hda/realtek - Fixed ASUS platform headset Mic issue ALSA: hda/cirrus_scodec_test: Don't select dependencies ALSA: azt2320: Replace deprecated strcpy() with strscpy() ASoC: hdmi-codec: use RTD ID instead of DAI ID for ELD entry ASoC: Intel: avs: Constrain path based on BE capabilities ALSA: hda/tas2781: Remove unnecessary NULL check before release_firmware() ASoC: Intel: avs: Fix null-ptr-deref in avs_component_probe() ASoC: fsl_asrc_dma: get codec or cpu dai from backend ASoC: qcom: Fix sc7280 lpass potential buffer overflow ASoC: dwc: always enable/disable i2s irqs ASoC: Intel: sof_sdw: Add quirk for Asus Zenbook S16 ASoC: codecs:lpass-wsa-macro: Fix logic of enabling vi channels ASoC: codecs:lpass-wsa-macro: Fix vi feedback rate
2025-04-17Merge tag 'platform-drivers-x86-v6.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: "Fixes: - amd/pmf: Fix STT limits - asus-laptop: Fix an uninitialized variable - intel_pmc_ipc: Allow building without ACPI - mlxbf-bootctl: Use sysfs_emit_at() in secure_boot_fuse_state_show() - msi-wmi-platform: Add locking to workaround ACPI firmware bug New HW support: - alienware-wmi-wmax: - Extended thermal control support to: - Alienware Area-51m R2 - Alienware m16 R1 - Alienware m16 R2 - Dell G16 7630 - Dell G5 5505 SE - G-Mode support to Alienware m16 R1 - x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data" * tag 'platform-drivers-x86-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: msi-wmi-platform: Workaround a ACPI firmware bug platform/x86: msi-wmi-platform: Rename "data" variable platform/x86: alienware-wmi-wmax: Extend support to more laptops platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1 platform/x86: amd: pmf: Fix STT limits mlxbf-bootctl: use sysfs_emit_at() in secure_boot_fuse_state_show() platform/x86: x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data platform/x86: x86-android-tablets: Add "9v" to Vexia EDU ATLA 10 tablet symbols asus-laptop: Fix an uninitialized variable platform/x86: intel_pmc_ipc: add option to build without ACPI
2025-04-17Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Small drivers fixes, except for ufs which has two large updates, one for exposing the device level feature, which is a new addition to the device spec and the other reworking the exynos driver to fix coherence issues on some android phones" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: megaraid_sas: Driver version update to 07.734.00.00-rc1 scsi: megaraid_sas: Block zero-length ATA VPD inquiry scsi: scsi_transport_srp: Replace min/max nesting with clamp() scsi: ufs: core: Add device level exception support scsi: ufs: core: Rename ufshcd_wb_presrv_usrspc_keep_vcc_on() scsi: smartpqi: Use is_kdump_kernel() to check for kdump scsi: pm80xx: Set phy_attached to zero when device is gone scsi: ufs: exynos: gs101: Put UFS device in reset on .suspend() scsi: ufs: exynos: Move phy calls to .exit() callback scsi: ufs: exynos: Enable PRDT pre-fetching with UFSHCD_CAP_CRYPTO scsi: ufs: exynos: Ensure consistent phy reference counts scsi: ufs: exynos: Disable iocc if dma-coherent property isn't set scsi: ufs: exynos: Move UFS shareability value to drvdata scsi: ufs: exynos: Ensure pre_link() executes before exynos_ufs_phy_init() scsi: iscsi: Fix missing scsi_host_put() in error path scsi: ufs: core: Fix a race condition related to device commands scsi: hisi_sas: Fix I/O errors caused by hardware port ID changes scsi: hisi_sas: Enable force phy when SATA disk directly connected
2025-04-17iommu: Fix two issues in iommu_copy_struct_from_user()Nicolin Chen
In the review for iommu_copy_struct_to_user() helper, Matt pointed out that a NULL pointer should be rejected prior to dereferencing it: https://lore.kernel.org/all/86881827-8E2D-461C-BDA3-FA8FD14C343C@nvidia.com And Alok pointed out a typo at the same time: https://lore.kernel.org/all/480536af-6830-43ce-a327-adbd13dc3f1d@oracle.com Since both issues were copied from iommu_copy_struct_from_user(), fix them first in the current header. Fixes: e9d36c07bb78 ("iommu: Add iommu_copy_struct_from_user helper") Cc: stable@vger.kernel.org Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Matthew R. Ochs <mochs@nvidia.com> Link: https://lore.kernel.org/r/20250414191635.450472-1-nicolinc@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17block: introduce zone capacity helperNaohiro Aota
{bdev,disk}_zone_capacity() takes block_device or gendisk and sector position and returns the zone capacity of the corresponding zone. With that, move disk_nr_zones() and blk_zone_plug_bio() to consolidate them in the same #ifdef block. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-04-17landlock: Update log documentationMickaël Salaün
Fix and improve documentation related to landlock_restrict_self(2)'s flags. Update the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF documentation according to the current semantic. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17landlock: Fix documentation for landlock_restrict_self(2)Mickaël Salaün
Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s flags documentation. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17landlock: Fix documentation for landlock_create_ruleset(2)Mickaël Salaün
Move and fix the flags documentation, and improve formatting. It makes more sense and it eases maintenance to document syscall flags in landlock.h, where they are defined. This is already the case for landlock_restrict_self(2)'s flags. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17fs: move the bdex_statx call to vfs_getattr_nosecChristoph Hellwig
Currently bdex_statx is only called from the very high-level vfs_statx_path function, and thus bypassing it for in-kernel calls to vfs_getattr or vfs_getattr_nosec. This breaks querying the block ѕize of the underlying device in the loop driver and also is a pitfall for any other new kernel caller. Move the call into the lowest level helper to ensure all callers get the right results. Fixes: 2d985f8c6b91 ("vfs: support STATX_DIOALIGN on block devices") Fixes: f4774e92aab8 ("loop: take the file system minimum dio alignment into account") Reported-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/20250417064042.712140-1-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-17drm/gem: Internally test import_attach for imported objectsThomas Zimmermann
Test struct drm_gem_object.import_attach to detect imported objects. During object clenanup, the dma_buf field might be NULL. Testing it in an object's free callback then incorrectly does a cleanup as for native objects. Happens for calls to drm_mode_destroy_dumb_ioctl() that clears the dma_buf field in drm_gem_object_exported_dma_buf_free(). v3: - only test for import_attach (Boris) v2: - use import_attach.dmabuf instead of dma_buf (Christian) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: b57aa47d39e9 ("drm/gem: Test for imported GEM buffers with helper") Reported-by: Andy Yan <andyshrk@163.com> Closes: https://lore.kernel.org/dri-devel/38d09d34.4354.196379aa560.Coremail.andyshrk@163.com/ Tested-by: Andy Yan <andyshrk@163.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Anusha Srivatsa <asrivats@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://lore.kernel.org/r/20250416065820.26076-1-tzimmermann@suse.de
2025-04-17dma-mapping: avoid potential unused data compilation warningMarek Szyprowski
When CONFIG_NEED_DMA_MAP_STATE is not defined, dma-mapping clients might report unused data compilation warnings for dma_unmap_*() calls arguments. Redefine macros for those calls to let compiler to notice that it is okay when the provided arguments are not used. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250415075659.428549-1-m.szyprowski@samsung.com
2025-04-16Merge tag 'mm-hotfixes-stable-2025-04-16-19-59' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "31 hotfixes. 9 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. 22 patches are for MM, 9 are otherwise" * tag 'mm-hotfixes-stable-2025-04-16-19-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits) MAINTAINERS: update HUGETLB reviewers mm: fix apply_to_existing_page_range() selftests/mm: fix compiler -Wmaybe-uninitialized warning alloc_tag: handle incomplete bulk allocations in vm_module_tags_populate mailmap: add entry for Jean-Michel Hautbois mm: (un)track_pfn_copy() fix + doc improvements mm: fix filemap_get_folios_contig returning batches of identical folios mm/hugetlb: add a line break at the end of the format string selftests: mincore: fix tmpfs mincore test failure mm/hugetlb: fix set_max_huge_pages() when there are surplus pages mm/cma: report base address of single range correctly mm: page_alloc: speed up fallbacks in rmqueue_bulk() kunit: slub: add module description mm/kasan: add module decription ucs2_string: add module description zlib: add module description fpga: tests: add module descriptions samples/livepatch: add module descriptions ASN.1: add module description mm/vma: add give_up_on_oom option on modify/merge, use in uffd release ...
2025-04-16Merge tag 'v6.15-p4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Disable ahash request chaining as it causes problems with the sa2ul driver - Fix a couple of bugs in the new scomp stream freeing code - Fix an old caam refcount underflow that is possibly showing up now because of the new parallel self-tests - Fix regression in the tegra driver * tag 'v6.15-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: ahash - Disable request chaining crypto: scomp - Fix wild memory accesses in scomp_free_streams crypto: caam/qi - Fix drv_ctx refcount bug crypto: scomp - Fix null-pointer deref when freeing streams crypto: tegra - Fix IV usage for AES ECB
2025-04-16kernel: globalize lookup_or_create_module_kobject()Shyam Saini
lookup_or_create_module_kobject() is marked as static and __init, to make it global drop static keyword. Since this function can be called from non-init code, use __modinit instead of __init, __modinit marker will make it __init if CONFIG_MODULES is not defined. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Shyam Saini <shyamsaini@linux.microsoft.com> Link: https://lore.kernel.org/r/20250227184930.34163-4-shyamsaini@linux.microsoft.com Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-04-15net: fib_rules: Fix iif / oif matching on L3 master deviceIdo Schimmel
Before commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") it was possible to use FIB rules to match on a L3 domain. This was done by having a FIB rule match on iif / oif being a L3 master device. It worked because prior to the FIB rule lookup the iif / oif fields in the flow structure were reset to the index of the L3 master device to which the input / output device was enslaved to. The above scheme made it impossible to match on the original input / output device. Therefore, cited commit stopped overwriting the iif / oif fields in the flow structure and instead stored the index of the enslaving L3 master device in a new field ('flowi_l3mdev') in the flow structure. While the change enabled new use cases, it broke the original use case of matching on a L3 domain. Fix this by interpreting the iif / oif matching on a L3 master device as a match against the L3 domain. In other words, if the iif / oif in the FIB rule points to a L3 master device, compare the provided index against 'flowi_l3mdev' rather than 'flowi_{i,o}if'. Before cited commit, a FIB rule that matched on 'iif vrf1' would only match incoming traffic from devices enslaved to 'vrf1'. With the proposed change (i.e., comparing against 'flowi_l3mdev'), the rule would also match traffic originating from a socket bound to 'vrf1'. Avoid that by adding a new flow flag ('FLOWI_FLAG_L3MDEV_OIF') that indicates if the L3 domain was derived from the output interface or the input interface (when not set) and take this flag into account when evaluating the FIB rule against the flow structure. Avoid unnecessary checks in the data path by detecting that a rule matches on a L3 master device when the rule is installed and marking it as such. Tested using the following script [1]. Output before 40867d74c374 (v5.4.291): default dev dummy1 table 100 scope link default dev dummy1 table 200 scope link Output after 40867d74c374: default dev dummy1 table 300 scope link default dev dummy1 table 300 scope link Output with this patch: default dev dummy1 table 100 scope link default dev dummy1 table 200 scope link [1] #!/bin/bash ip link add name vrf1 up type vrf table 10 ip link add name dummy1 up master vrf1 type dummy sysctl -wq net.ipv4.conf.all.forwarding=1 sysctl -wq net.ipv4.conf.all.rp_filter=0 ip route add table 100 default dev dummy1 ip route add table 200 default dev dummy1 ip route add table 300 default dev dummy1 ip rule add prio 0 oif vrf1 table 100 ip rule add prio 1 iif vrf1 table 200 ip rule add prio 2 table 300 ip route get 192.0.2.1 oif dummy1 fibmatch ip route get 192.0.2.1 iif dummy1 from 198.51.100.1 fibmatch Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") Reported-by: hanhuihui <hanhuihui5@huawei.com> Closes: https://lore.kernel.org/netdev/ec671c4f821a4d63904d0da15d604b75@huawei.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250414172022.242991-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15device property: Add a note to the fwnode.hAndy Shevchenko
Add a note to the fwnode.h that the header should not be used directly in the leaf drivers, they all should use the higher level APIs and the respective headers. The purpose of this note is to give guidance to driver writers to avoid repeating a common mistake. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20250408095229.1298005-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15io_uring/zcrx: return ifq id to the userPavel Begunkov
IORING_OP_RECV_ZC requests take a zcrx object id via sqe::zcrx_ifq_idx, which binds it to the corresponding if / queue. However, we don't return that id back to the user. It's fine as currently there can be only one zcrx and the user assumes that its id should be 0, but as we'll need multiple zcrx objects in the future let's explicitly pass it back on registration. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8714667d370651962f7d1a169032e5f02682a73e.1744722517.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15fs: add kern_path_locked_negative()Christian Brauner
The audit code relies on the fact that kern_path_locked() returned a path even for a negative dentry. If it doesn't find a valid dentry it immediately calls: audit_find_parent(d_backing_inode(parent_path.dentry)); which assumes that parent_path.dentry is still valid. But it isn't since kern_path_locked() has been changed to path_put() also for a negative dentry. Fix this by adding a helper that implements the required audit semantics and allows us to fix the immediate bleeding. We can find a unified solution for this afterwards. Link: https://lore.kernel.org/20250414-rennt-wimmeln-f186c3a780f1@brauner Fixes: 1c3cb50b58c3 ("VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry") Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-15PCI/MSI: Add an option to write MSIX ENTRY_DATA before any readsJonathan Currier
Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") introduced a readl() from ENTRY_VECTOR_CTRL before the writel() to ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune chips, the NIU module, will cause an error and/or fatal trap if any MSIX table entry is read before the corresponding ENTRY_DATA field is written to. Add an optional early writel() in msix_prepare_msi_desc(). Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") Signed-off-by: Jonathan Currier <dullfire@yahoo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241117234843.19236-2-dullfire@yahoo.com
2025-04-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: - Fix hang in bnxt_re due to miscomputing the budget - Avoid a -Wformat-security message in dev_set_name() - Avoid an unused definition warning in fs.c with some kconfigs - Fix error handling in usnic and remove IS_ERR_OR_NULL() usage - Regression in RXE support foudn by blktests due to missing ODP exclusions - Set the dma_segment_size on HNS so it doesn't corrupt DMA when using very large IOs - Move a INIT_WORK to near when the work is allocated in cm.c to fix a racey crash where work in progress was being init'd - Use __GFP_NOWARN to not dump in kvcalloc() if userspace requests a very big MR * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Remove unusable nq variable RDMA/core: Silence oversized kvmalloc() warning RDMA/cma: Fix workqueue crash in cma_netevent_work_handler RDMA/hns: Fix wrong maximum DMA segment size RDMA/rxe: Fix null pointer dereference in ODP MR check RDMA/mlx5: Fix compilation warning when USER_ACCESS isn't set RDMA/usnic: Fix passing zero to PTR_ERR in usnic_ib_pci_probe() RDMA/ucaps: Avoid format-security warning RDMA/bnxt_re: Fix budget handling of notification queue
2025-04-14Merge tag 'vfs-6.15-rc3.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix NULL pointer dereference in virtiofs - Fix slab OOB access in hfs/hfsplus - Only create /proc/fs/netfs when CONFIG_PROC_FS is set - Fix getname_flags() to initialize pointer correctly - Convert dentry flags to enum - Don't allow datadir without lowerdir in overlayfs - Use namespace_{lock,unlock} helpers in dissolve_on_fput() instead of plain namespace_sem so unmounted mounts are properly cleaned up - Skip unnecessary ifs_block_is_uptodate check in iomap - Remove an unused forward declaration in overlayfs - Fix devpts uid/gid handling after converting to the new mount api - Fix afs_dynroot_readdir() to not use the RCU read lock - Fix mount_setattr() and open_tree_attr() to not pointlessly do path lookup or walk the mount tree if no mount option change has been requested * tag 'vfs-6.15-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: use namespace_{lock,unlock} in dissolve_on_fput() iomap: skip unnecessary ifs_block_is_uptodate check fs: Fix filename init after recent refactoring netfs: Only create /proc/fs/netfs with CONFIG_PROC_FS mount: ensure we don't pointlessly walk the mount tree dcache: convert dentry flag macros to enum afs: Fix afs_dynroot_readdir() to not use the RCU read lock hfs/hfsplus: fix slab-out-of-bounds in hfs_bnode_read_key virtiofs: add filesystem context source name check devpts: Fix type for uid and gid params ovl: remove unused forward declaration ovl: don't allow datadir only
2025-04-14vhost: fix VHOST_*_OWNER documentationStefano Garzarella
VHOST_OWNER_SET and VHOST_OWNER_RESET are used in the documentation instead of VHOST_SET_OWNER and VHOST_RESET_OWNER respectively. To avoid confusion, let's use the right names in the documentation. No change to the API, only the documentation is involved. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20250303085237.19990-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-04-14virtio_pci: Use self group type for cap commandsDaniel Jurgens
Section 2.12.1.2 of v1.4 of the VirtIO spec states: The device and driver capabilities commands are currently defined for self group type. 1. VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY 2. VIRTIO_ADMIN_CMD_DEVICE_CAP_GET 3. VIRTIO_ADMIN_CMD_DRIVER_CAP_SET Fixes: bfcad518605d ("virtio: Manage device and driver capabilities via the admin commands") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Message-Id: <20250304161442.90700-1-danielj@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-04-14Fix up building KUnit tests for Cirrus Logic modulesMark Brown
Merge series from Richard Fitzgerald <rf@opensource.cirrus.com>: This series fixes the KConfig for cs_dsp and cs-amp-lib tests so that CONFIG_KUNIT_ALL_TESTS doesn't cause them to add modules to the build.
2025-04-13nfs: add missing selections of CONFIG_CRC32Eric Biggers
nfs.ko, nfsd.ko, and lockd.ko all use crc32_le(), which is available only when CONFIG_CRC32 is enabled. But the only NFS kconfig option that selected CONFIG_CRC32 was CONFIG_NFS_DEBUG, which is client-specific and did not actually guard the use of crc32_le() even on the client. The code worked around this bug by only actually calling crc32_le() when CONFIG_CRC32 is built-in, instead hard-coding '0' in other cases. This avoided randconfig build errors, and in real kernels the fallback code was unlikely to be reached since CONFIG_CRC32 is 'default y'. But, this really needs to just be done properly, especially now that I'm planning to update CONFIG_CRC32 to not be 'default y'. Therefore, make CONFIG_NFS_FS, CONFIG_NFSD, and CONFIG_LOCKD select CONFIG_CRC32. Then remove the fallback code that becomes unnecessary, as well as the selection of CONFIG_CRC32 from CONFIG_NFS_DEBUG. Fixes: 1264a2f053a3 ("NFS: refactor code for calculating the crc32 hash of a filehandle") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-04-11scsi: ufs: Introduce quirk to extend PA_HIBERN8TIME for UFS devicesManish Pandey
Samsung UFS devices require additional time in hibern8 mode before exiting, beyond the negotiated handshaking phase between the host and device. Introduce a quirk to increase the PA_HIBERN8TIME parameter by 100 µs, a value derived from experiments, to ensure a proper hibernation process. Signed-off-by: Manish Pandey <quic_mapa@quicinc.com> Link: https://lore.kernel.org/r/20250411121630.21330-3-quic_mapa@quicinc.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-04-12crypto: ahash - Disable request chainingHerbert Xu
Disable hash request chaining in case a driver that copies an ahash_request object by hand accidentally triggers chaining. Reported-by: Manorit Chawdhry <m-chawdhry@ti.com> Fixes: f2ffe5a9183d ("crypto: hash - Add request chaining API") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Manorit Chawdhry <m-chawdhry@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-11mm: (un)track_pfn_copy() fix + doc improvementsDavid Hildenbrand
We got a late smatch warning and some additional review feedback. smatch warnings: mm/memory.c:1428 copy_page_range() error: uninitialized symbol 'pfn'. We actually use the pfn only when it is properly initialized; however, we may pass an uninitialized value to a function -- although it will not use it that likely still is UB in C. So let's just fix it by always initializing pfn in the caller of track_pfn_copy(), and improving the documentation of track_pfn_copy(). While at it, clarify the doc of untrack_pfn_copy(), that internal checks make sure if we actually have to untrack anything. Link: https://lkml.kernel.org/r/20250408085950.976103-1-david@redhat.com Fixes: dc84bc2aba85 ("x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202503270941.IFILyNCX-lkp@intel.com/ Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-11locking/local_lock, mm: replace localtry_ helpers with local_trylock_t typeAlexei Starovoitov
Partially revert commit 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t"). Remove localtry_*() helpers, since localtry_lock() name might be misinterpreted as "try lock". Introduce local_trylock[_irqsave]() helpers that only work with newly introduced local_trylock_t type. Note that attempt to use local_trylock[_irqsave]() with local_lock_t will cause compilation failure. Usage and behavior in !PREEMPT_RT: local_lock_t lock; // sizeof(lock) == 0 local_lock(&lock); // preempt disable local_lock_irqsave(&lock, ...); // irq save if (local_trylock_irqsave(&lock, ...)) // compilation error local_trylock_t lock; // sizeof(lock) == 4 local_lock(&lock); // preempt disable, acquired = 1 local_lock_irqsave(&lock, ...); // irq save, acquired = 1 if (local_trylock(&lock)) // if (!acquired) preempt disable, acquired = 1 if (local_trylock_irqsave(&lock, ...)) // if (!acquired) irq save, acquired = 1 The existing local_lock_*() macros can be used either with local_lock_t or local_trylock_t. With local_trylock_t they set acquired = 1 while local_unlock_*() clears it. In !PREEMPT_RT local_lock_irqsave(local_lock_t *) disables interrupts to protect critical section, but it doesn't prevent NMI, so the fully reentrant code cannot use local_lock_irqsave(local_lock_t *) for exclusive access. The local_lock_irqsave(local_trylock_t *) helper disables interrupts and sets acquired=1, so local_trylock_irqsave(local_trylock_t *) from NMI attempting to acquire the same lock will return false. In PREEMPT_RT local_lock_irqsave() maps to preemptible spin_lock(). Map local_trylock_irqsave() to preemptible spin_trylock(). When in hard IRQ or NMI return false right away, since spin_trylock() is not safe due to explicit locking in the underneath rt_spin_trylock() implementation. Removing this explicit locking and attempting only "trylock" is undesired due to PI implications. The local_trylock() without _irqsave can be used to avoid the cost of disabling/enabling interrupts by only disabling preemption, so local_trylock() in an interrupt attempting to acquire the same lock will return false. Note there is no need to use local_inc for acquired variable, since it's a percpu variable with strict nesting scopes. Note that guard(local_lock)(&lock) works only for "local_lock_t lock". The patch also makes sure that local_lock_release(l) is called before WRITE_ONCE(l->acquired, 0). Though IRQs are disabled at this point the local_trylock() from NMI will succeed and local_lock_acquire(l) will warn. Link: https://lkml.kernel.org/r/20250403025514.41186-1-alexei.starovoitov@gmail.com Fixes: 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-11fwctl/cxl: Fix uuid_t usage in uapiDan Williams
The uuid_t type is kernel internal, and Paul reports the following build error when it is used in a uapi header: usr/include/cxl/features.h:59:9: error: unknown type name ‘uuid_t’ Create a uuid type (__uapi_uuid_t) compatible with the longstanding definition uuid/uuid.h for userspace builds, and use uuid_t directly for kernel builds. Fixes: 9b8e73cdb141 ("cxl: Move cxl feature command structs to user header") Link: https://patch.msgid.link/r/174430961702.617339.13963021112051029933.stgit@dwillia2-xfh.jf.intel.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reported-by: Paul E. McKenney <paulmck@kernel.org> Closes: http://lore.kernel.org/f6489337-67c7-48c8-b48a-58603ec15328@paulmck-laptop Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504050434.Eb4vugh5-lkp@intel.com/ Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-04-11dcache: convert dentry flag macros to enumOmar Sandoval
Commit 9748cb2dc393 ("VFS: repack DENTRY_ flags.") changed the value of DCACHE_MOUNTED, which broke drgn's path_lookup() helper. drgn is forced to hard-code it because it's a macro, and macros aren't preserved in debugging information by default. Enums, on the other hand, are included in debugging information. Convert the DCACHE_* flag macros to an enum so that debugging tools like drgn and bpftrace can make use of them. Link: https://github.com/osandov/drgn/blob/2027d0fea84d74b835e77392f7040c2a333180c6/drgn/helpers/linux/fs.py#L43-L46 Signed-off-by: Omar Sandoval <osandov@fb.com> Link: https://lore.kernel.org/177665a082f048cf536b9cd6af467b3be6b6e6ed.1744141838.git.osandov@fb.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-11accel/ivpu: Fix the NPU's DPU frequency calculationAndrzej Kacprowski
Fix the frequency returned to the user space by the DRM_IVPU_PARAM_CORE_CLOCK_RATE GET_PARAM IOCTL. The kernel driver returned CPU frequency for MTL and bare PLL frequency for LNL - this was inconsistent and incorrect for both platforms. With this fix the driver returns maximum frequency of the NPU data processing unit (DPU) for all HW generations. This is what user space always expected. Also do not set CPU frequency in boot params - the firmware does not use frequency passed from the driver, it was only used by the early pre-production firmware. With that we can remove CPU frequency calculation code. Show NPU frequency in FREQ_CHANGE interrupt when frequency tracking is enabled. Fixes: 8a27ad81f7d3 ("accel/ivpu: Split IP and buttress code") Cc: stable@vger.kernel.org # v6.11+ Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250401155912.4049340-2-maciej.falkowski@linux.intel.com
2025-04-10Merge tag 'drm-fixes-2025-04-11-1' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly fixes, as expected it has a bit more in it than probably usual for rc2. amdgpu/xe/i915 lead the way with fixes all over for a bunch of other drivers. Nothing major stands out from what I can see. tests: - Clean up struct drm_display_mode in various places i915: - Fix scanline offset for LNL+ and BMG+ - Fix GVT unterminated-string-initialization build warning - Fix DP rate limit when sink doesn't support TPS4 - Handle GDDR + ECC memory type detection - Fix VRR parameter change check - Fix fence not released on early probe errors - Disable render power gating during live selftests xe: - Add another BMG PCI ID - Fix UAFs on migration paths - Fix shift-out-of-bounds access on TLB invalidation - Ensure ccs_mode is correctly set on gt reset - Extend some HW workarounds to Xe3 - Fix PM runtime get/put on sysfs files - Fix u64 division on 32b - Fix flickering due to missing L3 invalidations - Fix missing error code return amdgpu: - MES FW version caching fixes - Only use GTT as a fallback if we already have a backing store - dma_buf fix - IP discovery fix - Replay and PSR with VRR fix - DC FP fixes - eDP fixes - KIQ TLB invalidate fix - Enable dmem groups support - Allow pinning VRAM dma bufs if imports can do P2P - Workload profile fixes - Prevent possible division by 0 in fan handling amdkfd: - Queue reset fixes imagination: - Fix overflow - Fix use-after-free ivpu: - Fix suspend/resume nouveau: - Do not deref dangling pointer rockchip: - Set DP/HDMI registers correctly udmabuf: - Fix overflow virtgpu: - Set reservation lock on dma-buf import - Fix error handling in prepare_fb" * tag 'drm-fixes-2025-04-11-1' of https://gitlab.freedesktop.org/drm/kernel: (58 commits) drm/rockchip: dw_hdmi_qp: Fix io init for dw_hdmi_qp_rockchip_resume drm/rockchip: vop2: Fix interface enable/mux setting of DP1 on rk3588 drm/amdgpu/mes12: optimize MES pipe FW version fetching drm/amd/pm/smu11: Prevent division by zero drm/amdgpu: cancel gfx idle work in device suspend for s0ix drm/amd/display: pause the workload setting in dm drm/amdgpu/pm/swsmu: implement pause workload profile drm/amdgpu/pm: add workload profile pause helper drm/i915/huc: Fix fence not released on early probe errors drm/i915/vrr: Add vrr.vsync_{start, end} in vrr_params_changed drm/tests: probe-helper: Fix drm_display_mode memory leak drm/tests: modes: Fix drm_display_mode memory leak drm/tests: modes: Fix drm_display_mode memory leak drm/tests: cmdline: Fix drm_display_mode memory leak drm/tests: modeset: Fix drm_display_mode memory leak drm/tests: modeset: Fix drm_display_mode memory leak drm/tests: helpers: Create kunit helper to destroy a drm_display_mode drm/xe: Restore EIO errno return when GuC PC start fails drm/xe: Invalidate L3 read-only cachelines for geometry streams too drm/xe: avoid plain 64-bit division ...
2025-04-11Merge tag 'drm-xe-fixes-2025-04-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Add another BMG PCI ID - Fix UAFs on migration paths - Fix shift-out-of-bounds access on TLB invalidation - Ensure ccs_mode is correctly set on gt reset - Extend some HW workarounds to Xe3 - Fix PM runtime get/put on sysfs files - Fix u64 division on 32b - Fix flickering due to missing L3 invalidations - Fix missing error code return Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/unq5j26aejbrjz5nuvmdtcgupyix5bacpoahod4bdohlvwrney@kekimsi5ossx
2025-04-10Merge tag 'irq-urgent-2025-04-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc irqchip fixes from Ingo Molnar: - Fix NULL pointer dereference crashes due to missing .chip_flags setup in the sg2042-msi and irq-bcm2712-mip irqchip drivers - Remove the davinci aintc irqchip driver's leftover header too * tag 'irq-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-bcm2712-mip: Set EOI/ACK flags in msi_parent_ops irqchip/sg2042-msi: Add missing chip flags irqchip/davinci: Remove leftover header