linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-06-04	PCI/MSI: Size device MSI domain with the maximum number of vectors	Marc Zyngier
	Zenghui reports that since 1396e89e09f0 ("genirq/msi: Move prepare() call to per-device allocation"), his Multi-MSI capable device isn't working anymore. This is a consequence of 15c72f824b32 ("PCI/MSI: Add support for per device MSI[X] domains"), which always creates a MSI domain of size 1, even in the presence of Multi-MSI. While this was somehow working until then, moving the .prepare() call ends up sizing the ITS table with a tiny value for this device, and making the endpoint driver unhappy. Instead, always create the domain and call the .prepare() helper with the maximum expected size. Fixes: 1396e89e09f0 ("genirq/msi: Move prepare() call to per-device allocation") Fixes: 15c72f824b32 ("PCI/MSI: Add support for per device MSI[X] domains") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/all/20250603141801.915305-1-maz@kernel.org Closes: https://lore.kernel.org/r/0b1d7aec-1eac-a9cd-502a-339e216e08a1@huawei.com
2025-06-04	nvme: spelling fixes	Yi Zhang
	Fix various spelling errors in comments. Signed-off-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-04	nvme-tcp: fix I/O stalls on congested sockets	Hannes Reinecke
	When the socket is busy processing nvme_tcp_try_recv() might return -EAGAIN, but this doesn't automatically imply that the sending side is blocked, too. So check if there are pending requests once nvme_tcp_try_recv() returns -EAGAIN and continue with the sending loop to avoid I/O stalls. Signed-off-by: Hannes Reinecke <hare@kernel.org> Acked-by: Chris Leech <cleech@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-04	nvme-tcp: sanitize request list handling	Hannes Reinecke
	Validate the request in nvme_tcp_handle_r2t() to ensure it's not part of any list, otherwise a malicious R2T PDU might inject a loop in request list processing. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-04	nvme-tcp: remove tag set when second admin queue config fails	Shin'ichiro Kawasaki
	Commit 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure concatenation") modified nvme_tcp_setup_ctrl() to call nvme_tcp_configure_admin_queue() twice. The first call prepares for DH-CHAP negotitation, and the second call is required for secure concatenation. However, this change triggered BUG KASAN slab-use-after- free in blk_mq_queue_tag_busy_iter(). This BUG can be recreated by repeating the blktests test case nvme/063 a few times [1]. When the BUG happens, nvme_tcp_create_ctrl() fails in the call chain below: nvme_tcp_create_ctrl() nvme_tcp_alloc_ctrl() new=true ... Alloc nvme_tcp_ctrl and admin_tag_set nvme_tcp_setup_ctrl() new=true nvme_tcp_configure_admin_queue() new=true ... Succeed nvme_alloc_admin_tag_set() ... Alloc the tag set for admin_tag_set nvme_stop_keep_alive() nvme_tcp_teardown_admin_queue() remove=false nvme_tcp_configure_admin_queue() new=false nvme_tcp_alloc_admin_queue() ... Fail, but do not call nvme_remove_admin_tag_set() nvme_uninit_ctrl() nvme_put_ctrl() ... Free up the nvme_tcp_ctrl and admin_tag_set The first call of nvme_tcp_configure_admin_queue() succeeds with new=true argument. The second call fails with new=false argument. This second call does not call nvme_remove_admin_tag_set() on failure, due to the new=false argument. Then the admin tag set is not removed. However, nvme_tcp_create_ctrl() assumes that nvme_tcp_setup_ctrl() would call nvme_remove_admin_tag_set(). Then it frees up struct nvme_tcp_ctrl which has admin_tag_set field. Later on, the timeout handler accesses the admin_tag_set field and causes the BUG KASAN slab-use-after-free. To not leave the admin tag set, call nvme_remove_admin_tag_set() when the second nvme_tcp_configure_admin_queue() call fails. Do not return from nvme_tcp_setup_ctrl() on failure. Instead, jump to "destroy_admin" go-to label to call nvme_tcp_teardown_admin_queue() which calls nvme_remove_admin_tag_set(). Fixes: 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure concatenation") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-04	nvme: enable vectored registered bufs for passthrough cmds	Pavel Begunkov
	nvme already supports registered buffers for non-vectored io_uring passthrough commands, enable it for the vectored mode as well. It takes an iovec, each entry of which should contain a range within the same registered buffer specificied in sqe->buf_index. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-04	nvme: fix implicit bool to flags conversion	Pavel Begunkov
	nvme_map_user_request() takes flags as the last argument, but nvme_uring_cmd_io() shoves a bool "vec" into it. It behaves as expected because bool is converted to 0/1 and NVME_IOCTL_VEC is defined as 1, but it's better to pass flags explicitly. Fixes: 7b7fdb8e2dbc1 ("nvme: replace the "bool vec" arguments with flags in the ioctl path") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-04	nvme: fix command limits status code	Keith Busch
	The command specific status code, 0x183, was introduced in the NVMe 2.0 specification defined to "Command Size Limits Exceeded" and only ever applied to DSM and Copy commands. Fix the name and, remove the incorrect translation to error codes and special treatment in the target code for it. Fixes: 3b7c33b28a44d4 ("nvme.h: add Write Zeroes definitions") Cc: Chaitanya Kulkarni <chaitanyak@nvidia.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-06-03	Input: psmouse - switch to use scnprintf() to suppress truncation warning	Dmitry Torokhov
	Switch the driver to use scnprintf() to avoid warnings about potential truncation of "phys" field which we can tolerate. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-03	Input: lifebook - switch to use scnprintf() to suppress truncation warning	Dmitry Torokhov
	Switch the driver to use scnprintf() to avoid warnings about potential truncation of "phys" field which we can tolerate. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-03	Input: alps - switch to use scnprintf() to suppress truncation warning	Dmitry Torokhov
	Switch the driver to use scnprintf() to avoid warnings about potential truncation of "phys" field which we can tolerate. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-03	Input: atkbd - switch to use scnprintf() to suppress truncation warning	Dmitry Torokhov
	Switch the driver to use scnprintf() to avoid warnings about potential truncation of "phys" field which we can tolerate. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-03	Input: fsia6b - suppress buffer truncation warning for phys	Markus Koch
	Switch the driver to use scnprintf() to avoid warnings about potential truncation of "phys" field which we can tolerate. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501020303.1WtxWWTu-lkp@intel.com/ Signed-off-by: Markus Koch <markus@notsyncing.net> Link: https://lore.kernel.org/r/20250602175710.61583-4-markus@notsyncing.net Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-03	Input: iqs626a - replace snprintf() with scnprintf()	Jeff LaBundy
	W=1 builds warn that the data written to 'tc_name' is truncated for theoretical strings such as "channel-2147483646". Solve this problem by replacing snprintf() with scnprintf() so that the return value corresponds to what was actually written. In practice, the largest string that will be written is "channel-8", and the return value is not actually evaluated. Instead, this patch ultimately removes the warning without unnecessarily increasing the size of 'tc_name' from 10 bytes. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412221136.0S4kRoCC-lkp@intel.com/ Signed-off-by: Jeff LaBundy <jeff@labundy.com> Link: https://lore.kernel.org/r/Z3rV8GTHxLyjBQ5I@nixie71 Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-03	scsi: core: ufs: Fix a hang in the error handler	Sanjeev Yadav
	ufshcd_err_handling_prepare() calls ufshcd_rpm_get_sync(). The latter function can only succeed if UFSHCD_EH_IN_PROGRESS is not set because resuming involves submitting a SCSI command and ufshcd_queuecommand() returns SCSI_MLQUEUE_HOST_BUSY if UFSHCD_EH_IN_PROGRESS is set. Fix this hang by setting UFSHCD_EH_IN_PROGRESS after ufshcd_rpm_get_sync() has been called instead of before. Backtrace: __switch_to+0x174/0x338 __schedule+0x600/0x9e4 schedule+0x7c/0xe8 schedule_timeout+0xa4/0x1c8 io_schedule_timeout+0x48/0x70 wait_for_common_io+0xa8/0x160 //waiting on START_STOP wait_for_completion_io_timeout+0x10/0x20 blk_execute_rq+0xe4/0x1e4 scsi_execute_cmd+0x108/0x244 ufshcd_set_dev_pwr_mode+0xe8/0x250 __ufshcd_wl_resume+0x94/0x354 ufshcd_wl_runtime_resume+0x3c/0x174 scsi_runtime_resume+0x64/0xa4 rpm_resume+0x15c/0xa1c __pm_runtime_resume+0x4c/0x90 // Runtime resume ongoing ufshcd_err_handler+0x1a0/0xd08 process_one_work+0x174/0x808 worker_thread+0x15c/0x490 kthread+0xf4/0x1ec ret_from_fork+0x10/0x20 Signed-off-by: Sanjeev Yadav <sanjeev.y@mediatek.com> [ bvanassche: rewrote patch description ] Fixes: 62694735ca95 ("[SCSI] ufs: Add runtime PM support for UFS host controller driver") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20250523201409.1676055-1-bvanassche@acm.org Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-03	Merge tag 'for-6.16/dm-changes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mikulas Patocka: - better error handling when reloading a table - use use generic disable_* functions instead of open coding them - lock queue limits when reading them - remove unneeded kvfree from alloc_targets - fix BLK_FEAT_ATOMIC_WRITES - pass through operations on wrapped inline crypto keys - dm-verity: - use softirq context only when !need_resched() - fix a memory leak if some arguments are specified multiple times - dm-mpath: - interface for explicit probing of active paths - replace spin_lock_irqsave with spin_lock_irq - dm-delay: don't busy-wait in kthread - dm-bufio: remove maximum age based eviction - dm-flakey: various fixes - vdo indexer: don't read request structure after enqueuing - dm-zone: Use bdev_() helper functions where applicable - dm-mirror: fix a tiny race condition - dm-stripe: small code cleanup tag 'for-6.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (29 commits) dm-stripe: small code cleanup dm-verity: fix a memory leak if some arguments are specified multiple times dm-mirror: fix a tiny race condition dm-table: check BLK_FEAT_ATOMIC_WRITES inside limits_lock dm mpath: replace spin_lock_irqsave with spin_lock_irq dm-mpath: Don't grab work_mutex while probing paths dm-zone: Use bdev_*() helper functions where applicable dm vdo indexer: don't read request structure after enqueuing dm: pass through operations on wrapped inline crypto keys blk-crypto: export wrapped key functions dm-table: Set BLK_FEAT_ATOMIC_WRITES for target queue limits dm mpath: Interface for explicit probing of active paths dm: Allow .prepare_ioctl to handle ioctls directly dm-flakey: make corrupting read bios work dm-flakey: remove useless ERROR_READS check in flakey_end_io dm-flakey: error all IOs when num_features is absent dm-flakey: Clean up parsing messages dm: remove unneeded kvfree from alloc_targets dm-bufio: remove maximum age based eviction dm-verity: use softirq context only when !need_resched() ...
2025-06-03	Merge tag 'cxl-for-6.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: - Remove always true condition in cxl features code - Add verification of CHBS length for CXL 2.0 - Ignore interleave granularity when interleave ways is 1 - Add update addressing mising MODULE_DESCRIPTION for cxl_test - A series of cleanups/refactor to prep for AMD Zen5 translate code - Clean %pa debug printk in core/hdm.c - Documentation updates: - Update to CXL Maturity Map - Fixes to source linking in CXL documentation - CXL documentation fixes, spelling corrections - A large collection of CXL documentation for the entire CXL subsystem, including documentation on CXL related platform and firmware notes - Remove redundant code of cxlctl_get_supported_features() - Series to support CXL RAS Features - Including "Patrol Scrub Control", "Error Check Scrub", "Performance Maitenance" and "Memory Sparing". The series connects CXL to EDAC. * tag 'cxl-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (53 commits) cxl/edac: Add CXL memory device soft PPR control feature cxl/edac: Add CXL memory device memory sparing control feature cxl/edac: Support for finding memory operation attributes from the current boot cxl/edac: Add support for PERFORM_MAINTENANCE command cxl/edac: Add CXL memory device ECS control feature cxl/edac: Add CXL memory device patrol scrub control feature cxl: Update prototype of function get_support_feature_info() EDAC: Update documentation for the CXL memory patrol scrub control feature cxl/features: Remove the inline specifier from to_cxlfs() cxl/feature: Remove redundant code of get supported features docs: ABI: Fix "firwmare" to "firmware" cxl/Documentation: Fix typo in sysfs write_bandwidth attribute path cxl: doc/linux/access-coordinates Update access coordinates calculation methods cxl: docs/platform/acpi/srat Add generic target documentation cxl: docs/platform/cdat reference documentation Documentation: Update the CXL Maturity Map cxl: Sync up the driver-api/cxl documentation cxl: docs - add self-referencing cross-links cxl: docs/allocation/hugepages cxl: docs/allocation/reclaim ...
2025-06-03	PM: sleep: Add locking to dpm_async_resume_children()	Rafael J. Wysocki
	Commit 0cbef962ce1f ("PM: sleep: Resume children after resuming the parent") introduced a subtle concurrency issue that may lead to a kernel crash if system suspend is aborted and may also slow down asynchronous device resume otherwise. Namely, the initial list walks in dpm_noirq_resume_devices(), dpm_resume_early(), and dpm_resume() call dpm_clear_async_state() for every device and attempt to asynchronously resume it if it has no children (so it is a "root" device). The asynchronous resume of a root device triggers an attempt to asynchronously resume its children which may take place before calling dpm_clear_async_state() for them due to the lack of synchronization between dpm_async_resume_children() and the code calling dpm_clear_async_state(). If this happens, the dpm_clear_async_state() that comes in late, will clear power.work_in_progress for the given device after it has been set by __dpm_async(), so the suspend callback will be allowed to run once again for the same device during the same transition. This leads to a whole range of interesting breakage. Fortunately, if the suspend transition is not aborted, power.work_in_progress is set by it for all devices, so dpm_async_resume_children() will not schedule asynchronous resume for them until dpm_clear_async_state() clears that flag, but this means missing an opportunity to start the resume of those devices earlier. Address the above issue by adding dpm_list_mtx locking to dpm_async_resume_children(), so it will wait for the entire initial list walk and the invocation of dpm_clear_async_state() for all devices to be completed before scheduling any new asynchronous resume callbacks. Fixes: 0cbef962ce1f ("PM: sleep: Resume children after resuming the parent") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4280 Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/13779172.uLZWGnKmhe@rjwysocki.net
2025-06-03	PM: sleep: Fix power.is_suspended cleanup for direct-complete devices	Rafael J. Wysocki
	Commit 03f1444016b7 ("PM: sleep: Fix handling devices with direct_complete set on errors") caused power.is_suspended to be set for devices with power.direct_complete set, but it forgot to ensure the clearing of that flag for them in device_resume(), so power.is_suspended is still set for them during the next system suspend-resume cycle. If that cycle is aborted in dpm_suspend(), the subsequent invocation of dpm_resume() will trigger a device_resume() call for every device and because power.is_suspended is set for the devices in question, they will not be skipped by device_resume() as expected which causes scary error messages to be logged (as appropriate). To address this issue, move the clearing of power.is_suspended in device_resume() immediately after the power.is_suspended check so it will be always cleared for all devices processed by that function. Fixes: 03f1444016b7 ("PM: sleep: Fix handling devices with direct_complete set on errors") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4280 Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/4990586.GXAFRqVoOG@rjwysocki.net
2025-06-03	PM: sleep: Fix list splicing in device suspend error paths	Rafael J. Wysocki
	Commits aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") and 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") added list splicing to the error paths of dpm_suspend(), dpm_suspend_late(), and dpm_noirq_suspend_devices(), but they should have used the list_splice_init() variant because the emptied list is used going forward in all of these cases. Replace list_splice() with list_splice_init() in the code in question as appropriate. Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4280 Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/4659282.LvFx2qVVIh@rjwysocki.net
2025-06-03	Merge tag 'backlight-next-6.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight Pull backlight updates from Lee Jones: "Framebuffer Subsystem (fbdev): - The display's blanking status is now tracked in 'struct fb_info' - 'framebuffer_alloc()' initializes the blank state to FB_BLANK_UNBLANK - 'register_framebuffer()' sets the state to 'FB_BLANK_POWERDOWN' if an 'fb_blank' callback exists, ensuring 'FB_EVENT_BLANK' listeners correctly see the display being turned on during the first modeset - The 'FB_EVENT_BLANK' event data now includes both the new and the old blank states - 'fb_blank()' has been reworked to return early on errors, without functional changes, in preparation for further state tracking improvements - Fbdev now calls dedicated functions in the backlight subsystems to notify them of blank state changes, instead of relying on fbdev event notifiers - For LCDs, fbdev also calls a dedicated function to notify of mode changes - Removed the definitions for the unused fbdev event constants 'FB_EVENT_MODE_CHANGE' and 'FB_EVENT_BLANK' from the header file Backlight Subsystem: - Implemented fbdev blank state tracking using the (newly enhanced) blank state information provided directly by 'FB_EVENT_BLANK' - Removed internal blank state tracking fields ('fb_bl_on') from 'struct backlight_device' - Moved the handling of blank-state updates into a separate internal helper function, 'backlight_notify_blank()' - Removed support for fbdev events and replaced it with a dedicated function call interface ('backlight_notify_blank()' and 'backlight_notify_blank_all()') for display drivers to update backlight status LCD Subsystem: - Moved the handling of display updates (blank events and mode changes) from fbdev event notifiers to separate internal helper functions ('lcd_notify_blank', 'lcd_notify_mode_change') - Removed support for fbdev events and replaced it with dedicated function call interfaces ('lcd_notify_blank_all()', 'lcd_notify_mode_change_all()') - The LCD subsystem now maintains its own internal list of LCD devices instead of relying on fbdev notifiers LED Backlight Trigger: - Moved the handling of blank-state updates into a separate internal helper, 'ledtrig_backlight_notify_blank()' - Removed support for fbdev events and replaced it with a dedicated function call, 'ledtrig_backlight_blank()', for fbdev to notify trigger of blank state changes - The LED backlight trigger now maintains its own internal list of triggers instead of relying on fbdev notifiers Qualcomm WLED Backlight: - Added a NULL check after 'devm_kasprintf()' in 'wled_configure()' to prevent a potential NULL pointer dereference if memory allocation fails" * tag 'backlight-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: backlight: pm8941: Add NULL check in wled_configure() fbdev: Remove constants of unused events leds: backlight trigger: Replace fb events with a dedicated function call leds: backlight trigger: Move blank-state handling into helper backlight: lcd: Replace fb events with a dedicated function call backlight: lcd: Move event handling into helpers backlight: Replace fb events with a dedicated function call backlight: Move blank-state handling into helper backlight: Implement fbdev tracking with blank state from event fbdev: Send old blank state in FB_EVENT_BLANK fbdev: Track display blanking state fbdev: Rework fb_blank()
2025-06-03	drm/amd/display: Fix default DC and AC levels	Mario Limonciello
	[Why] DC and AC levels are advertised in a percentage, not a luminance. [How] Scale DC and AC levels to supported values. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4221 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amd/display: Add debugging message for brightness caps	Mario Limonciello
	[Why] Default BIOS brightness caps are buried in ACPI. [How] Add extra dynamic debug that can show default brightness caps. Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amd/display: Use DC log instead of using DM error msg	Cruise Hung
	[Why & How] It sent an error msg when it failed to read the DP tunneling DPCD field. This should just be a warning msg. Use a DC log instead of a DM error msg. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Cruise Hung <Cruise.Hung@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amd/display: Avoid calling blank_stream() twice	Zhongwei Zhang
	[Why] We've made fix for garbage in dcn31_reset_back_end_for_pipe(), adding blank_stream() before disable_crtc(). And set_dpms_off() will call blank_stream() again. [How] Add flag to avoid calling blank_stream() twice. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amd/display: Correct non-OLED pre_T11_delay.	Zhongwei Zhang
	[Why] Only OLED panels require non-zero pre_T11_delay defaultly. Others should be controlled by power sequence. [How] For non OLED, pre_T11_delay delay in code should be zero. Also post_T7_delay. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amdgpu: Add userq fence support to SDMAv7.0	Arunpravin Paneer Selvam
	- Add userq fence support to SDMAv7.0. - GFX12's user fence irq src id differs from GFX11's, hence we need create a new irq srcid header file for GFX12. User fence irq src id information- GFX11 and SDMA6.0 - 0x43 GFX12 and SDMA7.0 - 0x46 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amdgpu: Fix integer overflow in amdgpu_gem_add_input_fence()	Dan Carpenter
	The "num_syncobj_handles" is a u32 value that comes from the user via the ioctl. On 32bit systems the "sizeof(uint32_t) * num_syncobj_handles" multiplication can have an integer overflow. Use size_mul() to fix that. Fixes: 38c67ec9aa4b ("drm/amdgpu: Add input fence to sync bo map/unmap") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	Merge tag 'leds-next-6.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds Pull LED updates from Lee Jones: "LED Triggers: - Allow writing "default" to the sysfs 'trigger' attribute to set an LED to its default trigger - If the default trigger is "none", writing "default" will remove the current trigger - Updated sysfs ABI documentation for the new "default" trigger functionality LED KUnit Testing: - Provide a skeleton KUnit test suite for the LEDs framework - Expand the LED class device registration KUnit test to cover more scenarios, including 'brightness_get' behavior - Add KUnit tests for the LED lookup and get API ('led_add_lookup', 'devm_led_get') LED Flash Class: - Add support for setting flash/strobe duration through a new 'duration_set' op and 'led_set_flash_duration()' function, aligning with 'V4L2_CID_FLASH_DURATION' Texas Instruments TPS6131x: - Add a new driver for the TPS61310/TPS61311 flash LED controllers - The driver supports the device's three constant-current sinks for flash and torch modes LED Core: - Prevent potential 'snprintf()' truncations in LED names by checking for buffer overflows ChromeOS EC LEDs: - Avoid a -Wflex-array-member-not-at-end GCC warning by replacing an on-stack flexible structure definition with a utility function call Multicolor LEDs: - Fix issue where setting multi_intensity while software blinking is active could stop blinking PCA955x LEDs: - Avoid potential buffer overflow when creating default labels by changing a field's type to 'u8' and updating format specifiers PCA995x LEDs: - Fix a typo (stray space) in an 'of_device_id' entry in the 'pca995x_of_match' table Kconfig: - Prevent LED drivers from being enabled by default when 'COMPILE_TEST' is set Device Property API: - Split 'device_get_child_node_count()' into a new helper 'fwnode_get_child_node_count()' that doesn't require a device struct, making the API more symmetrical Driver Modernization (using 'fwnode_get_child_node_count()'): - Update 'leds-pwm-multicolor', 'leds-ncp5623' and 'leds-ncp5623' to use the new 'fwnode_get_child_node_count()' helper, removing their custom implementation - As above in the USB Type-C TCPM driver Driver Modernization (using new GPIO setter callbacks): - Convert 'leds-lgm-sso' to use new GPIO line value setter callbacks which return an integer for error handling - Convert 'leds-pca955x', 'leds-pca9532' and 'leds-tca6507' to use new GPIO setter callbacks Documentation: - Remove the '.rst' extension for 'leds-st1202' in the documentation index for consistency LP8860 LEDs: - Use 'regmap_multi_reg_write()' for EEPROM writes instead of manual looping - Use scoped mutex guards and 'devm_mutex_init()' to simplify function exits and ensure automatic cleanup - Remove default register definitions that are unused when regmap caching is not active - Use 'devm_regulator_get_enable_optional()' to handle the optional regulator, simplifying enabling and removing manual disabling - Refactor 'lp8860_unlock_eeprom()' to only perform the unlock operation, removing the lock part and an unnecessary parameter - Use a 'devm' action to disable the enable-GPIO, simplifying cleanup and error paths, and remove the now-empty '.remove()' function Turris Omnia LEDs: - Drop unnecessary commas in terminator entries of 'struct attribute' and 'struct of_device_id' arrays MT6370 RGB LEDs: - Use the 'LINEAR_RANGE()' for defining 'struct linear_range' entries to improve robustness Texas Instruments TPS6131x: - Add new devicetree bindings for the TI TPS61310/TPS61311 flash LED driver" * tag 'leds-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (31 commits) leds: tps6131x: Add support for Texas Instruments TPS6131X flash LED driver dt-bindings: leds: Add Texas Instruments TPS6131x flash LED driver leds: flash: Add support for flash/strobe duration leds: rgb: leds-mt6370-rgb: Improve definition of some struct linear_range leds: led-test: Provide tests for the lookup and get infrastructure leds: led-test: Fill out the registration test to cover more test cases leds: led-test: Remove standard error checking after KUNIT_ASSERT_*() leds: pca995x: Fix typo in pca995x_of_match's of_device_id entry leds: Provide skeleton KUnit testing for the LEDs framework leds: tca6507: Use new GPIO line value setter callbacks leds: pca9532: Use new GPIO line value setter callbacks leds: pca955x: Use new GPIO line value setter callbacks leds: lgm-sso: Use new GPIO line value setter callbacks leds: Do not enable by default during compile testing leds: turris-omnia: Drop commas in the terminator entries leds: lp8860: Disable GPIO with devm action leds: lp8860: Only unlock in lp8860_unlock_eeprom() leds: lp8860: Enable regulator using enable_optional helper leds: lp8860: Remove default regs when not caching leds: lp8860: Use new mutex guards to cleanup function exits ...
2025-06-03	drm/amdgpu: Fix integer overflow issues in amdgpu_userq_fence.c	Dan Carpenter
	This patch only affects 32bit systems. There are several integer overflows bugs here but only the "sizeof(u32) * num_syncobj" multiplication is a problem at runtime. (The last lines of this patch). These variables are u32 variables that come from the user. The issue is the multiplications can overflow leading to us allocating a smaller buffer than intended. For the first couple integer overflows, the syncobj_handles = memdup_user() allocation is immediately followed by a kmalloc_array(): syncobj = kmalloc_array(num_syncobj_handles, sizeof(*syncobj), GFP_KERNEL); In that situation the kmalloc_array() works as a bounds check and we haven't accessed the syncobj_handlesp[] array yet so the integer overflow is harmless. But the "num_syncobj" multiplication doesn't have that and the integer overflow could lead to an out of bounds access. Fixes: a292fdecd728 ("drm/amdgpu: Implement userqueue signal/wait IOCTL") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amdgpu: disable workload profile switching when OD is enabled	Alex Deucher
	Users have reported that they have to reduce the level of undervolting to acheive stability when dynamic workload profiles are enabled on GC 10.3.x. Disable dynamic workload profiles if the user has enabled OD. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4262 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.15.x
2025-06-03	drm/amdgpu/gfx10: Refine Cleaner Shader for GFX10.1.10	Vitaly Prosyak
	This patch updates the cleaner shader, which is responsible for initializing GPU resources such as Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). Changes include adjustments to register clearing and shader configuration. - Updated GPU resource initialization addresses in the cleaner shader from `be803080` to `be803000`. - Simplified the logic in the SGPR clearing section, ensuring all SGPRs are set to zero. Fixes: 25961bad9212 ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Manu Rastogi <manu.rastogi@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amdgpu: Add more checks to discovery fetch	Lijo Lazar
	Add more checks for valid vram size and log error, if any. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	drm/amdkfd: enable kfd on RISCV systems	Xuemei Liu
	KFD has been confirmed that can run on RISCV systems. It's necessary to support CONFIG_HSA_AMD on RISCV. Signed-off-by: Xuemei Liu <liu.xuemei1@zte.com.cn> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-03	Merge tag 'mfd-next-6.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "Samsung Exynos ACPM: - Populate child platform devices from device tree data - Introduce a new API, 'devm_acpm_get_by_node()', for child devices to get the ACPM handle ROHM PMICs: - Add support for the ROHM BD96802 scalable companion PMIC to the BD96801 core driver - Add support for controlling the BD96802 using the BD96801 regulator driver - Add support to the BD96805, which is almost identical to the BD96801 - Add support to the BD96806, which is similar to the BD96802 Maxim MAX77759: - Add a core driver for the MAX77759 companion PMIC - Add a GPIO driver for the expander functions on the MAX77759 - Add an NVMEM driver to expose the non-volatile memory on the MAX77759 STMicroelectronics STM32MP25: - Add support for the STM32MP25 SoC to the stm32-lptimer - Add support for the STM32MP25 to the clocksource driver, handling new register access requirements - Add support for the STM32MP25 to the PWM driver, enabling up to two PWM outputs Broadcom BCM590xx: - Add support for the BCM59054 PMU - Parse the PMU ID and revision to support behavioral differences between chip revisions - Add regulator support for the BCM59054 Samsung S2MPG10: - Add support for the S2MPG10 PMIC, which communicates via the Samsung ACPM firmware instead of I2C Exynos ACPM: - Improve timeout detection reliability by using ktime APIs instead of a loop counter assumption - Allow PMIC access during late system shutdown by switching to 'udelay()' instead of a sleeping function - Fix an issue where reading command results longer than 8 bytes would fail - Silence non-error '-EPROBE_DEFER' messages during boot to clean up logs Exynos LPASS: - Fix an error handling path by switching to 'devm_regmap_init_mmio()' to prevent resource leaks - Fix a bug where 'exynos_lpass_disable()' was called twice in the remove function - Fix another resource leak in the probe's error path by using 'devm_add_action_or_reset()' Samsung SEC: - Handle the s2dos05, which does not have IRQ support, explicitly to prevent warnings - Fix the core driver to correctly handle errors from 'sec_irq_init()' instead of ignoring them STMPE-SPI: - Correct an undeclared identifier in the 'MODULE_DEVICE_TABLE' macro MAINTAINERS: - Adjust a file path for the Siemens IPC LED drivers entry to fix a broken reference Maxim Drivers: - Correct the spelling of "Electronics" in Samsung copyright headers across multiple files General: - Fix wakeup source memory leaks on device unbind for 88pm886, as3722, max14577, max77541, max77705, max8925, rt5033, and sprd-sc27xx drivers Samsung SEC Drivers: - Split the driver into a transport-agnostic core ('sec-core') and transport-specific ('sec-i2c', 'sec-acpm') modules to support non-I2C devices - Merge the 'sec-core' and 'sec-irq' modules to reduce memory consumption - Move internal APIs to a private header to clean up the public API - Improve code style by sorting includes, cleaning up headers, sorting device tables, and using helper macros like 'dev_err_probe()', 'MFD_CELL', and 'REGMAP_IRQ_REG' - Make regmap configuration for s2dos05/s2mpu05 explicit to improve clarity - Rework platform data and regmap instantiation to use OF match data instead of a large switch statement ROHM BD96801/2: - Prepare the driver for new models by separating chip-specific data into its own structure - Drop IC name prefix from IRQ resource names in both the MFD and regulator drivers for simplification Broadcom BCM590xx: - Refactor the regulator driver to store descriptions in a table to ease support for new chips - Rename BCM59056-specific data to prepare for the addition of other regulators - Use 'dev_err_probe()' for cleaner error handling Exynos ACPM: - Correct kerneldoc warnings and use the conventional 'np' argument name General MFD: - Convert 'aat2870' and 'tps65010' to use the per-client debugfs directory provided by the I2C core - Convert 'sm501', 'tps65010' and 'ucb1x00' to use the new GPIO line value setter callbacks - Constify 'regmap_irq_chip' and other structures in '88pm886' to move data to read-only sections BCM590xx: - Drop the unused "id" member from the 'bcm590xx' struct in preparation for a replacement Samsung SEC Core: - Remove forward declarations for functions that no longer exist SM501: - Remove the unused 'sm501_find_clock()' function New Compatibles: - Google: Add a PMIC child node to the 'google,gs101-acpm-ipc' binding - ROHM: Add new bindings for 'rohm,bd96802-regulator' and 'rohm,bd96802-pmic', and add compatibles for BD96805 and BD96806 - Maxim: Add new bindings for 'maxim,max77759-gpio', 'maxim,max77759-nvmem', and the top-level 'maxim,max77759' - STM: Add 'stm32mp25' compatible to the 'stm32-lptimer' binding - Broadcom: Add 'bcm59054' compatible - Atmel/Microchip: Add 'microchip,sama7d65-gpbr' and 'microchip,sama7d65-secumod' compatibles - Samsung: Add 's2mpg10' compatible to the 'samsung,s2mps11' MFD binding - MediaTek: Add compatibles for 'mt6893' (scpsys), 'mt7988-topmisc', and 'mt8365-infracfg-nao' - Qualcomm: Add 'qcom,apq8064-mmss-sfpb' and 'qcom,apq8064-sps-sic' syscon compatibles Refactoring & Cleanup: - Convert Broadcom BCM59056 devicetree bindings to YAML and split them into MFD and regulator parts - Convert the Microchip AT91 secumod binding to YAML - Drop unrelated consumer nodes from binding examples to reduce bloat - Correct indentation and style in various DTS examples" * tag 'mfd-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (81 commits) mfd: maxim: Correct Samsung "Electronics" spelling in copyright headers mfd: maxim: Correct Samsung "Electronics" spelling in headers mfd: sm501: Remove unused sm501_find_clock mfd: 88pm886: Constify struct regmap_irq_chip and some other structures dt-bindings: mfd: syscon: Add mediatek,mt8365-infracfg-nao mfd: sprd-sc27xx: Fix wakeup source leaks on device unbind mfd: rt5033: Fix wakeup source leaks on device unbind mfd: max8925: Fix wakeup source leaks on device unbind mfd: max77705: Fix wakeup source leaks on device unbind mfd: max77541: Fix wakeup source leaks on device unbind mfd: max14577: Fix wakeup source leaks on device unbind mfd: as3722: Fix wakeup source leaks on device unbind mfd: 88pm886: Fix wakeup source leaks on device unbind dt-bindings: mfd: Correct indentation and style in DTS example dt-bindings: mfd: Drop unrelated nodes from DTS example dt-bindings: mfd: syscon: Add qcom,apq8064-sps-sic dt-bindings: mfd: syscon: Add qcom,apq8064-mmss-sfpb mfd: stmpe-spi: Correct the name used in MODULE_DEVICE_TABLE dt-bindings: mfd: syscon: Add mt7988-topmisc mfd: exynos-lpass: Fix another error handling path in exynos_lpass_probe() ...
2025-06-03	Merge tag 'hid-for-linus-2025060301' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Jiri Kosina: - support for Apple Magic Mouse 2 USB-C (Aditya Garg) - power management improvement for multitouch devices (Werner Sembach) - fix for ACPI initialization in intel-thc driver (Wentao Guan) - adaptation of HID drivers to use new gpio_chip's line setter callbacks (Bartosz Golaszewski) - fix potential OOB in usbhid_parse() (Terry Junge) - make it possible to set hid_mouse_ignore_list dynamically (the same way we handle other quirks) (Aditya Garg) - other small assorted fixes and device ID additions * tag 'hid-for-linus-2025060301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: multitouch: Disable touchpad on firmware level while not in use HID: core: Add functions for HID drivers to react on first open and last close call HID: HID_APPLETB_BL should depend on X86 HID: HID_APPLETB_KBD should depend on X86 HID: appletb-kbd: Use secs_to_jiffies() instead of msecs_to_jiffies() HID: intel-thc-hid: intel-thc: make read-only arrays static const HID: magicmouse: Apple Magic Mouse 2 USB-C support HID: mcp2221: use new line value setter callbacks HID: mcp2200: use new line value setter callbacks HID: cp2112: use new line value setter callbacks HID: cp2112: use lock guards HID: cp2112: hold the lock for the entire direction_output() call HID: cp2112: destroy mutex on driver detach HID: intel-thc-hid: intel-quicki2c: pass correct arguments to acpi_evaluate_object HID: corsair-void: Use to_delayed_work() HID: hid-logitech: use sysfs_emit_at() instead of scnprintf() HID: quirks: Add HID_QUIRK_IGNORE_MOUSE quirk HID: usbhid: Eliminate recurrent out-of-bounds bug in usbhid_parse() HID: Kysona: Add periodic online check
2025-06-03	dm-stripe: small code cleanup	Mikulas Patocka
	This commit doesn't fix any bug, it is just code cleanup. Use the function format_dev_t instead of sprintf, because format_dev_t does the same thing. Remove the useless memset call. An unsigned integer can take at most 10 digits, so extend the array size to 22. (note that because the range of minor and major numbers is limited, the size 16 could not be exceeded, thus this function couldn't write beyond string end) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-06-03	dm-verity: fix a memory leak if some arguments are specified multiple times	Mikulas Patocka
	If some of the arguments "check_at_most_once", "ignore_zero_blocks", "use_fec_from_device", "root_hash_sig_key_desc" were specified more than once on the target line, a memory leak would happen. This commit fixes the memory leak. It also fixes error handling in verity_verify_sig_parse_opt_args. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-06-03	dm-mirror: fix a tiny race condition	Mikulas Patocka
	There's a tiny race condition in dm-mirror. The functions queue_bio and write_callback grab a spinlock, add a bio to the list, drop the spinlock and wake up the mirrord thread that processes bios in the list. It may be possible that the mirrord thread processes the bio just after spin_unlock_irqrestore is called, before wakeup_mirrord. This spurious wake-up is normally harmless, however if the device mapper device is unloaded just after the bio was processed, it may be possible that wakeup_mirrord(ms) uses invalid "ms" pointer. Fix this bug by moving wakeup_mirrord inside the spinlock. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-06-03	iavf: get rid of the crit lock	Przemek Kitszel
	Get rid of the crit lock. That frees us from the error prone logic of try_locks. Thanks to netdev_lock() by Jakub it is now easy, and in most cases we were protected by it already - replace crit lock by netdev lock when it was not the case. Lockdep reports that we should cancel the work under crit_lock [splat1], and that was the scheme we have mostly followed since [1] by Slawomir. But when that is done we still got into deadlocks [splat2]. So instead we should look at the bigger problem, namely "weird locking/scheduling" of the iavf. The first step to fix that is to remove the crit lock. I will followup with a -next series that simplifies scheduling/tasks. Cancel the work without netdev lock (weird unlock+lock scheme), to fix the [splat2] (which would be totally ugly if we would kept the crit lock). Extend protected part of iavf_watchdog_task() to include scheduling more work. Note that the removed comment in iavf_reset_task() was misplaced, it belonged to inside of the removed if condition, so it's gone now. [splat1] - w/o this patch - The deadlock during VF removal: WARNING: possible circular locking dependency detected sh/3825 is trying to acquire lock: ((work_completion)(&(&adapter->watchdog_task)->work)){+.+.}-{0:0}, at: start_flush_work+0x1a1/0x470 but task is already holding lock: (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_remove+0xd1/0x690 [iavf] which lock already depends on the new lock. [splat2] - when cancelling work under crit lock, w/o this series, see [2] for the band aid attempt WARNING: possible circular locking dependency detected sh/3550 is trying to acquire lock: ((wq_completion)iavf){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90 but task is already holding lock: (&dev->lock){+.+.}-{4:4}, at: iavf_remove+0xa6/0x6e0 [iavf] which lock already depends on the new lock. [1] fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") [2] https://github.com/pkitszel/linux/commit/52dddbfc2bb60294083f5711a158a Fixes: d1639a17319b ("iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies") Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03	iavf: sprinkle netdev_assert_locked() annotations	Przemek Kitszel
	Lockdep annotations help in general, but here it is extra good, as next commit will remove crit lock. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03	iavf: extract iavf_watchdog_step() out of iavf_watchdog_task()	Przemek Kitszel
	Finish up easy refactor of watchdog_task, total for this + prev two commits is: 1 file changed, 47 insertions(+), 82 deletions(-) Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03	iavf: simplify watchdog_task in terms of adminq task scheduling	Przemek Kitszel
	Simplify the decision whether to schedule adminq task. The condition is the same, but it is executed in more scenarios. Note that movement of watchdog_done label makes this commit a bit surprising. (Hence not squashing it to anything bigger). Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03	iavf: centralize watchdog requeueing itself	Przemek Kitszel
	Centralize the unlock(critlock); unlock(netdev); queue_delayed_work(watchog_task); pattern to one place. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03	iavf: iavf_suspend(): take RTNL before netdev_lock()	Przemek Kitszel
	Fix an obvious violation of lock ordering. Jakub's [1] added netdev_lock() call that is wrong ordered wrt RTNL, but the Fixes tag points to crit_lock being wrongly placed (by lockdep standards). Actual reason we got it wrong is dated back to critical section managed by pure flag checks, which is with us since the very beginning. [1] afc664987ab3 ("eth: iavf: extend the netdev_lock usage") Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03	Merge tag 'ata-6.16-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata updates from Damien Le Moal: - Simplify ata_print_version_once() using dev_dbg_once() (Heiner) - Some cleanups of libata-sata code to simplify the sense data fetching code and use BIT() macro for tag bit handling (Niklas) - Fix variable name spelling in the sata_sx4 driver (Colin) - Improve sense data information field handling for passthrough commands (Igor) - Add Rockchip RK3576 SoC compatible to the Designware AHCI DT bindings (Nicolas) - Add a message to indicate if a port is marked as external or not, to help with debugging potential issues with LPM (Niklas) - Convert DT bindings for "ti,dm816-ahci", "apm,xgene-ahci", "cavium,ebt3000-compact-flash", "marvell,orion-sata", and "arasan,cf-spear1340" to DT schema (Rob) - Cleanup and improve the code and related comments for HIPM and DIPM (host initiated and device initiated power managent) handling. In particular, keep DIPM disabled while modifying the allowed LPM states to avoid races with the device initiating power state changes (Niklas) * tag 'ata-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata-eh: Keep DIPM disabled while modifying the allowed LPM states ata: libata-eh: Rename no_dipm variable to be more clear ata: libata-eh: Rename hipm and dipm variables ata: libata-eh: Add ata_eh_set_lpm() WARN_ON_ONCE ata: libata-eh: Update DIPM comments to reflect reality dt-bindings: ata: Convert arasan,cf-spear1340 to DT schema dt-bindings: ata: Convert marvell,orion-sata to DT schema dt-bindings: ata: Convert cavium,ebt3000-compact-flash to DT schema dt-bindings: ata: Convert apm,xgene-ahci to DT schema dt-bindings: ata: Convert st,ahci to DT schema dt-bindings: ata: Convert ti,dm816-ahci to DT schema ata: libata: Print if port is external on boot dt-bindings: ata: rockchip-dwc-ahci: add RK3576 compatible ata: libata-scsi: Do not set the INFORMATION field twice for ATA PT ata: sata_sx4: Fix spelling mistake "parttern" -> "pattern" ata: libata-sata: Use BIT() macro to convert tag to bit field ata: libata-sata: Simplify sense_valid fetching ata: libata-core: Simplify ata_print_version_once
2025-06-03	Merge tag 'hwmon-for-v6.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon updates from Guenter Roeck: "New drivers: - KEBA fan controller - KEBA battery monitoring controller - MAX77705 Support added to existing drivers: - MAXIMUS VI HERO and ROG MAXIMUS Z90 Formula support (asus-ec-sensors) - SQ52206 support (ina238) - lt3074 support (pmbus/lt3074) - ADPM12160 support (pmbus/max34440) - MPM82504 and for MPM3695 family support (pmbus/mpq8785) - Add the Dell OptiPlex 7050 to the DMI whitelist (dell-smm) - Zen5 Ryzen Desktop support (k10temp) Various other minor fixes and improvements" * tag 'hwmon-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (48 commits) doc: hwmon: acpi_power_meter: Add information about enabling the power capping feature. hwmon: (isl28022) Fix current reading calculation hwmon: (lm75) Fix I3C transfer buffer pointer for incoming data hwmon: Add KEBA fan controller support hwmon: pmbus: mpq8785: Add support for MPM3695 family hwmon: pmbus: mpq8785: Add support for MPM82504 hwmon: pmbus: mpq8785: Implement VOUT feedback resistor divider ratio configuration hwmon: pmbus: mpq8785: Prepare driver for multiple device support dt-bindings: hwmon: Add bindings for mpq8785 driver hwmon: (ina238) Modify the calculation formula to adapt to different chips hwmon: (ina238) Add support for SQ52206 dt-bindings: Add SQ52206 to ina2xx devicetree bindings hwmon: (ina238) Add ina238_config to save configurations for different chips hwmon: (ausus-ec-sensors) add MAXIMUS VI HERO. hwmon: (isl28022, nct7363) Convert to use maple tree register cache hwmon: (asus-ec-sensors) check sensor index in read_string() hwmon: (asus-ec-sensors) add ROG MAXIMUS Z90 Formula. dt-bindings: hwmon: Add Sophgo SG2044 external hardware monitor support hwmon: (max77705) Add initial support hwmon: (tmp102) add vcc regulator support ...
2025-06-03	Merge tag 'hyperv-next-signed-20250602' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel) - Fixes for Hyper-V UIO driver (Long Li) - Fixes for Hyper-V PCI driver (Michael Kelley) - Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley) - Documentation updates for Hyper-V VMBus (Michael Kelley) - Enhance logging for hv_kvp_daemon (Shradha Gupta) * tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits) Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir Documentation: hyperv: Update VMBus doc with new features and info PCI: hv: Remove unnecessary flex array in struct pci_packet Drivers: hv: Remove hv_alloc/free_* helpers Drivers: hv: Use kzalloc for panic page allocation uio_hv_generic: Align ring size to system page uio_hv_generic: Use correct size for interrupt and monitor pages Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary arch/x86: Provide the CPU number in the wakeup AP callback x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap() PCI: hv: Get vPCI MSI IRQ domain from DeviceTree ACPI: irq: Introduce acpi_get_gsi_dispatcher() Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device() Drivers: hv: vmbus: Get the IRQ number from DeviceTree dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties arm64, x86: hyperv: Report the VTL the system boots in arm64: hyperv: Initialize the Virtual Trust Level field Drivers: hv: Provide arch-neutral implementation of get_vtl() Drivers: hv: Enable VTL mode for arm64 ...
2025-06-03	Merge tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux	Linus Torvalds
	Pull bitmap updates from Yury Norov: - dead code cleanups for cpumasks and nodemasks (me) - fixed-width flavors of GENMASK() and BIT() (Vincent, Lucas and me) - FIELD_MODIFY() helper (Luo) - for_each_node_with_cpus() optimization (me) - bitmap-str fixes (Andy) * tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux: topology: make for_each_node_with_cpus() O(N) bitfield: Add FIELD_MODIFY() helper bitmap-str: Add missing header(s) bitmap-str: Get rid of 'extern' for function prototypes build_bug.h: more user friendly error messages in BUILD_BUG_ON_ZERO() test_bits: add tests for BIT_U() test_bits: add tests for GENMASK_U() drm/i915: Convert REG_GENMASK() to fixed-width GENMASK_U() bits: introduce fixed-type BIT_U() bits: introduce fixed-type GENMASK_U() bits: add comments and newlines to #if, #else and #endif directives cpumask: drop cpumask_assign_cpu() riscv: switch set_icache_stale_mask() to using non-atomic assign_cpu() cpumask: add non-atomic __assign_cpu() nodemask: drop nodes_shift
2025-06-03	ovpn: avoid sleep in atomic context in TCP RX error path	Antonio Quartulli
	Upon error along the TCP data_ready event path, we have the following chain of calls: strp_data_ready() ovpn_tcp_rcv() ovpn_peer_del() ovpn_socket_release() Since strp_data_ready() may be invoked from softirq context, and ovpn_socket_release() may sleep, the above sequence may cause a sleep in atomic context like the following: BUG: sleeping function called from invalid context at ./ovpn-backports-ovpn-net-next-main-6.15.0-rc5-20250522/drivers/net/ovpn/socket.c:71 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 25, name: ksoftirqd/3 5 locks held by ksoftirqd/3/25: #0: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0 OpenVPN/ovpn-backports#1: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0 OpenVPN/ovpn-backports#2: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x66/0x1e0 OpenVPN/ovpn-backports#3: ffffffe003ce9818 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x156e/0x17a0 OpenVPN/ovpn-backports#4: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ovpn_tcp_data_ready+0x0/0x1b0 [ovpn] CPU: 3 PID: 25 Comm: ksoftirqd/3 Not tainted 5.10.104+ #0 Call Trace: walk_stackframe+0x0/0x1d0 show_stack+0x2e/0x44 dump_stack+0xc2/0x102 ___might_sleep+0x29c/0x2b0 __might_sleep+0x62/0xa0 ovpn_socket_release+0x24/0x2d0 [ovpn] unlock_ovpn+0x6e/0x190 [ovpn] ovpn_peer_del+0x13c/0x390 [ovpn] ovpn_tcp_rcv+0x280/0x560 [ovpn] __strp_recv+0x262/0x940 strp_recv+0x66/0x80 tcp_read_sock+0x122/0x410 strp_data_ready+0x156/0x1f0 ovpn_tcp_data_ready+0x92/0x1b0 [ovpn] tcp_data_ready+0x6c/0x150 tcp_rcv_established+0xb36/0xc50 tcp_v4_do_rcv+0x25e/0x380 tcp_v4_rcv+0x166a/0x17a0 ip_protocol_deliver_rcu+0x8c/0x250 ip_local_deliver_finish+0xf8/0x1e0 ip_local_deliver+0xc2/0x2d0 ip_rcv+0x1f2/0x330 __netif_receive_skb+0xfc/0x290 netif_receive_skb+0x104/0x5b0 br_pass_frame_up+0x190/0x3f0 br_handle_frame_finish+0x3e2/0x7a0 br_handle_frame+0x750/0xab0 __netif_receive_skb_core.constprop.0+0x4c0/0x17f0 __netif_receive_skb+0xc6/0x290 netif_receive_skb+0x104/0x5b0 xgmac_dma_rx+0x962/0xb40 __napi_poll.constprop.0+0x5a/0x350 net_rx_action+0x1fe/0x4b0 __do_softirq+0x1f8/0x85c run_ksoftirqd+0x80/0xd0 smpboot_thread_fn+0x1f0/0x3e0 kthread+0x1e6/0x210 ret_from_kernel_thread+0x8/0xc Fix this issue by postponing the ovpn_peer_del() call to a scheduled worker, as we already do in ovpn_tcp_send_sock() for the very same reason. Fixes: 11851cbd60ea ("ovpn: implement TCP transport") Reported-by: Qingfang Deng <dqfext@gmail.com> Closes: https://github.com/OpenVPN/ovpn-net-next/issues/13 Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>