summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-06-19drm/xe: Fix memset on iomemLucas De Marchi
It should rather use xe_map_memset() as the BO is created with XE_BO_FLAG_VRAM_IF_DGFX in xe_guc_pc_init(). Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250612-vmap-vaddr-v1-1-26238ed443eb@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 21cf47d89fba353b2d5915ba4718040c4cb955d3) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-19drm/xe/bmg: Update Wa_16023588340Vinay Belgaumkar
This allows for additional L2 caching modes. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250612-wa-14022085890-v4-2-94ba5dcc1e30@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 6ab42fa03d4c88a0ddf5e56e62794853b198e7bf) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-19ublk: santizize the arguments from userspace when adding a deviceRonnie Sahlberg
Sanity check the values for queue depth and number of queues we get from userspace when adding a device. Signed-off-by: Ronnie Sahlberg <rsahlberg@whamcloud.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Fixes: 62fe99cef94a ("ublk: add read()/write() support for ublk char device") Link: https://lore.kernel.org/r/20250619021031.181340-1-ronniesahlberg@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-19net: lan743x: fix potential out-of-bounds write in ↵Alexey Kodanev
lan743x_ptp_io_event_clock_get() Before calling lan743x_ptp_io_event_clock_get(), the 'channel' value is checked against the maximum value of PCI11X1X_PTP_IO_MAX_CHANNELS(8). This seems correct and aligns with the PTP interrupt status register (PTP_INT_STS) specifications. However, lan743x_ptp_io_event_clock_get() writes to ptp->extts[] with only LAN743X_PTP_N_EXTTS(4) elements, using channel as an index: lan743x_ptp_io_event_clock_get(..., u8 channel,...) { ... /* Update Local timestamp */ extts = &ptp->extts[channel]; extts->ts.tv_sec = sec; ... } To avoid an out-of-bounds write and utilize all the supported GPIO inputs, set LAN743X_PTP_N_EXTTS to 8. Detected using the static analysis tool - Svace. Fixes: 60942c397af6 ("net: lan743x: Add support for PTP-IO Event Input External Timestamp (extts)") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://patch.msgid.link/20250616113743.36284-1-aleksei.kodanev@bell-sw.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-19serial: core: restore of_node information in sysfsAidan Stewart
Since in v6.8-rc1, the of_node symlink under tty devices is missing. This breaks any udev rules relying on this information. Link the of_node information in the serial controller device with the parent defined in the device tree. This will also apply to the serial device which takes the serial controller as a parent device. Fixes: b286f4e87e32 ("serial: core: Move tty and serdev to be children of serial core port device") Cc: stable@vger.kernel.org Signed-off-by: Aidan Stewart <astewart@tektelic.com> Link: https://lore.kernel.org/r/20250617164819.13912-1-astewart@tektelic.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19vt: fix kernel-doc warnings in ucs_get_fallback()Randy Dunlap
Use the correct function parameter name in ucs_get_fallback() to prevent kernel-doc warnings: Warning: drivers/tty/vt/ucs.c:218 function parameter 'cp' not described in 'ucs_get_fallback' Warning: drivers/tty/vt/ucs.c:218 Excess function parameter 'base' description in 'ucs_get_fallback' Fixes: fe26933cf1e1 ("vt: add ucs_get_fallback()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Nicolas Pitre <npitre@baylibre.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: linux-serial@vger.kernel.org Reviewed-by: Nicolas Pitre <npitre@baylibre.com>. Link: https://lore.kernel.org/r/20250611020229.2650595-1-rdunlap@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19vt: add missing notification when switching back to text modeNicolas Pitre
Programs using poll() on /dev/vcsa to be notified when VT changes occur were missing one case: the switch from gfx to text mode. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Link: https://lore.kernel.org/r/9o5ro928-0pp4-05rq-70p4-ro385n21n723@onlyvoer.pbz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19mtk-sd: Prevent memory corruption from DMA map failureMasami Hiramatsu (Google)
If msdc_prepare_data() fails to map the DMA region, the request is not prepared for data receiving, but msdc_start_data() proceeds the DMA with previous setting. Since this will lead a memory corruption, we have to stop the request operation soon after the msdc_prepare_data() fails to prepare it. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Fixes: 208489032bdd ("mmc: mediatek: Add Mediatek MMC driver") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/174972756982.3337526.6755001617701603082.stgit@mhiramat.tok.corp.google.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-06-19Revert "usb: xhci: Implement xhci_handshake_check_state() helper"Roy Luo
This reverts commit 6ccb83d6c4972ebe6ae49de5eba051de3638362c. Commit 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") was introduced to workaround watchdog timeout issues on some platforms, allowing xhci_reset() to bail out early without waiting for the reset to complete. Skipping the xhci handshake during a reset is a dangerous move. The xhci specification explicitly states that certain registers cannot be accessed during reset in section 5.4.1 USB Command Register (USBCMD), Host Controller Reset (HCRST) field: "This bit is cleared to '0' by the Host Controller when the reset process is complete. Software cannot terminate the reset process early by writinga '0' to this bit and shall not write any xHC Operational or Runtime registers until while HCRST is '1'." This behavior causes a regression on SNPS DWC3 USB controller with dual-role capability. When the DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then tries to configure its hardware for device mode, the ongoing reset leads to register access issues; specifically, all register reads returns 0. These issues extend beyond the xhci register space (which is expected during a reset) and affect the entire DWC3 IP block, causing the DWC3 device mode to malfunction. Cc: stable <stable@kernel.org> Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") Signed-off-by: Roy Luo <royluo@google.com> Link: https://lore.kernel.org/r/20250522190912.457583-3-royluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removedRoy Luo
xhci_reset() currently returns -ENODEV if XHCI_STATE_REMOVING is set, without completing the xhci handshake, unless the reset completes exceptionally quickly. This behavior causes a regression on Synopsys DWC3 USB controllers with dual-role capabilities. Specifically, when a DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then attempts to configure its hardware for device mode, the ongoing, incomplete reset leads to critical register access issues. All register reads return zero, not just within the xHCI register space (which might be expected during a reset), but across the entire DWC3 IP block. This patch addresses the issue by preventing xhci_reset() from being called in xhci_resume() and bailing out early in the reinit flow when XHCI_STATE_REMOVING is set. Cc: stable <stable@kernel.org> Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") Suggested-by: Mathias Nyman <mathias.nyman@intel.com> Signed-off-by: Roy Luo <royluo@google.com> Link: https://lore.kernel.org/r/20250522190912.457583-2-royluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: gadget: u_serial: Fix race condition in TTY wakeupKuen-Han Tsai
A race condition occurs when gs_start_io() calls either gs_start_rx() or gs_start_tx(), as those functions briefly drop the port_lock for usb_ep_queue(). This allows gs_close() and gserial_disconnect() to clear port.tty and port_usb, respectively. Use the null-safe TTY Port helper function to wake up TTY. Example CPU1: CPU2: gserial_connect() // lock gs_close() // await lock gs_start_rx() // unlock usb_ep_queue() gs_close() // lock, reset port.tty and unlock gs_start_rx() // lock tty_wakeup() // NPE Fixes: 35f95fd7f234 ("TTY: usb/u_serial, use tty from tty_port") Cc: stable <stable@kernel.org> Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Reviewed-by: Prashanth K <prashanth.k@oss.qualcomm.com> Link: https://lore.kernel.org/linux-usb/20240116141801.396398-1-khtsai@google.com/ Link: https://lore.kernel.org/r/20250617050844.1848232-2-khtsai@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19Revert "usb: gadget: u_serial: Add null pointer check in gs_start_io"Kuen-Han Tsai
This reverts commit ffd603f214237e250271162a5b325c6199a65382. Commit ffd603f21423 ("usb: gadget: u_serial: Add null pointer check in gs_start_io") adds null pointer checks at the beginning of the gs_start_io() function to prevent a null pointer dereference. However, these checks are redundant because the function's comment already requires callers to hold the port_lock and ensure port.tty and port_usb are not null. All existing callers already follow these rules. The true cause of the null pointer dereference is a race condition. When gs_start_io() calls either gs_start_rx() or gs_start_tx(), the port_lock is temporarily released for usb_ep_queue(). This allows port.tty and port_usb to be cleared. Fixes: ffd603f21423 ("usb: gadget: u_serial: Add null pointer check in gs_start_io") Cc: stable <stable@kernel.org> Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Reviewed-by: Prashanth K <prashanth.k@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250617050844.1848232-1-khtsai@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: chipidea: udc: disconnect/reconnect from host when do suspend/resumeXu Yang
Shawn and John reported a hang issue during system suspend as below: - USB gadget is enabled as Ethernet - There is data transfer over USB Ethernet (scp a big file between host and device) - Device is going in/out suspend (echo mem > /sys/power/state) The root cause is the USB device controller is suspended but the USB bus is still active which caused the USB host continues to transfer data with device and the device continues to queue USB requests (in this case, a delayed TCP ACK packet trigger the issue) after controller is suspended, however the USB controller clock is already gated off. Then if udc driver access registers after that point, the system will hang. The correct way to avoid such issue is to disconnect device from host when the USB bus is not at suspend state. Then the host will receive disconnect event and stop data transfer in time. To continue make USB gadget device work after system resume, this will reconnect device automatically. To make usb wakeup work if USB bus is already at suspend state, this will keep connection for it only when USB device controller has enabled wakeup capability. Reported-by: Shawn Guo <shawnguo@kernel.org> Reported-by: John Ernberg <john.ernberg@actia.se> Closes: https://lore.kernel.org/linux-usb/aEZxmlHmjeWcXiF3@dragon/ Tested-by: John Ernberg <john.ernberg@actia.se> # iMX8QXP Fixes: 235ffc17d014 ("usb: chipidea: udc: add suspend/resume support for device controller") Cc: stable <stable@kernel.org> Reviewed-by: Jun Li <jun.li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20250614124914.207540-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: acpi: fix device link removalHeikki Krogerus
The device link to the USB4 host interface has to be removed manually since it's no longer auto removed. Fixes: 623dae3e7084 ("usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links") Cc: stable <stable@kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20250611111415.2707865-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: hub: fix detection of high tier USB3 devices behind suspended hubsMathias Nyman
USB3 devices connected behind several external suspended hubs may not be detected when plugged in due to aggressive hub runtime pm suspend. The hub driver immediately runtime-suspends hubs if there are no active children or port activity. There is a delay between the wake signal causing hub resume, and driver visible port activity on the hub downstream facing ports. Most of the LFPS handshake, resume signaling and link training done on the downstream ports is not visible to the hub driver until completed, when device then will appear fully enabled and running on the port. This delay between wake signal and detectable port change is even more significant with chained suspended hubs where the wake signal will propagate upstream first. Suspended hubs will only start resuming downstream ports after upstream facing port resumes. The hub driver may resume a USB3 hub, read status of all ports, not yet see any activity, and runtime suspend back the hub before any port activity is visible. This exact case was seen when conncting USB3 devices to a suspended Thunderbolt dock. USB3 specification defines a 100ms tU3WakeupRetryDelay, indicating USB3 devices expect to be resumed within 100ms after signaling wake. if not then device will resend the wake signal. Give the USB3 hubs twice this time (200ms) to detect any port changes after resume, before allowing hub to runtime suspend again. Cc: stable <stable@kernel.org> Fixes: 2839f5bcfcfc ("USB: Turn on auto-suspend for USB 3.0 hubs.") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250611112441.2267883-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19Logitech C-270 even more brokenOliver Neukum
Some varieties of this device don't work with RESET_RESUME alone. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20250605122852.1440382-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: dwc3: Abort suspend on soft disconnect failureKuen-Han Tsai
When dwc3_gadget_soft_disconnect() fails, dwc3_suspend_common() keeps going with the suspend, resulting in a period where the power domain is off, but the gadget driver remains connected. Within this time frame, invoking vbus_event_work() will cause an error as it attempts to access DWC3 registers for endpoint disabling after the power domain has been completely shut down. Abort the suspend sequence when dwc3_gadget_suspend() cannot halt the controller and proceeds with a soft connect. Fixes: 9f8a67b65a49 ("usb: dwc3: gadget: fix gadget suspend/resume") Cc: stable <stable@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://lore.kernel.org/r/20250528100315.2162699-1-khtsai@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: cdnsp: do not disable slot for disabled slotPeter Chen
It doesn't need to do it, and the related command event returns 'Slot Not Enabled Error' status. Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@kernel.org> Suggested-by: Hongliang Yang <hongliang.yang@cixtech.com> Reviewed-by: Fugang Duan <fugang.duan@cixtech.com> Signed-off-by: Peter Chen <peter.chen@cixtech.com> Link: https://lore.kernel.org/r/20250619013413.35817-1-peter.chen@cixtech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19eth: fbnic: avoid double free when failing to DMA-map FW msgJakub Kicinski
The semantics are that caller of fbnic_mbx_map_msg() retains the ownership of the message on error. All existing callers dutifully free the page. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250616195510.225819-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-19mfd: Fix building without CONFIG_OFArnd Bergmann
Using the of_fwnode_handle() means that local 'node' variables are unused whenever CONFIG_OF is disabled for compile testing: drivers/mfd/88pm860x-core.c: In function 'device_irq_init': drivers/mfd/88pm860x-core.c:576:29: error: unused variable 'node' [-Werror=unused-variable] 576 | struct device_node *node = i2c->dev.of_node; | ^~~~ drivers/mfd/max8925-core.c: In function 'max8925_irq_init': drivers/mfd/max8925-core.c:659:29: error: unused variable 'node' [-Werror=unused-variable] 659 | struct device_node *node = chip->dev->of_node; | ^~~~ drivers/mfd/twl4030-irq.c: In function 'twl4030_init_irq': drivers/mfd/twl4030-irq.c:679:46: error: unused variable 'node' [-Werror=unused-variable] 679 | struct device_node *node = dev->of_node; | ^~~~ Replace these with the corresponding dev_fwnode() lookups that keep the code simpler in addition to avoiding the warnings. Fixes: e3d44f11da04 ("mfd: Switch to irq_domain_create_*()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20250520154106.2019525-1-arnd@kernel.org Signed-off-by: Lee Jones <lee@kernel.org>
2025-06-19Merge tag 'amd-drm-fixes-6.16-2025-06-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.16-2025-06-18: amdgpu: - DP tunneling fix - LTTPR fix - DSC fix - DML2.x ABGR16161616 fix - RMCM fix - Backlight fixes - GFX11 kicker support - SDMA reset fixes - VCN 5.0.1 fix - Reset fix - Misc small fixes amdkfd: - SDMA reset fix - Fix race in GWS scheduling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250618203115.1533451-1-alexander.deucher@amd.com
2025-06-19Merge tag 'drm-intel-fixes-2025-06-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix MIPI vtotal programming off by one on Broxton - Fix PMU code for GCOV and AutoFDO enabled build Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://lore.kernel.org/r/aFJfykDpUwtmpilE@jlahtine-mobl
2025-06-18Octeontx2-pf: Fix Backpresure configurationHariprasad Kelam
NIX block can receive packets from multiple links such as MAC (RPM), LBK and CPT. ----------------- RPM --| NIX | ----------------- | | LBK Each link supports multiple channels for example RPM link supports 16 channels. In case of link oversubsribe, NIX will assert backpressure on receive channels. The previous patch considered a single channel per link, resulting in backpressure not being enabled on the remaining channels Fixes: a7ef63dbd588 ("octeontx2-af: Disable backpressure between CPT and NIX") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250617063403.3582210-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-06-17 (ice, e1000e) For ice: Krishna Kumar modifies aRFS match criteria to correctly identify matching filters. Grzegorz fixes a memory leak in eswitch legacy mode. For e1000e: Vitaly sets clock frequency on some Nahum systems which may misreport their value. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: set fixed clock frequency indication for Nahum 11 and Nahum 13 ice: fix eswitch code memory leak in reset scenario net: ice: Perform accurate aRFS flow match ==================== Link: https://patch.msgid.link/20250617172444.1419560-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18net: ftgmac100: select FIXED_PHYHeiner Kallweit
Depending on e.g. DT configuration this driver uses a fixed link. So we shouldn't rely on the user to enable FIXED_PHY, select it in Kconfig instead. We may end up with a non-functional driver otherwise. Fixes: 38561ded50d0 ("net: ftgmac100: support fixed link") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/476bb33b-5584-40f0-826a-7294980f2895@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18ACPICA: Refuse to evaluate a method if arguments are missingRafael J. Wysocki
As reported in [1], a platform firmware update that increased the number of method parameters and forgot to update a least one of its callers, caused ACPICA to crash due to use-after-free. Since this a result of a clear AML issue that arguably cannot be fixed up by the interpreter (it cannot produce missing data out of thin air), address it by making ACPICA refuse to evaluate a method if the caller attempts to pass fewer arguments than expected to it. Closes: https://github.com/acpica/acpica/issues/1027 [1] Reported-by: Peter Williams <peter@newton.cx> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hansg@kernel.org> Tested-by: Hans de Goede <hansg@kernel.org> # Dell XPS 9640 with BIOS 1.12.0 Link: https://patch.msgid.link/5909446.DvuYhMxLoT@rjwysocki.net Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-18EDAC/igen6: Fix NULL pointer dereferenceQiuxu Zhuo
A kernel panic was reported with the following kernel log: EDAC igen6: Expected 2 mcs, but only 1 detected. BUG: unable to handle page fault for address: 000000000000d570 ... Hardware name: Notebook V54x_6x_TU/V54x_6x_TU, BIOS Dasharo (coreboot+UEFI) v0.9.0 07/17/2024 RIP: e030:ecclog_handler+0x7e/0xf0 [igen6_edac] ... igen6_probe+0x2a0/0x343 [igen6_edac] ... igen6_init+0xc5/0xff0 [igen6_edac] ... This issue occurred because one memory controller was disabled by the BIOS but the igen6_edac driver still checked all the memory controllers, including this absent one, to identify the source of the error. Accessing the null MMIO for the absent memory controller resulted in the oops above. Fix this issue by reverting the configuration structure to non-const and updating the field 'res_cfg->num_imc' to reflect the number of detected memory controllers. Fixes: 20e190b1c1fd ("EDAC/igen6: Skip absent memory controllers") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Closes: https://lore.kernel.org/all/aFFN7RlXkaK_loQb@mail-itl/ Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Link: https://lore.kernel.org/r/20250618162307.1523736-1-qiuxu.zhuo@intel.com
2025-06-18Merge tag 'ata-6.16-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Niklas Cassel: - Force PIO for ATAPI devices on VT6415/VT6330 as the controller locks up on ATAPI DMA (Tasos) - Fix ACPI PATA cable type detection such that the controller is not forced down to a slow transfer mode (Tasos) - Fix build error on 32-bit UML (Johannes) - Fix a PCI region leak in the pata_macio driver so that the driver no longer fails to load after rmmod (Philipp) - Use correct DMI BIOS build date for ThinkPad W541 quirk (me) - Disallow LPM for ASUSPRO-D840SA motherboard as this board interestingly enough gets graphical corruptions on the iGPU when LPM is enabled (me) - Disallow LPM for Asus B550-F motherboard as this board will get command timeouts on ports 5 and 6, yet LPM with the same drive works fine on all other ports (Mikko) * tag 'ata-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: ahci: Disallow LPM for Asus B550-F motherboard ata: ahci: Disallow LPM for ASUSPRO-D840SA motherboard ata: ahci: Use correct BIOS build date for ThinkPad W541 quirk ata: pata_macio: Fix PCI region leak ata: pata_cs5536: fix build on 32-bit UML ata: libata-acpi: Do not assume 40 wire cable if no devices are enabled ata: pata_via: Force PIO for ATAPI devices on VT6415/VT6330
2025-06-18drm/amdgpu/sdma5.2: init engine reset mutexAlex Deucher
Missing the mutex init. Fixes: 47454f2dc0bf ("drm/amdgpu: Register the new sdma function pointers for sdma_v5_2") Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ea685ff30a51a25dd9be90786933ada49a088f65)
2025-06-18drm/amdkfd: Fix race in GWS queue schedulingJay Cornwall
q->gws is not updated atomically with qpd->mapped_gws_queue. If a runlist is created between pqm_set_gws and update_queue it will contain a queue which uses GWS in a process with no GWS allocated. This will result in a scheduler hang. Use q->properties.is_gws which is changed while holding the DQM lock. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b98370220eb3110e82248e3354e16a489a492cfb) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu/sdma5: init engine reset mutexAlex Deucher
Missing the mutex init. Fixes: e56d4bf57fab ("drm/amdgpu/: drm/amdgpu: Register the new sdma function pointers for sdma_v5_0") Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3f4caf092f02f0de169c6122639af481c7edc8f9)
2025-06-18drm/amdgpu: switch job hw_fence to amdgpu_fenceAlex Deucher
Use the amdgpu fence container so we can store additional data in the fence. This also fixes the start_time handling for MCBP since we were casting the fence to an amdgpu_fence and it wasn't. Fixes: 3f4c175d62d8 ("drm/amdgpu: MCBP based on DRM scheduler (v9)") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bf1cd14f9e2e1fdf981eed273ddd595863f5288c) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: Fix SDMA UTC_L1 handling during start/stop sequencesJesse Zhang
This commit makes two key fixes to SDMA v4.4.2 handling: 1. disable UTC_L1 in sdma_cntl register when stopping SDMA engines by reading the current value before modifying UTC_L1_ENABLE bit. 2. Ensure UTC_L1_ENABLE is consistently managed by: - Adding the missing register write when enabling UTC_L1 during start - Keeping UTC_L1 enabled by default as per hardware requirements v2: Correct SDMA_CNTL setting (Philip) Suggested-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 375bf564654e85a7b1b0657b191645b3edca1bda) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: Release reset locks during failuresLijo Lazar
Make sure to release reset domain lock in case of failures. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Fixes: 11bb33766f66 ("drm/amdgpu: refactor amdgpu_device_gpu_recover") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1ab11a82681eb33a66f423216cb063e7f40c6f85)
2025-06-18drm/amd/display: Check dce_hwseq before dereferencing itAlex Hung
[WHAT] hws was checked for null earlier in dce110_blank_stream, indicating hws can be null, and should be checked whenever it is used. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 79db43611ff61280b6de58ce1305e0b2ecf675ad) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: VCN v5_0_1 to prevent FW checking RB during DPG pauseSonny Jiang
Add a protection to ensure programming are all complete prior VCPU starting. This is a WA for an unintended VCPU running. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c29521b529fa5e225feaf709d863a636ca0cbbfa) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operationsJesse Zhang
Simplify SDMA v4_4_2 queue reset and stop operations by: 1. Removing GET_INST(SDMA0) conversion for ring->me 2. Using the logical instance ID (ring->me) directly 3. Maintaining consistent behavior with other SDMA queue operations This change aligns with the existing queue handling logic where ring->me already represents the correct instance identifier. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3bab282dfe25dff7a55add432f56898505a6cc6c) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: Fix SDMA engine reset with logical instance IDJesse Zhang
This commit makes the following improvements to SDMA engine reset handling: 1. Clarifies in the function documentation that instance_id refers to a logical ID 2. Adds conversion from logical to physical instance ID before performing reset using GET_INST(SDMA0, instance_id) macro 3. Improves error messaging to indicate when a logical instance reset fails 4. Adds better code organization with blank lines for readability The change ensures proper SDMA engine reset by using the correct physical instance ID while maintaining the logical ID interface for callers. V2: Remove harvest_config check and convert directly to physical instance (Lijo) Suggested-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5efa6217c239ed1ceec0f0414f9b6f6927387dfc) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: add kicker fws loading for gfx11/smu13/psp13Frank Min
1. Add kicker firmwares loading for gfx11/smu13/psp13 2. Register additional MODULE_FIRMWARE entries for kicker fws - gc_11_0_0_rlc_kicker.bin - gc_11_0_0_imu_kicker.bin - psp_13_0_0_sos_kicker.bin - psp_13_0_0_ta_kicker.bin - smu_13_0_0_kicker.bin Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fb5ec2174d70a8989bc207d257db90ffeca3b163) Cc: stable@vger.kernel.org
2025-06-18drm/amdgpu: Add kicker device detectionFrank Min
1. add kicker device list 2. add kicker device checking helper function Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 09aa2b408f4ab689c3541d22b0968de0392ee406) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Export full brightness range to userspaceMario Limonciello
[WHY] Userspace currently is offered a range from 0-0xFF but the PWM is programmed from 0-0xFFFF. This can be limiting to some software that wants to apply greater granularity. [HOW] Convert internally to firmware values only when mapping custom brightness curves because these are in 0-0xFF range. Advertise full PWM range to userspace. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8dbd72cb790058ce52279af38a43c2b302fdd3e5) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Only read ACPI backlight caps onceMario Limonciello
[WHY] Backlight caps are read already in amdgpu_dm_update_backlight_caps(). They may be updated by update_connector_ext_caps(). Reading again when registering backlight device may cause wrong values to be used. [HOW] Use backlight caps already registered to the dm. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 148144f6d2f14b02eaaa39b86bbe023cbff350bd) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Fix RMCM programming seq errorsYihan Zhu
[WHY & HOW] Fix RMCM programming sequence errors and mapping issues to pass the RMCM test. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 11baa4975025033547f45f5894087a0dda6efbb8) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Fix mpv playback corruption on westonAlex Hung
[WHAT] Severe video playback corruption is observed in the following setup: weston 14.0.90 (built from source) + mpv v0.40.0 with command: mpv bbb_sunflower_1080p_60fps_normal.mp4 --vo=gpu [HOW] ABGR16161616 needs to be included in dml2/2.1 translation. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d023de809f85307ca819a9dbbceee6ae1f50e2ad) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Add more checks for DSC / HUBP ONO guaranteesNicholas Kazlauskas
[WHY] For non-zero DSC instances it's possible that the HUBP domain required to drive it for sequential ONO ASICs isn't met, potentially causing the logic to the tile to enter an undefined state leading to a system hang. [HOW] Add more checks to ensure that the HUBP domain matching the DSC instance is appropriately powered. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Duncan Ma <duncan.ma@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit da63df07112e5a9857a8d2aaa04255c4206754ec) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Get LTTPR IEEE OUI/Device ID From Closest LTTPR To HostMichael Strauss
[WHY] These fields are read for the explicit purpose of detecting embedded LTTPRs (i.e. between host ASIC and the user-facing port), and thus need to calculate the correct DPCD address offset based on LTTPR count to target the appropriate LTTPR's DPCD register space with these queries. [HOW] Cascaded LTTPRs in a link each snoop and increment LTTPR count when queried via DPCD read, so an LTTPR embedded in a source device (e.g. USB4 port on a laptop) will always be addressible using the max LTTPR count seen by the host. Therefore we simply need to use a recently added helper function to calculate the correct DPCD address to target potentially embedded LTTPRs based on the received LTTPR count. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 791897f5c77a2a65d0e500be4743af2ddf6eb061) Cc: stable@vger.kernel.org
2025-06-18drm/amd/display: Add dc cap for dp tunnelingPeichen Huang
[WHAT] 1. add dc cap for dp tunneling 2. add function to get index of host router Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Cruise Hung <cruise.hung@amd.com> Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 29e178d13979cf6fdb42c5fe2dfec2da2306c4ad) Cc: stable@vger.kernel.org
2025-06-18drm/amdkfd: move SDMA queue reset capability check to node_showJesse.Zhang
Relocate the per-SDMA queue reset capability check from kfd_topology_set_capabilities() to node_show() to ensure we read the latest value of sdma.supported_reset after all IP blocks are initialized. Fixes: ceb7114c961b ("drm/amdkfd: flag per-sdma queue reset supported to user space") Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e17df7b086cf908cedd919f448da9e00028419bb)
2025-06-18PCI: pciehp: Ignore belated Presence Detect Changed caused by DPCLukas Wunner
Commit c3be50f7547c ("PCI: pciehp: Ignore Presence Detect Changed caused by DPC") sought to ignore Presence Detect Changed events occurring as a side effect of Downstream Port Containment. The commit awaits recovery from DPC and then clears events which occurred in the meantime. However if the first event seen after DPC is Data Link Layer State Changed, only that event is cleared and not Presence Detect Changed. The object of the commit is thus defeated. That's because pciehp_ist() computes the events to clear based on the local "events" variable instead of "ctrl->pending_events". The former contains the events that had occurred when pciehp_ist() was entered, whereas the latter also contains events that have accumulated while awaiting DPC recovery. In practice, the order of PDC and DLLSC events is arbitrary and the delay in-between can be several milliseconds. So change the logic to always clear PDC events, even if they come after an initial DLLSC event. Fixes: c3be50f7547c ("PCI: pciehp: Ignore Presence Detect Changed caused by DPC") Reported-by: Lương Việt Hoàng <tcm4095@gmail.com> Reported-by: Joel Mathew Thomas <proxy0@tutamail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219765#c165 Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Lương Việt Hoàng <tcm4095@gmail.com> Tested-by: Joel Mathew Thomas <proxy0@tutamail.com> Link: https://patch.msgid.link/d9c4286a16253af7e93eaf12e076e3ef3546367a.1750257164.git.lukas@wunner.de
2025-06-18pinctrl: nuvoton: Fix boot on ma35dx platformsMiquel Raynal
As part of a wider cleanup trying to get rid of OF specific APIs, an incorrect (and partially unrelated) cleanup was introduced. The goal was to replace a device_for_each_chil_node() loop including an additional condition inside by a macro doing both the loop and the check on a single line. The snippet: device_for_each_child_node(dev, child) if (fwnode_property_present(child, "gpio-controller")) continue; was replaced by: for_each_gpiochip_node(dev, child) which expands into: device_for_each_child_node(dev, child) for_each_if(fwnode_property_present(child, "gpio-controller")) This change is actually doing the opposite of what was initially expected, breaking the probe of this driver, breaking at the same time the whole boot of Nuvoton platforms (no more console, the kernel WARN()). Revert these two changes to roll back to the correct behavior. Fixes: 693c9ecd8326 ("pinctrl: nuvoton: Reduce use of OF-specific APIs") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/20250613181312.1269794-1-miquel.raynal@bootlin.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>