summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-02-25net: enetc: correct the xdp_tx statisticsWei Fang
The 'xdp_tx' is used to count the number of XDP_TX frames sent, not the number of Tx BDs. Fixes: 7ed2bc80074e ("net: enetc: add support for XDP_TX") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250224111251.1061098-4-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: enetc: keep track of correct Tx BD count in enetc_map_tx_tso_buffs()Wei Fang
When creating a TSO header, if the skb is VLAN tagged, the extended BD will be used and the 'count' should be increased by 2 instead of 1. Otherwise, when an error occurs, less tx_swbd will be freed than the actual number. Fixes: fb8629e2cbfc ("net: enetc: add support for software TSO") Cc: stable@vger.kernel.org Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://patch.msgid.link/20250224111251.1061098-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: enetc: fix the off-by-one issue in enetc_map_tx_buffs()Wei Fang
When a DMA mapping error occurs while processing skb frags, it will free one more tx_swbd than expected, so fix this off-by-one issue. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Cc: stable@vger.kernel.org Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Suggested-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://patch.msgid.link/20250224111251.1061098-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ixgbe: fix media cage present detection for E610 devicePiotr Kwapulinski
The commit 23c0e5a16bcc ("ixgbe: Add link management support for E610 device") introduced incorrect checking of media cage presence for E610 device. Fix it. Fixes: 23c0e5a16bcc ("ixgbe: Add link management support for E610 device") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/e7d73b32-f12a-49d1-8b60-1ef83359ec13@stanley.mountain/ Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Bharath R <bharath.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25iavf: fix circular lock dependency with netdev_lockJacob Keller
We have recently seen reports of lockdep circular lock dependency warnings when loading the iAVF driver: [ 1504.790308] ====================================================== [ 1504.790309] WARNING: possible circular locking dependency detected [ 1504.790310] 6.13.0 #net_next_rt.c2933b2befe2.el9 Not tainted [ 1504.790311] ------------------------------------------------------ [ 1504.790312] kworker/u128:0/13566 is trying to acquire lock: [ 1504.790313] ffff97d0e4738f18 (&dev->lock){+.+.}-{4:4}, at: register_netdevice+0x52c/0x710 [ 1504.790320] [ 1504.790320] but task is already holding lock: [ 1504.790321] ffff97d0e47392e8 (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_finish_config+0x37/0x240 [iavf] [ 1504.790330] [ 1504.790330] which lock already depends on the new lock. [ 1504.790330] [ 1504.790330] [ 1504.790330] the existing dependency chain (in reverse order) is: [ 1504.790331] [ 1504.790331] -> #1 (&adapter->crit_lock){+.+.}-{4:4}: [ 1504.790333] __lock_acquire+0x52d/0xbb0 [ 1504.790337] lock_acquire+0xd9/0x330 [ 1504.790338] mutex_lock_nested+0x4b/0xb0 [ 1504.790341] iavf_finish_config+0x37/0x240 [iavf] [ 1504.790347] process_one_work+0x248/0x6d0 [ 1504.790350] worker_thread+0x18d/0x330 [ 1504.790352] kthread+0x10e/0x250 [ 1504.790354] ret_from_fork+0x30/0x50 [ 1504.790357] ret_from_fork_asm+0x1a/0x30 [ 1504.790361] [ 1504.790361] -> #0 (&dev->lock){+.+.}-{4:4}: [ 1504.790364] check_prev_add+0xf1/0xce0 [ 1504.790366] validate_chain+0x46a/0x570 [ 1504.790368] __lock_acquire+0x52d/0xbb0 [ 1504.790370] lock_acquire+0xd9/0x330 [ 1504.790371] mutex_lock_nested+0x4b/0xb0 [ 1504.790372] register_netdevice+0x52c/0x710 [ 1504.790374] iavf_finish_config+0xfa/0x240 [iavf] [ 1504.790379] process_one_work+0x248/0x6d0 [ 1504.790381] worker_thread+0x18d/0x330 [ 1504.790383] kthread+0x10e/0x250 [ 1504.790385] ret_from_fork+0x30/0x50 [ 1504.790387] ret_from_fork_asm+0x1a/0x30 [ 1504.790389] [ 1504.790389] other info that might help us debug this: [ 1504.790389] [ 1504.790389] Possible unsafe locking scenario: [ 1504.790389] [ 1504.790390] CPU0 CPU1 [ 1504.790391] ---- ---- [ 1504.790391] lock(&adapter->crit_lock); [ 1504.790393] lock(&dev->lock); [ 1504.790394] lock(&adapter->crit_lock); [ 1504.790395] lock(&dev->lock); [ 1504.790397] [ 1504.790397] *** DEADLOCK *** This appears to be caused by the change in commit 5fda3f35349b ("net: make netdev_lock() protect netdev->reg_state"), which added a netdev_lock() in register_netdevice. The iAVF driver calls register_netdevice() from iavf_finish_config(), as a final stage of its state machine post-probe. It currently takes the RTNL lock, then the netdev lock, and then the device critical lock. This pattern is used throughout the driver. Thus there is a strong dependency that the crit_lock should not be acquired before the net device lock. The change to register_netdevice creates an ABBA lock order violation because the iAVF driver is holding the crit_lock while calling register_netdevice, which then takes the netdev_lock. It seems likely that future refactors could result in netdev APIs which hold the netdev_lock while calling into the driver. This means that we should not re-order the locks so that netdev_lock is acquired after the device private crit_lock. Instead, notice that we already release the netdev_lock prior to calling the register_netdevice. This flow only happens during the early driver initialization as we transition through the __IAVF_STARTUP, __IAVF_INIT_VERSION_CHECK, __IAVF_INIT_GET_RESOURCES, etc. Analyzing the places where we take crit_lock in the driver there are two sources: a) several of the work queue tasks including adminq_task, watchdog_task, reset_task, and the finish_config task. b) various callbacks which ultimately stem back to .ndo operations or ethtool operations. The latter cannot be triggered until after the netdevice registration is completed successfully. The iAVF driver uses alloc_ordered_workqueue, which is an unbound workqueue that has a max limit of 1, and thus guarantees that only a single work item on the queue is executing at any given time, so none of the other work threads could be executing due to the ordered workqueue guarantees. The iavf_finish_config() function also does not do anything else after register_netdevice, unless it fails. It seems unlikely that the driver private crit_lock is protecting anything that register_netdevice() itself touches. Thus, to fix this ABBA lock violation, lets simply release the adapter->crit_lock as well as netdev_lock prior to calling register_netdevice(). We do still keep holding the RTNL lock as required by the function. If we do fail to register the netdevice, then we re-acquire the adapter critical lock to finish the transition back to __IAVF_INIT_CONFIG_ADAPTER. This ensures every call where both netdev_lock and the adapter->crit_lock are acquired under the same ordering. Fixes: afc664987ab3 ("eth: iavf: extend the netdev_lock usage") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ice: Avoid setting default Rx VSI twice in switchdev setupMarcin Szycik
As part of switchdev environment setup, uplink VSI is configured as default for both Tx and Rx. Default Rx VSI is also used by promiscuous mode. If promisc mode is enabled and an attempt to enter switchdev mode is made, the setup will fail because Rx VSI is already configured as default (rule exists). Reproducer: devlink dev eswitch set $PF1_PCI mode switchdev ip l s $PF1 up ip l s $PF1 promisc on echo 1 > /sys/class/net/$PF1/device/sriov_numvfs In switchdev setup, use ice_set_dflt_vsi() instead of plain ice_cfg_dflt_vsi(), which avoids repeating setting default VSI for Rx if it's already configured. Fixes: 50d62022f455 ("ice: default Tx rule instead of to queue") Reported-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/PH0PR11MB50138B635F2E5CEB7075325D961F2@PH0PR11MB5013.namprd11.prod.outlook.com Reviewed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ice: Fix deinitializing VF in error pathMarcin Szycik
If ice_ena_vfs() fails after calling ice_create_vf_entries(), it frees all VFs without removing them from snapshot PF-VF mailbox list, leading to list corruption. Reproducer: devlink dev eswitch set $PF1_PCI mode switchdev ip l s $PF1 up ip l s $PF1 promisc on sleep 1 echo 1 > /sys/class/net/$PF1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/$PF1/device/sriov_numvfs Trace (minimized): list_add corruption. next->prev should be prev (ffff8882e241c6f0), but was 0000000000000000. (next=ffff888455da1330). kernel BUG at lib/list_debug.c:29! RIP: 0010:__list_add_valid_or_report+0xa6/0x100 ice_mbx_init_vf_info+0xa7/0x180 [ice] ice_initialize_vf_entry+0x1fa/0x250 [ice] ice_sriov_configure+0x8d7/0x1520 [ice] ? __percpu_ref_switch_mode+0x1b1/0x5d0 ? __pfx_ice_sriov_configure+0x10/0x10 [ice] Sometimes a KASAN report can be seen instead with a similar stack trace: BUG: KASAN: use-after-free in __list_add_valid_or_report+0xf1/0x100 VFs are added to this list in ice_mbx_init_vf_info(), but only removed in ice_free_vfs(). Move the removing to ice_free_vf_entries(), which is also being called in other places where VFs are being removed (including ice_free_vfs() itself). Fixes: 8cd8a6b17d27 ("ice: move VF overflow message count into struct ice_mbx_vf_info") Reported-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/PH0PR11MB50138B635F2E5CEB7075325D961F2@PH0PR11MB5013.namprd11.prod.outlook.com Reviewed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: ethernet: ti: am65-cpsw: select PAGE_POOLSascha Hauer
am65-cpsw uses page_pool_dev_alloc_pages(), thus needs PAGE_POOL selected to avoid linker errors. This is missing since the driver started to use page_pool helpers in 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250224-net-am654-nuss-kconfig-v2-1-c124f4915c92@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: - Fix a mlx5 malfunction if the UMR QP gets an error - Return the correct port number to userspace for a mlx5 DCT - Don't cause a UMR QP error if DMABUF teardown races with invalidation - Fix a WARN splat when unregisering so mlx5 device memory MR types - Use the correct alignment for the mana doorbell so that two processes do not share the same physical page on non-4k page systems - MAINTAINERS updates for MANA - Retry failed HNS FW commands because some can take a long time - Cast void * handle to the correct type in bnxt to fix corruption - Avoid a NULL pointer crash in bnxt_re - Fix skipped ib_device_unregsiter() for bnxt_re due to some earlier rework - Correctly detect if the bnxt supports extended statistics - Fix refcount leak in mlx5 odp introduced by a previous fix - Map the FW result for the port rate to the userspace values properly in mlx5, returns correct values for newer 800G ports - Don't wrongly destroy counters objects that were not automatically created during mlx5 bind qp - Set page size/shift members of kernel owned SRQs to fix a crash in nvme target * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Fix the page details for the srq created by kernel consumers RDMA/mlx5: Fix bind QP error cleanup flow RDMA/mlx5: Fix AH static rate parsing RDMA/mlx5: Fix implicit ODP hang on parent deregistration RDMA/bnxt_re: Fix the statistics for Gen P7 VF RDMA/bnxt_re: Fix issue in the unload path RDMA/bnxt_re: Add sanity checks on rdev validity RDMA/bnxt_re: Fix an issue in bnxt_re_async_notifier RDMA/hns: Fix mbox timing out by adding retry mechanism MAINTAINERS: update maintainer for Microsoft MANA RDMA driver RDMA/mana_ib: Allocate PAGE aligned doorbell index RDMA/mlx5: Fix a WARN during dereg_mr for DM type RDMA/mlx5: Fix a race for DMABUF MR which can lead to CQE with error IB/mlx5: Set and get correct qp_num for a DCT QP RDMA/mlx5: Fix the recovery flow of the UMR QP
2025-02-25Input: i8042 - swap old quirk combination with new quirk for more devicesWerner Sembach
Some older Clevo barebones have problems like no or laggy keyboard after resume or boot which can be fixed with the SERIO_QUIRK_FORCENORESTORE quirk. We could not activly retest these devices because we no longer have them in our archive, but based on the other old Clevo barebones we tested where the new quirk had the same or a better behaviour I think it would be good to apply it on these too. Cc: stable@vger.kernel.org Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://lore.kernel.org/r/20250221230137.70292-4-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25Input: i8042 - swap old quirk combination with new quirk for several devicesWerner Sembach
Some older Clevo barebones have problems like no or laggy keyboard after resume or boot which can be fixed with the SERIO_QUIRK_FORCENORESTORE quirk. While the old quirk combination did not show negative effects on these devices specifically, the new quirk works just as well and seems more stable in general. Cc: stable@vger.kernel.org Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://lore.kernel.org/r/20250221230137.70292-3-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25Input: i8042 - add required quirks for missing old boardnamesWerner Sembach
Some older Clevo barebones have problems like no or laggy keyboard after resume or boot which can be fixed with the SERIO_QUIRK_FORCENORESTORE quirk. The PB71RD keyboard is sometimes laggy after resume and the PC70DR, PB51RF, P640RE, and PCX0DX_GN20 keyboard is sometimes unresponsive after resume. This quirk fixes that. Cc: stable@vger.kernel.org Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://lore.kernel.org/r/20250221230137.70292-2-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25Input: i8042 - swap old quirk combination with new quirk for NHxxRZQWerner Sembach
Some older Clevo barebones have problems like no or laggy keyboard after resume or boot which can be fixed with the SERIO_QUIRK_FORCENORESTORE quirk. With the old i8042 quirks this devices keyboard is sometimes laggy after resume. With the new quirk this issue doesn't happen. Cc: stable@vger.kernel.org Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://lore.kernel.org/r/20250221230137.70292-1-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25drm/amdgpu: init return value in amdgpu_ttm_clear_bufferPierre-Eric Pelloux-Prayer
Otherwise an uninitialized value can be returned if amdgpu_res_cleared returns true for all regions. Possibly closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3812 Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7c62aacc3b452f73a1284198c81551035fac6d71) Cc: stable@vger.kernel.org
2025-02-25drm/amd/display: Fix HPD after gpu resetRoman Li
[Why] DC is not using amdgpu_irq_get/put to manage the HPD interrupt refcounts. So when amdgpu_irq_gpu_reset_resume_helper() reprograms all of the IRQs, HPD gets disabled. [How] Use amdgpu_irq_get/put() for HPD init/fini in DM in order to sync refcounts Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f3dde2ff7fcaacd77884502e8f572f2328e9c745) Cc: stable@vger.kernel.org
2025-02-25drm/amd/display: add a quirk to enable eDP0 on DP1Yilin Chen
[why] some board designs have eDP0 connected to DP1, need a way to enable support_edp0_on_dp1 flag, otherwise edp related features cannot work [how] do a dmi check during dm initialization to identify systems that require support_edp0_on_dp1. Optimize quirk table with callback functions to set quirk entries, retrieve_dmi_info can set quirks according to quirk entries Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Yilin Chen <Yilin.Chen@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f6d17270d18a6a6753fff046330483d43f8405e4) Cc: stable@vger.kernel.org
2025-02-25drm/amd/display: Disable PSR-SU on eDP panelsTom Chung
[Why] PSR-SU may cause some glitching randomly on several panels. [How] Temporarily disable the PSR-SU and fallback to PSR1 for all eDP panels. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3388 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6deeefb820d0efb0b36753622fb982d03b37b3ad) Cc: stable@vger.kernel.org
2025-02-25drm/amd/display: restore edid reading from a given i2c adapterMelissa Wen
When switching to drm_edid, we slightly changed how to get edid by removing the possibility of getting them from dc_link when in aux transaction mode. As MST doesn't initialize the connector with `drm_connector_init_with_ddc()`, restore the original behavior to avoid functional changes. v2: - Fix build warning of unchecked dereference (kernel test bot) CC: Alex Hung <alex.hung@amd.com> CC: Mario Limonciello <mario.limonciello@amd.com> CC: Roman Li <Roman.Li@amd.com> CC: Aurabindo Pillai <Aurabindo.Pillai@amd.com> Fixes: 48edb2a4256e ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 81262b1656feb3813e3d917ab78824df6831e69e)
2025-02-25drm/amdgpu/mes: keep enforce isolation up to dateAlex Deucher
Re-send the mes message on resume to make sure the mes state is up to date. Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES") Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun Liu <shaoyun.liu@amd.com> Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 27b791514789844e80da990c456c2465325e0851)
2025-02-25drm/amdgpu/gfx: only call mes for enforce isolation if supportedAlex Deucher
This should not be called on chips without MES so check if MES is enabled and if the cleaner shader is supported. Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES") Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun Liu <shaoyun.liu@amd.com> Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> (cherry picked from commit 80513e389765c8f9543b26d8fa4bbdf0e59ff8bc)
2025-02-25drm/amdgpu: disable BAR resize on Dell G5 SEAlex Deucher
There was a quirk added to add a workaround for a Sapphire RX 5600 XT Pulse that didn't allow BAR resizing. However, the quirk caused a regression with runtime pm on Dell laptops using those chips, rather than narrowing the scope of the resizing quirk, add a quirk to prevent amdgpu from resizing the BAR on those Dell platforms unless runtime pm is disabled. v2: update commit message, add runpm check Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707 Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5235053f443cef4210606e5fb71f99b915a9723d) Cc: stable@vger.kernel.org
2025-02-25drm/amdkfd: Preserve cp_hqd_pq_control on update_mqdDavid Yat Sin
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE. Preserve bitfields that do not need to be modified as they contain flags to track queue states that are used by CP FW. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8150827990b709ab5a40c46c30d21b7f7b9e9440) Cc: stable@vger.kernel.org
2025-02-25amdgpu/pm/legacy: fix suspend/resume issueschr[]
resume and irq handler happily races in set_power_state() * amdgpu_legacy_dpm_compute_clocks() needs lock * protect irq work handler * fix dpm_enabled usage v2: fix clang build, integrate Lijo's comments (Alex) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2524 Fixes: 3712e7a49459 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> # on Oland PRO Signed-off-by: chr[] <chris@rudorff.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ee3dc9e204d271c9c7a8d4d38a0bce4745d33e71) Cc: stable@vger.kernel.org
2025-02-25nvme-ioctl: fix leaked requests on mapping errorKeith Busch
All the callers assume nvme_map_user_request() frees the request on a failure. This wasn't happening on invalid metadata or io_uring command flags, so we've been leaking those requests. Fixes: 23fd22e55b767b ("nvme: wire up fixed buffer support for nvme passthrough") Fixes: 7c2fd76048e95d ("nvme: fix metadata handling in nvme-passthrough") Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-25pinctrl: spacemit: enable config optionYixun Lan
Pinctrl is an essential driver for SpacemiT's SoC, The uart driver requires it, same as sd card driver, so let's enable it by default for this SoC. The CONFIG_PINCTRL_SPACEMIT_K1 isn't enabled when using 'make defconfig' to select kernel configuration options. This result in a broken uart driver where fail at probe() stage due to no pins found. Fixes: a83c29e1d145 ("pinctrl: spacemit: add support for SpacemiT K1 SoC") Reported-by: Alex Elder <elder@kernel.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Alex Elder <elder@riscstar.com> Signed-off-by: Yixun Lan <dlan@gentoo.org> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/20250218-k1-pinctrl-option-v3-1-36e031e0da1b@gentoo.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-02-25pinctrl: nuvoton: npcm8xx: Add NULL check in npcm8xx_gpio_fwCharles Han
devm_kasprintf() calls can return null pointers on failure. But the return values were not checked in npcm8xx_gpio_fw(). Add NULL check in npcm8xx_gpio_fw(), to handle kernel NULL pointer dereference error. Fixes: acf4884a5717 ("pinctrl: nuvoton: add NPCM8XX pinctrl and GPIO driver") Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://lore.kernel.org/20250212100532.4317-1-hanchunchao@inspur.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-02-25pinctrl: bcm281xx: Fix incorrect regmap max_registers valueArtur Weber
The max_registers value does not take into consideration the stride; currently, it's set to the number of the last pin, but this does not accurately represent the final register. Fix this by multiplying the current value by 4. Fixes: 54b1aa5a5b16 ("ARM: pinctrl: Add Broadcom Capri pinctrl driver") Signed-off-by: Artur Weber <aweber.kernel@gmail.com> Link: https://lore.kernel.org/20250207-bcm21664-pinctrl-v1-2-e7cfac9b2d3b@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-02-25Input: xpad - rename QH controller to Legion Go SAntheas Kapenekakis
The QH controller is actually the controller of the Legion Go S, with the manufacturer string wch.cn and product name Legion Go S in its USB descriptor. A cursory lookup of the VID reveals the same. Therefore, rename the xpad entries to match. Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Link: https://lore.kernel.org/r/20250222170010.188761-4-lkml@antheas.dev Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25Input: xpad - add support for TECNO Pocket GoAntheas Kapenekakis
TECNO Pocket Go is a kickstarter handheld by manufacturer TECNO Mobile. It poses a unique feature: it does not have a display. Instead, the handheld is essentially a pc in a controller. As customary, it has an xpad endpoint, a keyboard endpoint, and a vendor endpoint for its vendor software. Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Link: https://lore.kernel.org/r/20250222170010.188761-3-lkml@antheas.dev Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25Input: xpad - add support for ZOTAC Gaming ZoneAntheas Kapenekakis
ZOTAC Gaming Zone is ZOTAC's 2024 handheld release. As it is common with these handhelds, it uses a hybrid USB device with an xpad endpoint, a keyboard endpoint, and a vendor-specific endpoint for RGB control et al. Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Link: https://lore.kernel.org/r/20250222170010.188761-2-lkml@antheas.dev Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-25drm/vkms: Round fixp2int conversion in lerp_u16Harry Wentland
fixp2int always rounds down, fixp2int_ceil rounds up. We need the new fixp2int_round. Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241220043410.416867-3-alex.hung@amd.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-02-25firmware: cs_dsp: Remove async regmap writesRichard Fitzgerald
Change calls to async regmap write functions to use the normal blocking writes so that the cs35l56 driver can use spi_bus_lock() to gain exclusive access to the SPI bus. As this is part of a fix, it makes only the minimal change to swap the functions to the blocking equivalents. There's no need to risk reworking the buffer allocation logic that is now partially redundant. The async writes are a 12-year-old workaround for inefficiency of synchronous writes in the SPI subsystem. The SPI subsystem has since been changed to avoid the overheads, so this workaround should not be necessary. The cs35l56 driver needs to use spi_bus_lock() prevent bus activity while it is soft-resetting the cs35l56. But spi_bus_lock() is incompatible with spi_async() calls, which will fail with -EBUSY. Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file") Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20250225131843.113752-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-25drm/xe/oa: Allow oa_exponent value of 0Umesh Nerlige Ramappa
OA exponent value of 0 is a valid value for periodic reports. Allow user to pass 0 for the OA sampling interval since it gets converted to 2 gt clock ticks. v2: Update the check in xe_oa_stream_init as well (Ashutosh) v3: Fix mi-rpc failure by setting default exponent to -1 (CI) v4: Add the Fixes tag Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221213352.1712932-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 30341f0b8ea71725cc4ab2c43e3a3b749892fc92) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-25thermal: gov_power_allocator: Update total_weight on bind and cdev updatesYu-Che Cheng
params->total_weight is not initialized during bind and not updated when the bound cdev changes. The cooling device weight will not be used due to the uninitialized total_weight, until an update via sysfs is triggered. The bound cdevs are updated during thermal zone registration, where each cooling device will be bound to the thermal zone one by one, but power_allocator_bind() can be called without an additional cdev update when manually changing the policy of a thermal zone via sysfs. Add a new function to handle weight update logic, including updating total_weight, and call it when bind, weight changes, and cdev updates to ensure total_weight is always correct. Fixes: a3cd6db4cc2e ("thermal: gov_power_allocator: Support new update callback of weights") Signed-off-by: Yu-Che Cheng <giver@chromium.org> Link: https://patch.msgid.link/20250222-fix-power-allocator-weight-v2-1-a94de86b685a@chromium.org [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-02-25thermal/of: Fix cdev lookup in thermal_of_should_bind()Rafael J. Wysocki
Since thermal_of_should_bind() terminates the loop after processing the first child found in cooling-maps, it will never match more than one cdev to a given trip point which is incorrect, as there may be cooling-maps associating one trip point with multiple cooling devices. Address this by letting the loop continue until either all children have been processed or a matching one has been found. To avoid adding conditionals or goto statements, put the loop in question into a separate function and make that function return right away after finding a matching cooling-maps entry. Fixes: 94c6110b0b13 ("thermal/of: Use the .should_bind() thermal zone callback") Link: https://lore.kernel.org/linux-pm/20250219-fix-thermal-of-v1-1-de36e7a590c4@chromium.org/ Reported-by: Yu-Che Cheng <giver@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Yu-Che Cheng <giver@chromium.org> Tested-by: Yu-Che Cheng <giver@chromium.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2788228.mvXUDI8C0e@rjwysocki.net
2025-02-24Input: goodix-berlin - fix vddio regulator referencesLuca Weiss
As per dt-bindings the property is called vddio-supply, so use the correct name in the driver instead of iovdd. The datasheet also calls the supply 'VDDIO'. Fixes: 44362279bdd4 ("Input: add core support for Goodix Berlin Touchscreen IC") Cc: stable@vger.kernel.org Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250103-goodix-berlin-fixes-v1-2-b014737b08b2@fairphone.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-24Input: goodix-berlin - fix comment referencing wrong regulatorLuca Weiss
In the statement above AVDD gets enabled, and not IOVDD, so fix this copy-paste mistake. Fixes: 44362279bdd4 ("Input: add core support for Goodix Berlin Touchscreen IC") Reported-by: Jens Reidel <adrian@travitia.xyz> Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250103-goodix-berlin-fixes-v1-1-b014737b08b2@fairphone.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-24Input: imagis - add support for imagis IST3038HAndras Sebok
Add support for imagis IST3038H, which seems mostly compatible with IST3038C except that it reports a different chip ID value. Tested on samsung,j5y17lte. Signed-off-by: Andras Sebok <sebokandris2009@gmail.com> Link: https://lore.kernel.org/r/20250224090354.102903-2-sebokandris2009@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-02-24hwmon: (peci/dimmtemp) Do not provide fake thresholds dataPaul Fertser
When an Icelake or Sapphire Rapids CPU isn't providing the maximum and critical thresholds for particular DIMM the driver should return an error to the userspace instead of giving it stale (best case) or wrong (the structure contains all zeros after kzalloc() call) data. The issue can be reproduced by binding the peci driver while the host is fully booted and idle, this makes PECI interaction unreliable enough. Fixes: 73bc1b885dae ("hwmon: peci: Add dimmtemp driver") Fixes: 621995b6d795 ("hwmon: (peci/dimmtemp) Add Sapphire Rapids support") Cc: stable@vger.kernel.org Signed-off-by: Paul Fertser <fercerpav@gmail.com> Reviewed-by: Iwona Winiarska <iwona.winiarska@intel.com> Link: https://lore.kernel.org/r/20250123122003.6010-1-fercerpav@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-02-24Merge tag 'for-6.14/dm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - dm-vdo: add missing spin_lock_init - dm-integrity: divide-by-zero fix - dm-integrity: do not report unused entries in the table line * tag 'for-6.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm vdo: add missing spin_lock_init dm-integrity: Do not emit journal configuration in DM table for Inline mode dm-integrity: Avoid divide by zero in table status in Inline mode
2025-02-24nvme-pci: skip CMB blocks incompatible with PCI P2P DMAIcenowy Zheng
The PCI P2PDMA code will register the CMB block to the memory hot-plugging subsystem, which have an alignment requirement. Memory blocks that do not satisfy this alignment requirement (usually 2MB) will lead to a WARNING from memory hotplugging. Verify the CMB block's address and size against the alignment and only try to send CMB blocks compatible with it to prevent this warning. Tested on Intel DC D4502 SSD, which has a 512K CMB block that is too small for memory hotplugging (thus PCI P2PDMA). Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-24nvme-pci: clean up CMBMSC when registering CMB failsIcenowy Zheng
CMB decoding should get disabled when the CMB block isn't successfully registered to P2P DMA subsystem. Clean up the CMBMSC register in this error handling codepath to disable CMB decoding (and CMBLOC/CMBSZ registers). Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-24nvme-tcp: fix possible UAF in nvme_tcp_pollSagi Grimberg
nvme_tcp_poll() may race with the send path error handler because it may complete the request while it is actively being polled for completion, resulting in a UAF panic [1]: We should make sure to stop polling when we see an error when trying to read from the socket. Hence make sure to propagate the error so that the block layer breaks the polling cycle. [1]: -- [35665.692310] nvme nvme2: failed to send request -13 [35665.702265] nvme nvme2: unsupported pdu type (3) [35665.702272] BUG: kernel NULL pointer dereference, address: 0000000000000000 [35665.702542] nvme nvme2: queue 1 receive failed: -22 [35665.703209] #PF: supervisor write access in kernel mode [35665.703213] #PF: error_code(0x0002) - not-present page [35665.703214] PGD 8000003801cce067 P4D 8000003801cce067 PUD 37e6f79067 PMD 0 [35665.703220] Oops: 0002 [#1] SMP PTI [35665.703658] nvme nvme2: starting error recovery [35665.705809] Hardware name: Inspur aaabbb/YZMB-00882-104, BIOS 4.1.26 09/22/2022 [35665.705812] Workqueue: kblockd blk_mq_requeue_work [35665.709172] RIP: 0010:_raw_spin_lock+0xc/0x30 [35665.715788] Call Trace: [35665.716201] <TASK> [35665.716613] ? show_trace_log_lvl+0x1c1/0x2d9 [35665.717049] ? show_trace_log_lvl+0x1c1/0x2d9 [35665.717457] ? blk_mq_request_bypass_insert+0x2c/0xb0 [35665.717950] ? __die_body.cold+0x8/0xd [35665.718361] ? page_fault_oops+0xac/0x140 [35665.718749] ? blk_mq_start_request+0x30/0xf0 [35665.719144] ? nvme_tcp_queue_rq+0xc7/0x170 [nvme_tcp] [35665.719547] ? exc_page_fault+0x62/0x130 [35665.719938] ? asm_exc_page_fault+0x22/0x30 [35665.720333] ? _raw_spin_lock+0xc/0x30 [35665.720723] blk_mq_request_bypass_insert+0x2c/0xb0 [35665.721101] blk_mq_requeue_work+0xa5/0x180 [35665.721451] process_one_work+0x1e8/0x390 [35665.721809] worker_thread+0x53/0x3d0 [35665.722159] ? process_one_work+0x390/0x390 [35665.722501] kthread+0x124/0x150 [35665.722849] ? set_kthread_struct+0x50/0x50 [35665.723182] ret_from_fork+0x1f/0x30 Reported-by: Zhang Guanghui <zhang.guanghui@cestc.cn> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-24selftests: drv-net: test XDP, HDS auto and the ioctl pathJakub Kicinski
Test XDP and HDS interaction. While at it add a test for using the IOCTL, as that turned out to be the real culprit. Testing bnxt: # NETIF=eth0 ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable # SKIP disabling of HDS not supported by the device ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0 and netdevsim: # ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0 Netdevsim needs a sane default for tx/rx ring size. ethtool 6.11 is needed for the --disable-netlink option. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250221025141.1132944-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-24drm/xe/userptr: fix EFAULT handlingMatthew Auld
Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal. However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT. This explains an internal user report of hitting: [ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe] [ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738690] Call Trace: [ 191.738692] <TASK> [ 191.738694] ? show_regs+0x69/0x80 [ 191.738698] ? __warn+0x93/0x1a0 [ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738759] ? report_bug+0x18f/0x1a0 [ 191.738764] ? handle_bug+0x63/0xa0 [ 191.738767] ? exc_invalid_op+0x19/0x70 [ 191.738770] ? asm_exc_invalid_op+0x1b/0x20 [ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738834] ? ret_from_fork_asm+0x1a/0x30 [ 191.738849] bind_op_prepare+0x105/0x7b0 [xe] [ 191.738906] ? dma_resv_reserve_fences+0x301/0x380 [ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe] [ 191.738966] ? kmemleak_alloc+0x4b/0x80 [ 191.738973] ops_execute+0x188/0x9d0 [xe] [ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe] [ 191.739098] ? trace_hardirqs_on+0x4d/0x60 [ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe] Followed by NPD, when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This is confirmed to fix the user report. v2 (MattB): - Move earlier. v3 (MattB): - Update the commit message to make it clear that this indeed fixes the issue. Fixes: 521db22a1d70 ("drm/xe: Invalidate userptr VMA on page pin fault") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-5-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-24drm/xe/userptr: restore invalidation list on errorMatthew Auld
On error restore anything still on the pin_list back to the invalidation list on error. For the actual pin, so long as the vma is tracked on either list it should get picked up on the next pin, however it looks possible for the vma to get nuked but still be present on this per vm pin_list leading to corruption. An alternative might be then to instead just remove the link when destroying the vma. v2: - Also add some asserts. - Keep the overzealous locking so that we are consistent with the docs; updating the docs and related bits will be done as a follow up. Fixes: ed2bdf3b264d ("drm/xe/vm: Subclass userptr vmas") Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-4-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 4e37e928928b730de9aa9a2f5dc853feeebc1742) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-24dm vdo: add missing spin_lock_initKen Raeburn
Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-02-24net: dsa: rtl8366rb: Fix compilation problemLinus Walleij
When the kernel is compiled without LED framework support the rtl8366rb fails to build like this: rtl8366rb.o: in function `rtl8366rb_setup_led': rtl8366rb.c:953:(.text.unlikely.rtl8366rb_setup_led+0xe8): undefined reference to `led_init_default_state_get' rtl8366rb.c:980:(.text.unlikely.rtl8366rb_setup_led+0x240): undefined reference to `devm_led_classdev_register_ext' As this is constantly coming up in different randconfig builds, bite the bullet and create a separate file for the offending code, split out a header with all stuff needed both in the core driver and the leds code. Add a new bool Kconfig option for the LED compile target, such that it depends on LEDS_CLASS=y || LEDS_CLASS=RTL8366RB which make LED support always available when LEDS_CLASS is compiled into the kernel and enforce that if the LEDS_CLASS is a module, then the RTL8366RB driver needs to be a module as well so that modprobe can resolve the dependencies. Fixes: 32d617005475 ("net: dsa: realtek: add LED drivers for rtl8366rb") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502070525.xMUImayb-lkp@intel.com/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-23Merge tag 'i2c-for-6.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fix from Wolfram Sang: "Revert one cleanup which turned out to eat too much stack space" * tag 'i2c-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: core: Allocate temporary client dynamically
2025-02-23Merge tag 'edac_urgent_for_v6.14_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Have qcom_edac use the correct interrupt enable register to configure the RAS interrupt lines * tag 'edac_urgent_for_v6.14_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/qcom: Correct interrupt enable register configuration