git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2025-05-16	drm/amd/display: Add GPINT retries to ips_query_residency_info	Ovidiu Bunea
	[why & how] GPINTs can timeout without returning any data. Since this path is only for testing purposes, it should retry several times to ensure data is collected. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: Modify DCN401 DMUB reset & halt sequence	Dillon Varone
	[WHY&HOW] If DMCUB is already disabled or reset, no need to send the halt command again. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: add support for 2nd sharpening range	Samson Tam
	[Why & How] Add support for 2nd sharpening range for cases where we want override existing DCN sharpening range. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: Fix the typo in dcn401 Hubp block	Nevenko Stupar
	[Why & How] Fix the typo for hubp_clear_tiling, currently calls hubp2_clear_tiling for dcn401 instead of intended hubp401_clear_tiling. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: Skip backend validation for virtual monitors	Chiawen Huang
	[Why&How] Virtual monitors are now being validated during set_mode. Virtual monitors should not undergo backend validation, as the backend is intended only for physical monitors. Virtual sinks have no real backend part information and should be excluded from this validation. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Chiawen Huang <chiawen.huang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: Move mcache allocation programming from DML to resource	Karthi Kandasamy
	[Why] mcache allocation programming is not part of DML's core responsibilities. Keeping this logic in DML leads to poor separation of concerns and complicates maintenance. [How] Refactored code to move mcache parameter preparation and mcache ID assignment into the resource file. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: Support external tunneling feature	Cruise Hung
	[Why & How] The original code only supports the tunneling for embedded one. To support external tunneling feature, it needs to check Tunneling_Support bit register. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Cruise Hung <Cruise.Hung@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: init local variable to fix format errors	Yihan Zhu
	[WHY & HOW] Uninitialized local variables will cause format checker complain about them. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/display: Extend dc_plane_get_status with flags	Tomasz Siemek
	[WHY] dc_plane_get_status may be used for reading other plane properties in the future. [HOW] Provide API for choosing plane properties to read. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Signed-off-by: Tomasz Siemek <Tomasz.Siemek@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amdgpu: fix use-after-unlock in eviction fence destroy	Arvind Yadav
	The eviction fence destroy path incorrectly calls dma_fence_put() on evf_mgr->ev_fence after releasing the ev_fence_lock. This introduces a potential use-after-unlock or race because another thread concurrently modifies evf_mgr->ev_fence. Fix this by grabbing a local reference to evf_mgr->ev_fence under the lock and using that for dma_fence_put() after waiting. Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amdgpu: Allow NPS2-CPX combination for VFs	Lijo Lazar
	CPX partition mode is compatible with NPS2 on aquavanjaram VFs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amdgpu/mmsch: Add MMSCH v5_0 support for sriov	fanhuang
	These structures are basically ported from MMSCH v4_0 The structures are the same as v4_0 except for the init header Signed-off-by: fanhuang <FangSheng.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amdgpu: Use compatible NPS mode info	Lijo Lazar
	Compatible NPS modes for a partition mode are exposed through xcp_config interface. To determine if a compute partition mode is valid, check if the current NPS mode is part of compatible NPS modes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amd/pm: Move SMUv13.0.12 function declarations	Lijo Lazar
	Move them to SMUv13.0.6 header file as they are used only in SMU v13.0.6. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amdgpu: Add pldm version reporting	Asad Kamal
	Add pldm version reporting through sysfs node Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/amdkfd: Support chain runlists of XNACK+/XNACK-	Amber Lin
	If the MEC firmware supports chaining runlists of XNACK+/XNACK- processes, set SQ_CONFIG1 chicken bit and SET_RESOURCES bit 28. When the MEC/HWS supports it, KFD checks the XNACK+/XNACK- processes mix happens or not. If it does, enter over-subscription. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	drm/radeon/cik: Clean up doorbells	Dr. David Alan Gilbert
	Free doorbells in the error paths of cik_init and in cik_fini. Build tested only. Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-16	Merge tag 'drm-xe-fixes-2025-05-15-1' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Core Changes: - Add timeslicing and allocation restriction for SVM Driver Changes: - Fix shrinker debugfs name - Add HW workaround to Xe2 - Fix SVM when mixing GPU and CPU atomics - Fix per client engine utilization due to active contexts not saving timestamp with lite restore enabled. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/qil4scyn6ucnt43u5ju64bi7r7n5r36k4pz5rsh2maz7isle6g@lac3jpsjrrvs
2025-05-16	Merge tag 'drm-misc-fixes-2025-05-15' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dma-buf: - Avoid memory reordering in fence handling ivpu: - Fix buffer size in debugfs code meson: - Avoid integer overflow in mode-clock calculations panel-mipi-dbi: - Fix output with drm_client_setup_with_fourcc() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250515125534.GA41174@linux.fritz.box
2025-05-16	Merge tag 'drm-intel-next-fixes-2025-05-15' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/i915/kernel into drm-next - Stop writing ALPM registers when PSR is enabled - Use the correct connector while computing the link BPP limit on MST Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://lore.kernel.org/r/aCWlWk5rTE7TH1pN@jlahtine-mobl
2025-05-16	Merge tag 'mediatek-drm-next-20250515' of ↵	Dave Airlie
	https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next Mediatek DRM Next - 20250515 1. Prepare for support MT8195/88 HDMIv2 and DDCv2 2. DPI: Cleanups and add support for more formats 3. Cleanups and sanitization 4. Replace custom compare_dev with component_compare_of Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250514233647.15907-1-chunkuang.hu@kernel.org
2025-05-15	gpu: drm: nova: select AUXILIARY_BUS instead of depending on it	Alexandre Courbot
	CONFIG_AUXILIARY_BUS cannot be enabled explicitly, and unless we select it we have no way to include it (and thus to enable NOVA_DRM) unless another driver happens to do it for us. Fixes: cdeaeb9dd762 ("drm: nova-drm: add initial driver skeleton") Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250515-aux_bus-v2-3-47c70f96ae9b@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-05-15	gpu: nova-core: select AUXILIARY_BUS instead of depending on it	Alexandre Courbot
	CONFIG_AUXILIARY_BUS cannot be enabled explicitly, and unless we select it we have no way to include it (and thus to enable NOVA_CORE) unless another driver happens to do it for us. Fixes: e041d81a0377 ("gpu: nova-core: register auxiliary device for nova-drm") Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250515-aux_bus-v2-2-47c70f96ae9b@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-05-15	Merge tag 'drm-misc-next-2025-05-12' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.16-rc1: Once more, with async flips. UAPI Changes: - Add IN_FORMATS_ASYNC property, use in i915. Cross-subsystem Changes: - Remove some unused debug code in dma-buf. Core Changes: Driver Changes: - Add Novatek NT37801 panel. - Allow submitting empty commands in amdxdna. - Convert cirrus to use managed request_all_regions. - Move Sitronix from tiny to their own place. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/23ded62c-6a62-4195-9c08-4dfb81eafd72@linux.intel.com
2025-05-14	drm/mediatek: Replace custom compare_dev with component_compare_of	Tang Dongxing
	Remove the custom device comparison function compare_dev and replace it with the existing kernel helper component_compare_of Signed-off-by: Tang Dongxing <tang.dongxing@zte.com.cn> Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250403155419406T5YhIJKId1FWor70EWWHG@zte.com.cn/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-05-14	drm/mediatek: mtk_drm_drv: Unbind secondary mmsys components on err	AngeloGioacchino Del Regno
	When calling component_bind_all(), if a component that is included in the list fails, all of those that have been successfully bound will be unbound, but this driver has two components lists for two actual devices, as in, each mmsys instance has its own components list. In case mmsys0 (or actually vdosys0) is able to bind all of its components, but the secondary one fails, all of the components of the first are kept bound, while the ones of mmsys1/vdosys1 are correctly cleaned up. This is not right because, in case of a failure, the components are re-bound for all of the mmsys/vdosys instances without caring about the ones that were previously left in a bound state. Fix that by calling component_unbind_all() on all of the previous component masters that succeeded binding all subdevices when any of the other masters errors out. Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support") Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250403104741.71045-4-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-05-14	drm/mediatek: Fix kobject put for component sub-drivers	AngeloGioacchino Del Regno
	In function mtk_drm_get_all_drm_priv(), this driver is incrementing the refcount for the sub-drivers of mediatek-drm with a call to device_find_child() when taking a reference to all of those child devices. When the component bind fails multiple times this results in a refcount_t overflow, as the reference count is never decremented: fix that by adding a call to put_device() for all of the mmsys devices in a loop, in error cases of mtk_drm_bind() and in the mtk_drm_unbind() callback. Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support") Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250403104741.71045-3-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-05-14	drm/mediatek: mtk_drm_drv: Fix kobject put for mtk_mutex device ptr	AngeloGioacchino Del Regno
	This driver is taking a kobject for mtk_mutex only once per mmsys device for each drm-mediatek driver instance, differently from the behavior with other components, but it is decrementing the kobj's refcount in a loop and once per mmsys: this is not right and will result in a refcount_t underflow warning when mediatek-drm returns multiple probe deferrals in one boot (or when manually bound and unbound). Besides that, the refcount for mutex_dev was not decremented for error cases in mtk_drm_bind(), causing another refcount_t warning but this time for overflow, when the failure happens not during driver bind but during component bind. In order to fix one of the reasons why this is happening, remove the put_device(xx->mutex_dev) loop from the mtk_drm_kms_init()'s put_mutex_dev label (and drop the label) and add a single call to correctly free the single incremented refcount of mutex_dev to the mtk_drm_unbind() function to fix the refcount_t underflow. Moreover, add the same call to the error cases in mtk_drm_bind() to fix the refcount_t overflow. Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support") Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250403104741.71045-2-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-05-14	drm/xe: Add WA BB to capture active context utilization	Umesh Nerlige Ramappa
	Context Timestamp (CTX_TIMESTAMP) in the LRC accumulates the run ticks of the context, but only gets updated when the context switches out. In order to check how long a context has been active before it switches out, two things are required: (1) Determine if the context is running: To do so, we program the WA BB to set an initial value for CTX_TIMESTAMP in the LRC. The value chosen is 1 since 0 is the initial value when the LRC is initialized. During a query, we just check for this value to determine if the context is active. If the context switched out, it would overwrite this location with the actual CTX_TIMESTAMP MMIO value. Note that WA BB runs as the last part of the context restore, so reusing this LRC location will not clobber anything. (2) Calculate the time that the context has been active for: The CTX_TIMESTAMP ticks only when the context is active. If a context is active, we just use the CTX_TIMESTAMP MMIO as the new value of utilization. While doing so, we need to read the CTX_TIMESTAMP MMIO for the specific engine instance. Since we do not know which instance the context is running on until it is scheduled, we also read the ENGINE_ID MMIO in the WA BB and store it in the PPHSWP. Using the above 2 instructions in a WA BB, capture active context utilization. v2: (Matt Brost) - This breaks TDR, fix it by saving the CTX_TIMESTAMP register "drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value" - Drop tile from LRC if using gt "drm/xe: Save the gt pointer in LRC and drop the tile" v3: - Remove helpers for bb_per_ctx_ptr (Matt) - Add define for context active value (Matt) - Use 64 bit CTX TIMESTAMP for platforms that support it. For platforms that don't, live with the rare race. (Matt, Lucas) - Convert engine id to hwe and get the MMIO value (Lucas) - Correct commit message on when WA BB runs (Lucas) v4: - s/GRAPHICS_VER(...)/xe->info.has_64bit_timestamp/ (Matt) - Drop support for active utilization on a VF (CI failure) - In xe_lrc_init ensure the lrc value is 0 to begin with (CI regression) v5: - Minor checkpatch fix - Squash into previous commit and make TDR use 32-bit time - Update code comment to match commit msg Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4532 Cc: <stable@vger.kernel.org> # v6.13+ Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-8-umesh.nerlige.ramappa@intel.com (cherry picked from commit 82b98cadb01f63cdb159e596ec06866d00f8e8c7) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/xe: Save the gt pointer in lrc and drop the tile	Umesh Nerlige Ramappa
	Save the gt pointer in the lrc so that it can used for gt based helpers. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-7-umesh.nerlige.ramappa@intel.com (cherry picked from commit 741d3ef8b8b88fab2729ca89de1180e49bc9cef0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value	Umesh Nerlige Ramappa
	For determining actual job execution time, save the current value of the CTX_TIMESTAMP register rather than the value saved in LRC since the current register value is the closest to the start time of the job. v2: Define MI_STORE_REGISTER_MEM to fix compile error v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas) Fixes: 65921374c48f ("drm/xe: Emit ctx timestamp copy in ring ops") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-6-umesh.nerlige.ramappa@intel.com (cherry picked from commit 38b14233e5deff51db8faec287b4acd227152246) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/xe: Timeslice GPU on atomic SVM fault	Matthew Brost
	Ensure GPU can make forward progress on an atomic SVM GPU fault by giving the GPU a timeslice of 5ms v2: - Reduce timeslice to 5ms - Double timeslice on retry - Split out GPU SVM changes into independent patch v5: - Double timeslice in a few more places Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-5-matthew.brost@intel.com (cherry picked from commit a5d8d3be1dea8154edbbea481081469627665659) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/gpusvm: Add timeslicing support to GPU SVM	Matthew Brost
	Add timeslicing support to GPU SVM which will guarantee the GPU a minimum execution time on piece of physical memory before migration back to CPU. Intended to implement strict migration policies which require memory to be in a certain placement for correct execution. Required for shared CPU and GPU atomics on certain devices. Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-4-matthew.brost@intel.com (cherry picked from commit 8dc1812b5b3a42311d28eb385eed88e2053ad3cb) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/xe: Strict migration policy for atomic SVM faults	Matthew Brost
	Mixing GPU and CPU atomics does not work unless a strict migration policy of GPU atomics must be device memory. Enforce a policy of must be in VRAM with a retry loop of 3 attempts, if retry loop fails abort fault. Removing always_migrate_to_vram modparam as we now have real migration policy. v2: - Only retry migration on atomics - Drop alway migrate modparam v3: - Only set vram_only on DGFX (Himal) - Bail on get_pages failure if vram_only and retry count exceeded (Himal) - s/vram_only/devmem_only - Update xe_svm_range_is_valid to accept devmem_only argument v4: - Fix logic bug get_pages failure v5: - Fix commit message (Himal) - Mention removing always_migrate_to_vram in commit message (Lucas) - Fix xe_svm_range_is_valid to check for devmem pages - Bail on devmem_only && !migrate_devmem (Thomas) v6: - Add READ_ONCE barriers for opportunistic checks (Thomas) - Pair READ_ONCE with WRITE_ONCE (Thomas) v7: - Adjust comments (Thomas) Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration") Cc: stable@vger.kernel.org Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-3-matthew.brost@intel.com (cherry picked from commit a9ac0fa455b050d03e3032501368048fb284d318) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/gpusvm: Introduce devmem_only flag for allocation	Himal Prasad Ghimiray
	This commit adds a new flag, devmem_only, to the drm_gpusvm structure. The purpose of this flag is to ensure that the get_pages function allocates memory exclusively from the device's memory. If the allocation from device memory fails, the function will return an -EFAULT error. Required for shared CPU and GPU atomics on certain devices. v3: - s/vram_only/devmem_only/ Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-2-matthew.brost@intel.com (cherry picked from commit 8a9b978ebd47df9e0694c34748c2d6fa0c31eb4d) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/xe/xe2hpg: Add Wa_22021007897	Aradhya Bhatia
	Add Wa_22021007897 for the Xe2_HPG (graphics version: 20.01) IP. It is a permanent workaround, and applicable on all the steppings. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://lore.kernel.org/r/20250512065004.2576-1-aradhya.bhatia@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit e5c13e2c505b73a8667ef9a0fd5cbd4227e483e6) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14	drm/amdgpu: read back register after written for VCN v4.0.5	David (Ming Qiang) Wu
	On VCN v4.0.5 there is a race condition where the WPTR is not updated after starting from idle when doorbell is used. Adding register read-back after written at function end is to ensure all register writes are done before they can be used. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528 Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 07c9db090b86e5211188e1b351303fbc673378cf) Cc: stable@vger.kernel.org
2025-05-14	Revert "drm/amd/display: Hardware cursor changes color when switched to ↵	Melissa Wen
	software cursor" This reverts commit 272e6aab14bbf98d7a06b2b1cd6308a02d4a10a1. Applying degamma curve to the cursor by default breaks Linux userspace expectation. On Linux, AMD display manager enables cursor degamma ROM just for implict sRGB on HW versions where degamma is split into two blocks: degamma ROM for pre-defined TFs and `gamma correction` for user/custom curves, and degamma ROM settings doesn't apply to cursor plane. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1513 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803 Reported-by: Michel Dänzer <michel.daenzer@mailbox.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4144 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f6a305d4748801a6c799ae9375b2ecff3aed094b) Cc: stable@vger.kernel.org
2025-05-14	drm/amdgpu: add debugfs for spirom IFWI dump	Shiwu Zhang
	Expose the debugfs file node for user space to dump the IFWI image on spirom. For one transaction between PSP and host, it will read out the images on both active and inactive partitions so a buffer with two times the size of maximum IFWI image (currently 16MByte) is needed. v2: move the vbios gfl macros to the common header and rename the bo triplet struct to spirom_bo for this specific usage (Hawking) v3: return directly the result of last command execution (Lijo) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-14	drm/amdgpu: fix userq resource double freed	Prike Liang
	As the userq resource was already freed at the drm_release early phase, it should avoid freeing userq resource again at the later kms postclose callback. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-14	drm/amdgpu: Fix circular locking in userq creation	Jesse.Zhang
	A circular locking dependency was detected between the global `adev->userq_mutex` and per-file `userq_mgr->userq_mutex` when creating user queues. The issue occurs because: 1. `amdgpu_userq_suspend()` and `amdgpu_userq_resume` take `adev->userq_mutex` first, then `userq_mgr->userq_mutex` 2. While `amdgpu_userq_create()` takes them in reverse order This patch resolves the issue by: 1. Moving the `adev->userq_mutex` lock earlier in `amdgpu_userq_create()` to cover the `amdgpu_userq_ensure_ev_fence()` call 2. Releasing it after we're done with both queue creation and the scheduling halt check v2: remove unused adev->userq_mutex lock (Prike) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-14	drm/amdgpu: read back register after written for VCN v4.0.5	David (Ming Qiang) Wu
	On VCN v4.0.5 there is a race condition where the WPTR is not updated after starting from idle when doorbell is used. Adding register read-back after written at function end is to ensure all register writes are done before they can be used. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528 Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-14	Revert "drm/amd/display: Hardware cursor changes color when switched to ↵	Melissa Wen
	software cursor" This reverts commit 272e6aab14bbf98d7a06b2b1cd6308a02d4a10a1. Applying degamma curve to the cursor by default breaks Linux userspace expectation. On Linux, AMD display manager enables cursor degamma ROM just for implict sRGB on HW versions where degamma is split into two blocks: degamma ROM for pre-defined TFs and `gamma correction` for user/custom curves, and degamma ROM settings doesn't apply to cursor plane. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1513 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803 Reported-by: Michel Dänzer <michel.daenzer@mailbox.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4144 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-14	drm/i915/alpm: Stop writing ALPM registers when PSR is enabled	Jouni Högander
	Currently we are seeing these on PTL: xe 0000:00:02.0: [drm] ERROR Timeout waiting for DDI BUF A to get active These seem to be caused by writing ALPM registers while Panel Replay is enabled. Fix this by writing ALPM registers only when Panel Replay is about to be enabled. v4: improve comment on intel_psr_panel_replay_enable_sink call v3: enable/disable ALPM from PSR code Fixes: 172757acd6f6 ("drm/i915/lobf: Add lobf enablement in post plane update") Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250513054814.3702977-3-jouni.hogander@intel.com (cherry picked from commit a8eb102ce0944a9de2a62aa9d195861b7f26668a) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-05-14	drm/i915/alpm: Make intel_alpm_enable_sink available for PSR	Jouni Högander
	We want to enable sink ALPM from PSR code. Make intel_alpm_enable_sink available for PSR. v2: do not add kerneldoc comments Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Link: https://lore.kernel.org/r/20250513054814.3702977-2-jouni.hogander@intel.com (cherry picked from commit 2d278488761f0b5be651a3db41e615a964123d6c) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-05-13	drm/xe: Fix the gem shrinker name	Thomas Hellström
	The xe buffer object shrinker name is visible in the <debugfs>/shrinker directory and most if not all other shinkers follow a naming convention that looks like <subsystem>-<driver>_<objects>:<unique> Follow the same convention for xe, changing the name to drm-xe_gem:<unique>. Other shrinkers typically use the device node for <unique> but since drm drivers typically don't have a single unique device- node, instead use the unique name in the drm device. Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250508112931.3347-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 243bf99e2fe75edf8df1711c1377b6fc020b806c) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-13	drm/amd/display: Avoid flooding unnecessary info messages	Wayne Lin
	It's expected that we'll encounter temporary exceptions during aux transactions. Adjust logging from drm_info to drm_dbg_dp to prevent flooding with unnecessary log messages. Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250513032026.838036-1-Wayne.Lin@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9a9c3e1fe5256da14a0a307dff0478f90c55fc8c) Cc: stable@vger.kernel.org
2025-05-13	drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp	Melissa Wen
	Similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check for pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null pointer dereference on dcn20_update_dchubp_dpp. This is the same function hooked for update_dchubp_dpp in dcn401, with the same issue. Fix possible null pointer deference on dcn401_program_pipe too. Fixes: 63ab80d9ac0a ("drm/amd/display: DML2.1 Post-Si Cleanup") Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d8d47f739752227957d8efc0cb894761bfe1d879)
2025-05-13	drm/amd/display: check stream id dml21 wrapper to get plane_id	Aurabindo Pillai
	[Why & How] Fix a false positive warning which occurs due to lack of correct checks when querying plane_id in DML21. This fixes the warning when performing a mode1 reset (cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover): [ 35.751250] WARNING: CPU: 11 PID: 326 at /tmp/amd.PHpyAl7v/amd/amdgpu/../display/dc/dml2/dml2_dc_resource_mgmt.c:91 dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751434] Modules linked in: amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec rc_core i2c_algo_bit rfcomm qrtr cmac algif_hash algif_skcipher af_alg bnep amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul polyval_clmulni polyval_generic btusb ghash_clmulni_intel sha256_ssse3 btrtl sha1_ssse3 snd_seq btintel aesni_intel btbcm btmtk snd_seq_device crypto_simd sunrpc cryptd bluetooth snd_timer ccp binfmt_misc rapl snd i2c_piix4 wmi_bmof gigabyte_wmi k10temp i2c_smbus soundcore gpio_amdpt mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igc ahci xhci_pci libahci xhci_pci_renesas video wmi [ 35.751501] CPU: 11 UID: 0 PID: 326 Comm: kworker/u64:9 Tainted: G OE 6.11.0-21-generic #21~24.04.1-Ubuntu [ 35.751504] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 35.751505] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F30 05/22/2024 [ 35.751506] Workqueue: amdgpu-reset-dev amdgpu_debugfs_reset_work [amdgpu] [ 35.751638] RIP: 0010:dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751794] Code: 6d 0c 00 00 8b 84 24 88 00 00 00 41 3b 44 9c 20 0f 84 fc 07 00 00 48 83 c3 01 48 83 fb 06 75 b3 4c 8b 64 24 68 4c 8b 6c 24 40 <0f> 0b b8 06 00 00 00 49 8b 94 24 a0 49 00 00 89 c3 83 f8 07 0f 87 [ 35.751796] RSP: 0018:ffffbfa3805d7680 EFLAGS: 00010246 [ 35.751798] RAX: 0000000000010000 RBX: 0000000000000006 RCX: 0000000000000000 [ 35.751799] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000 [ 35.751800] RBP: ffffbfa3805d78f0 R08: 0000000000000000 R09: 0000000000000000 [ 35.751801] R10: 0000000000000000 R11: 0000000000000000 R12: ffffbfa383249000 [ 35.751802] R13: ffffa0e68f280000 R14: ffffbfa383249658 R15: 0000000000000000 [ 35.751803] FS: 0000000000000000(0000) GS:ffffa0edbe580000(0000) knlGS:0000000000000000 [ 35.751804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.751805] CR2: 00005d847ef96c58 CR3: 000000041de3e000 CR4: 0000000000f50ef0 [ 35.751806] PKRU: 55555554 [ 35.751807] Call Trace: [ 35.751810] <TASK> [ 35.751816] ? show_regs+0x6c/0x80 [ 35.751820] ? __warn+0x88/0x140 [ 35.751822] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751964] ? report_bug+0x182/0x1b0 [ 35.751969] ? handle_bug+0x6e/0xb0 [ 35.751972] ? exc_invalid_op+0x18/0x80 [ 35.751974] ? asm_exc_invalid_op+0x1b/0x20 [ 35.751978] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.752117] ? math_pow+0x48/0xa0 [amdgpu] [ 35.752256] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752260] ? math_pow+0x48/0xa0 [amdgpu] [ 35.752400] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752403] ? math_pow+0x11/0xa0 [amdgpu] [ 35.752524] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752526] ? core_dcn4_mode_programming+0xe4d/0x20d0 [amdgpu] [ 35.752663] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752669] dml21_validate+0x3d4/0x980 [amdgpu] Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f8ad62c0a93e5dd94243e10f1b742232e4d6411e)
2025-05-13	drm/amd/display: fix link_set_dpms_off multi-display MST corner case	George Shen
	[Why & How] When MST config is unplugged/replugged too quickly, it can potentially result in a scenario where previous DC state has not been reset before the HPD link detection sequence begins. In this case, driver will disable the streams/link prior to re-enabling the link for link training. There is a bug in the current logic that does not account for the fact that current_state can be released and cleared prior to swapping to a new state (resulting in the pipe_ctx stream pointers to be cleared) in between disabling streams. To resolve this, cache the original streams prior to committing any stream updates. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1561782686ccc36af844d55d31b44c938dd412dc)