linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-04-08	drm/amdgpu: Few optimization and fixes for userq fence driver	Arunpravin Paneer Selvam
	Few optimization and fixes for userq fence driver. v1:(Christian): - Remove unnecessary comments. - In drm_exec_init call give num_bo_handles as last parameter it would making allocation of the array more efficient - Handle return value of __xa_store() and improve the error handling of amdgpu_userq_fence_driver_alloc(). v2:(Christian): - Revert userq_xa xarray init to XA_FLAGS_LOCK_IRQ. - move the xa_unlock before the error check of the call xa_err(__xa_store()) and moved this change to a separate patch as this is adding a missing error handling. - Removed the unnecessary comments. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Remove the MES self test	Arunpravin Paneer Selvam
	Remove MES self test as this conflicts the userqueue fence interrupts. v2:(Christian) - remove the amdgpu_mes_self_test() function and any now unused code. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: update userqueue BOs and PDs	Arvind Yadav
	This patch updates the VM_IOCTL to allow userspace to synchronize the mapping/unmapping of a BO in the page table. The major changes are: - it adds a drm_timeline object as an input parameter to the VM IOCTL. - this object is used by the kernel to sync the update of the BO in the page table during the mapping of the object. - the kernel also synchronizes the tlb flush of the page table entry of this object during the unmapping (Added in this series: https://patchwork.freedesktop.org/series/131276/ and https://patchwork.freedesktop.org/patch/584182/) - the userspace can wait on this timeline, and then the BO is ready to be consumed by the GPU. The UAPI for the same has been approved here: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 V2: - remove the eviction fence coupling V3: - added the drm timeline support instead of input/output fence (Christian) V4: - made timeline 64-bit (Christian) - bug fix (Arvind) V5: GLCTS bug fix (Arvind) V6: Rename syncobj_handle -> timeline_syncobj_out Rename point -> timeline_point_in (Marek) V7: Addressed review comments from Christian: - do not send last_update fence in case of vm_clear_freed, instead return the fence from gen_va_update_vm - move the functions to update bo_mapping to amdgpu_gem.c - do not use amdgpu_userq_update_vm anymore in userq_create() V8: Addressed review comments from Christian: - Split amdgpu_gem_update_bo_mapping function. - amdgpu_gem_va_update_vm should return stub for error. V9: Addressed review comments from Christian: - Rename the function amdgpu_gem_update_timeline_node. - amdgpu_gem_update_timeline_node should be void function. - when timeline_point is zero don't allocate a chain and call drm_syncobj_replace_fence() instead of drm_syncobj_add_point(). V11: rebase V12: Fix 32-bit holes issue in sturct drm_amdgpu_gem_va. V13: Fix the review comment by renaming timeline syncobj (Marek) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Enable userq fence interrupt support	Arunpravin Paneer Selvam
	Add support to handle the userqueue protected fence signal hardware interrupt. Create a xarray which maps the doorbell index to the fence driver address. This would help to retrieve the fence driver information when an userq fence interrupt is triggered. Firmware sends the doorbell offset value and this info is compared with the queue's mqd doorbell offset value. If they are same, we process the userq fence interrupt. v1:(Christian): - use xa_load to extract the fence driver. - move the amdgpu_userq_fence_driver_process call within the xa_lock as there is a chance that fence_drv might be freed. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Add wait IOCTL timeline syncobj support	Arunpravin Paneer Selvam
	Add user fence wait IOCTL timeline syncobj support. v2:(Christian) - handle dma_fence_wait() return value. - shorten the variable name syncobj_timeline_points a bit. - move num_points up to avoid padding issues. v3:(Christian) - Handle timeline drm_syncobj_find_fence() call error handling - Use dma_fence_unwrap_for_each() in timeline fence as there could be more than one fence. v4:(Christian) - Drop the first num_fences since fence is always included in the dma_fence_unwrap_for_each() iteration, when fence != f then fence is most likely just a container. v5: Added Alex RB to merge the kernel UAPI changes since he has already approved the amdgpu_drm.h changes. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Implement userqueue signal/wait IOCTL	Arunpravin Paneer Selvam
	This patch introduces new IOCTL for userqueue secure semaphore. The signal IOCTL called from userspace application creates a drm syncobj and array of bo GEM handles and passed in as parameter to the driver to install the fence into it. The wait IOCTL gets an array of drm syncobjs, finds the fences attached to the drm syncobjs and obtain the array of memory_address/fence_value combintion which are returned to userspace. v2: (Christian) - Install fence into GEM BO object. - Lock all BO's using the dma resv subsystem - Reorder the sequence in signal IOCTL function. - Get write pointer from the shadow wptr - use userq_fence to fetch the va/value in wait IOCTL. v3: (Christian) - Use drm_exec helper for the proper BO drm reserve and avoid BO lock/unlock issues. - fence/fence driver reference count logic for signal/wait IOCTLs. v4: (Christian) - Fixed the drm_exec calling sequence - use dma_resv_for_each_fence_unlock if BO's are not locked - Modified the fence_info array storing logic. v5: (Christian) - Keep fence_drv until wait queue execution. - Add dma_fence_wait for other fences. - Lock BO's using drm_exec as the number of fences in them could change. - Install signaled fences as well into BO/Syncobj. - Move Syncobj fence installation code after the drm_exec_prepare_array. - Directly add dma_resv_usage_rw(args->bo_flags.... - remove unnecessary dma_fence_put. v6: (Christian) - Add xarray stuff to store the fence_drv - Implement a function to iterate over the xarray and drop the fence_drv references. - Add drm_exec_until_all_locked() wrapper - Add a check that if we haven't exceeded the user allocated num_fences before adding dma_fence to the fences array. v7: (Christian) - Use memdup_user() for kmalloc_array + copy_from_user - Move the fence_drv references from the xarray into the newly created fence and drop the fence_drv references when we signal this fence. - Move this locking of BOs before the "if (!wait_info->num_fences)", this way you need this code block only once. - Merge the error handling code and the cleanup + return 0 code. - Initializing the xa should probably be done in the userq code. - Remove the userq back pointer stored in fence_drv. - Pass xarray as parameter in amdgpu_userq_walk_and_drop_fence_drv() v8: (Christian) - Move fence_drv references must come before adding the fence to the list. - Use xa_lock_irqsave_nested for nested spinlock operations. - userq_mgr should be per fpriv and not one per device. - Restructure the interrupt process code for the early exit of the loop. - The reference acquired in the syncobj fence replace code needs to be kept around. - Modify the dma_fence acquire placement in wait IOCTL. - Move USERQ_BO_WRITE flag to UAPI header file. - drop the fence drv reference after telling the hw to stop accessing it. - Add multi sync object support to userq signal IOCTL. V9: (Christian) - Store all the fence_drv ref to other drivers and not ourself. - Remove the userq fence xa implementation and replace with kvmalloc_array. v10: (Christian) - Add a comment for the userq_xa xarray - drop the if check of userq_fence->fence_drv_array - use the i variable to initialize userq_fence->fence_drv_array_count - drop the fence reference before you free the array in the error handling, otherwise it could be that some references leaked. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: screen freeze and userq driver crash	Arunpravin Paneer Selvam
	Screen freeze and userq fence driver crash while playing Xonotic v2: (Christian) - There is change that fence might signal in between testing and grabbing the lock. Hence we can move the lock above the if..else check and use the dma_fence_is_signaled_locked(). Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Add mqd support for the fence address	Arunpravin Paneer Selvam
	- Add a field in struct v11_gfx_mqd for userqueue fence address. - Assign fence gpu VA address to the userqueue mqd fence address fields. v2: Remove the mask and replace with lower_32_bits (Christian) Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Implement a new userqueue fence driver	Arunpravin Paneer Selvam
	Developed a userqueue fence driver for the userqueue process shared BO synchronization. Create a dma fence having write pointer as the seqno and allocate a seq64 memory for each user queue process and feed this memory address into the firmware/hardware, thus the firmware writes the read pointer into the given address when the process completes it execution. Compare wptr and rptr, if rptr >= wptr, signal the fences for the waiting process to consume the buffers. v2: Worked on review comments from Christian for the following modifications - Add wptr as sequence number into the fence - Add a reference count for the fence driver - Add dma_fence_put below the list_del as it might frees the userq fence. - Trim unnecessary code in interrupt handler. - Check dma fence signaled state in dma fence creation function for a potential problem of hardware completing the job processing beforehand. - Add necessary locks. - Create a list and process all the unsignaled fences. - clean up fences in destroy function. - implement .signaled callback function v3: Worked on review comments from Christian - Modify naming convention for reference counted objects - Fix fence driver reference drop issue - Drop amdgpu_userq_fence_driver_process() function return value v4: Worked on review comments from Christian - Moved fence driver allocation into amdgpu_userq_fence_driver_alloc() - Added detail doc mentioning the differences b/w two spinlocks declared. v5: Worked on review comments from Christian - Check before upcast and remove local variable - Add error handling in fence_drv alloc function. - Move rptr read fn outside of the loop and remove WARN_ON in destroy function. v6: - clear the seq64 memory in user fence driver(Christian) - fix for the wptr va bo mapping(Christian) - move the fence_drv xa entry erase code from the interrupt handler into user fence destroy function Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add kernel config for gfx-userqueue	Shashank Sharma
	This patch: - adds a kernel config option "CONFIG_DRM_AMDGPU_NAVI3X_USERQ" - moves the usequeue initialization code for all IPs under this flag - cover the core userqueue functions under this config - adds stub function for userqueue ioctl. so that the userqueue works only when the config is enabled. V9: Introduce this patch V10: Call it CONFIG_DRM_AMDGPU_NAVI3X_USERQ instead of CONFIG_DRM_AMDGPU_USERQ_GFX (Christian) V11: Add GFX in the config help description message. V12: Add depends on BROKEN for this config, remove this when the rest of the code is available. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: fix MES GFX mask	Arvind Yadav
	Current MES GFX mask prevents FW to enable oversubscription. This patch does the following: - Fixes the mask values and adds a description for the same - Removes the central mask setup and makes it IP specific, as it would be different when the number of pipes and queues are different. v2: squash in fix from Shashank Cc: Christian König <Christian.Koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: enable compute/gfx usermode queue	Shashank Sharma
	This patch does the necessary changes required to enable compute workload support using the existing usermode queues infrastructure. V9: Patch introduced V10: Add custom IP specific mqd strcuture for compute (Alex) V11: Rename drm_amdgpu_userq_mqd_compute_gfx_v11 to drm_amdgpu_userq_mqd_compute_gfx11 (Marek) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: enable SDMA usermode queues	Arvind Yadav
	This patch does necessary modifications to enable the SDMA usermode queues using the existing userqueue infrastructure. V9: introduced this patch in the series V10: use header file instead of extern (Alex) V11: rename drm_amdgpu_userq_mqd_sdma_gfx_v11 to drm_amdgpu_userq_mqd_sdma_gfx11 (Marek) Cc: Christian König <Christian.Koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: enable GFX-V11 userqueue support	Shashank Sharma
	This patch enables GFX-v11 IP support in the usermode queue base code. It typically: - adds a GFX_v11 specific MQD structure - sets IP functions to create and destroy MQDs - sets MQD objects coming from userspace V10: introduced this spearate patch for GFX V11 enabling (Alex). V11: Addressed review comments: - update the comments in GFX mqd structure informing user about using the INFO IOCTL for object sizes (Alex) - rename struct drm_amdgpu_userq_mqd_gfx_v11 to drm_amdgpu_userq_mqd_gfx11 (Marek) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: cleanup leftover queues	Shashank Sharma
	This patch adds code to cleanup any leftover userqueues which a user might have missed to destroy due to a crash or any other programming error. V7: Added Alex's R-B V8: Rebase V9: Rebase V10: Rebase V11: Rebase Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: generate doorbell index for userqueue	Shashank Sharma
	The userspace sends us the doorbell object and the relative doobell index in the object to be used for the usermode queue, but the FW expects the absolute doorbell index on the PCI BAR in the MQD. This patch adds a function to convert this relative doorbell index to absolute doorbell index. V5: Fix the db object reference leak (Christian) V6: Pin the doorbell bo in userqueue_create() function, and unpin it in userqueue destoy (Christian) V7: Added missing kfree for queue in error cases Added Alex's R-B V8: Rebase V9: Changed the function names from gfx_v11* to mes_v11* V10: Rebase V11: Rebase Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: map wptr BO into GART	Shashank Sharma
	To support oversubscription, MES FW expects WPTR BOs to be mapped into GART, before they are submitted to usermode queues. This patch adds a function for the same. V4: fix the wptr value before mapping lookup (Bas, Christian). V5: Addressed review comments from Christian: - Either pin object or allocate from GART, but not both. - All the handling must be done with the VM locks held. V7: Addressed review comments from Christian: - Do not take vm->eviction_lock - Use amdgpu_bo_gpu_offset to get the wptr_bo GPU offset V8: Rebase V9: Changed the function names from gfx_v11* to mes_v11* V10: Remove unused adev (Harish) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: map usermode queue into MES	Shashank Sharma
	This patch adds new functions to map/unmap a usermode queue into the FW, using the MES ring. As soon as this mapping is done, the queue would be considered ready to accept the workload. V1: Addressed review comments from Alex on the RFC patch series - Map/Unmap should be IP specific. V2: Addressed review comments from Christian: - Fix the wptr_mc_addr calculation (moved into another patch) Addressed review comments from Alex: - Do not add fptrs for map/unmap V3: Integration with doorbell manager V4: Rebase V5: Use gfx_v11_0 for function names (Alex) V6: Removed queue->proc/gang/fw_ctx_address variables and doing the address calculations locally to keep the queue structure GEN independent (Alex) V7: Added R-B from Alex V8: Rebase V9: Rebase V10: Rebase Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: create context space for usermode queue	Shashank Sharma
	The MES FW expects us to allocate at least one page as context space to process gang and process related context data. This patch creates a joint object for the same, and calculates GPU space offsets of these spaces. V1: Addressed review comments on RFC patch: Alex: Make this function IP specific V2: Addressed review comments from Christian - Allocate only one object for total FW space, and calculate offsets for each of these objects. V3: Integration with doorbell manager V4: Review comments: - Remove shadow from FW space list from cover letter (Alex) - Alignment of macro (Luben) V5: Merged patches 5 and 6 into this single patch Addressed review comments: - Use lower_32_bits instead of mask (Christian) - gfx_v11_0 instead of gfx_v11 in function names (Alex) - Shadow and GDS objects are now coming from userspace (Christian, Alex) V6: - Add a comment to replace amdgpu_bo_create_kernel() with amdgpu_bo_create() during fw_ctx object creation (Christian). - Move proc_ctx_gpu_addr, gang_ctx_gpu_addr and fw_ctx_gpu_addr out of generic queue structure and make it gen11 specific (Alex). V7: - Using helper function to create/destroy userqueue objects. - Removed FW object space allocation. V8: - Updating FW object address from user values. V9: - uppdated function name from gfx_v11_* to mes_v11_* V10: - making this patch independent of IP based changes, moving any GFX object related changes in GFX specific patch (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: create MES-V11 usermode queue for GFX	Shashank Sharma
	A Memory queue descriptor (MQD) of a userqueue defines it in the hw's context. As MQD format can vary between different graphics IPs, we need gfx GEN specific handlers to create MQDs. This patch: - Adds a new file which will be used for MES based userqueue functions targeting GFX and SDMA IP. - Introduces MQD handler functions for the usermode queues. V1: Worked on review comments from Alex: - Make MQD functions GEN and IP specific V2: Worked on review comments from Alex: - Reuse the existing adev->mqd[ip] for MQD creation - Formatting and arrangement of code V3: - Integration with doorbell manager V4: Review comments addressed: - Do not create a new file for userq, reuse gfx_v11_0.c (Alex) - Align name of structure members (Luben) - Don't break up the Cc tag list and the Sob tag list in commit message (Luben) V5: - No need to reserve the bo for MQD (Christian). - Some more changes to support IP specific MQD creation. V6: - Add a comment reminding us to replace the amdgpu_bo_create_kernel() calls while creating MQD object to amdgpu_bo_create() once eviction fences are ready (Christian). V7: - Re-arrange userqueue functions in adev instead of uq_mgr (Alex) - Use memdup_user instead of copy_from_user (Christian) V9: - Moved userqueue code from gfx_v11_0.c to new file mes_v11_0.c so that it can be reused for SDMA userqueues as well (Shashank, Alex) V10: Addressed review comments from Alex - Making this patch independent of IP engine(GFX/SDMA/Compute) and specific to MES V11 only, using the generic MQD structure. - Splitting a spearate patch to enabling GFX support from here. - Verify mqd va address to be non-NULL. - Add a separate header file. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add helpers to create userqueue object	Shashank Sharma
	This patch introduces amdgpu_userqueue_object and its helper functions to creates and destroy this object. The helper functions creates/destroys a base amdgpu_bo, kmap/unmap it and save the respective GPU and CPU addresses in the encapsulating userqueue object. These helpers will be used to create/destroy userqueue MQD, WPTR and FW areas. V7: - Forked out this new patch from V11-gfx-userqueue patch to prevent that patch from growing very big. - Using amdgpu_bo_create instead of amdgpu_bo_create_kernel in prep for eviction fences (Christian) V9: - Rebase V10: - Added Alex's R-B Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add new IOCTL for usermode queue	Shashank Sharma
	This patch adds: - A new IOCTL function to create and destroy - A new structure to keep all the user queue data in one place. - A function to generate unique index for the queue. V1: Worked on review comments from RFC patch series: - Alex: Keep a list of queues, instead of single queue per process. - Christian: Use the queue manager instead of global ptrs, Don't keep the queue structure in amdgpu_ctx V2: Worked on review comments: - Christian: - Formatting of text - There is no need for queuing of userqueues, with idr in place - Alex: - Remove use_doorbell, its unnecessary - Reuse amdgpu_mqd_props for saving mqd fields - Code formatting and re-arrangement V3: - Integration with doorbell manager V4: - Accommodate MQD union related changes in UAPI (Alex) - Do not set the queue size twice (Bas) V5: - Remove wrapper functions for queue indexing (Christian) - Do not save the queue id/idr in queue itself (Christian) - Move the idr allocation in the IP independent generic space (Christian) V6: - Check the validity of input IP type (Christian) V7: - Move uq_func from uq_mgr to adev (Alex) - Add missing free(queue) for error cases (Yifan) V9: - Rebase V10: Addressed review comments from Christian, and added R-B: - Do not initialize the local variable - Convert DRM_ERROR to DEBUG. V11: - check the input flags to be zero (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add usermode queue base code	Shashank Sharma
	This patch adds IP independent skeleton code for amdgpu usermode queue. It contains: - A new files with init functions of usermode queues. - A queue context manager in driver private data. V1: Worked on design review comments from RFC patch series: (https://patchwork.freedesktop.org/series/112214/) - Alex: Keep a list of queues, instead of single queue per process. - Christian: Use the queue manager instead of global ptrs, Don't keep the queue structure in amdgpu_ctx V2: - Reformatted code, split the big patch into two V3: - Integration with doorbell manager V4: - Align the structure member names to the largest member's column (Luben) - Added SPDX license (Luben) V5: - Do not add amdgpu.h in amdgpu_userqueue.h (Christian). - Move struct amdgpu_userq_mgr into amdgpu_userqueue.h (Christian). V6: Rebase V9: Rebase V10: Rebase + Alex's R-B Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: still cleanup sid.h	Alexandre Demers
	The defines, shifts and masks are already available in dce_6_0_d.h, dce_6_0_sh_mask.h. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: fill in gmc_v6_0_set_clockgating_state()	Alexandre Demers
	Pretty much was already there, just not ported to amdgpu. Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amd/display/dc: reclassify DCE6 resources and hw sequencer	Alexandre Demers
	Classify DCE6 resource and sequencer as they are for other DCE versions Put dce60_resource.c and .h under amd/display/dc/resource/dce60 Put and rename dce60_hw_sequencer.c and .h under amd/display/dc/hwss/dce60 v2: fix build when CONFIG_DRM_AMD_DC_SI=n (Alex) Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Reset RAS table if header is invalid	Lijo Lazar
	If a valid header is not found during RAS eeprom init, consider it as new and reset RAS table info. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add loop bits for NPS2 page retirement	Tao Zhou
	Support NPS2 RAS. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amd/amdgpu: decouple ASPM with pcie dpm	Kenneth Feng
	ASPM doesn't need to be disabled if pcie dpm is disabled. So ASPM can be independantly enabled. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	amd/amdgpu: Init vcn hardware per instance for vcn 4.0.3	Ruili Ji
	Add interface for hardware init by vcn instance. v2: fix code format Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Disable ACA on VFs	Victor Skvortsov
	VFs query RAS error counts directly from host with AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled, an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create() Likewise, VFs depend on host support to query CPERs, rather than ACA component. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: use "irq" in place of "interrupt" in DCE6/8 as in DCE10/11	Alexandre Demers
	"interrupt" becomes "irq" in: dce_vX_0_set_hpd_interrupt_state() dce_vX_0_set_crtc_interrupt_state() dce_vX_0_set_pageflip_interrupt_state() It is easier when going through the code to just change the DCE number in the functions' name to find and compare them across DCE versions. Also, it standardizes function mapping inside a given structure where .set and .process are both set to functions with a "_irq" suffix. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: fix typos in DCEs	Alexandre Demers
	In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and replace "type" by "hpd" for a uniform parameter naming usage across DCEs. In link_factory.c, there is a missing "p" to "types" Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/mes12: optimize MES pipe FW version fetching	Alex Deucher
	Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)") Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amd/pm/smu11: Prevent division by zero	Denis Arefev
	The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1e866f1fe528 ("drm/amd/pm: Prevent divide by zero") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: cancel gfx idle work in device suspend for s0ix	Alex Deucher
	This is normally handled in the gfx IP suspend callbacks, but for S0ix, those are skipped because we don't want to touch gfx. So handle it in device suspend. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Fixes: 963537ca2325 ("drm/amdgpu: add dynamic workload profile switching for gfx11") Fixes: 5f95a1549555 ("drm/amdgpu: add dynamic workload profile switching for gfx12") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amd/display: pause the workload setting in dm	Kenneth Feng
	Pause the workload setting in dm when doing idle optimization Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/pm/swsmu: implement pause workload profile	Alex Deucher
	Add the callback for implementation for swsmu. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/pm: add workload profile pause helper	Alex Deucher
	To be used for display idle optimizations when we want to pause non-default profiles. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx12: dump full CP packet header FIFOs	Alex Deucher
	In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx11: dump full CP packet header FIFOs	Alex Deucher
	In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx10: dump full CP packet header FIFOs	Alex Deucher
	In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amdgpu/gfx9.4.3: dump full CP packet header FIFOs	Alex Deucher
	In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amdgpu/gfx9: dump full CP packet header FIFOs	Alex Deucher
	In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amd/pm: implement dpm vcn reset function	Ruili Ji
	Implement VCN engine reset by sending MSG_ResetVCN on smu 13.0.6. v2: fix format for code and message Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amd/display: Promote DC to 3.2.328	Taimur Hassan
	Summary: * Optimize custom brightness curve * Correct SSC enable detection for DCN351 * Turn off eDP lcdvdd and backlight if not required * Use DMUB Fused IO interface for HDCP * Extend eDP-on-DP1 quirk list Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amd/display: rename IPS2 entry/exit message	Sherry Wang
	[Why&How] Fix the confusing entry/exit message name for IPS2 Reviewed-by: Duncan Ma <duncan.ma@amd.com> Signed-off-by: Sherry Wang <Yao.Wang1@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amd/display: [FW Promotion] Release 0.1.5.0	Taimur Hassan
	Aligning dmub_cmd header with dmu firmware release 0.1.5.0 Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amd/display: turn off eDP lcdvdd and backlight if not required	Charlene Liu
	[why] A+N configuration, eDP on A-APU is off, extended display active. Resume from s4, eDP's backlight is still on. [how] Turn off inactive eDP backlight and lcdvdd. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Jing Zhou <Jing.Zhou@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07	drm/amd/display: dont disable dtb as dto src during dpms off	Ausef Yousof
	fix was previously in 25.20 but was reverted out as it was accompanied by other changes that caused regression. [why&how] Disabling dtb as the dto src during dpms off relies on in the same instance being able to also alter the dto src bit to dpref (or not dtb in general), but this was recently changed to only take place in dcn31_program_pix_clk, as that is where we want to perform any dto src changes because tg is off at that point, it is unsafe to do that elsewhere. What this means is now instead of disabling dtb as dto src and modifying source bit, we are left with the configuration for a given tg that specifies dtb as dto src and dtb dto en simultaneously is unset. dcn31_program_pix_clk can rectify this but its possible for us to perform some tg dependant operation that would simply hang because when we go to enable say crtc then, the clk we specify as dto src is "off" en bit is cleared, source bit was never changed, and program_pix_clk hasnt been called yet (as apart of dpms on) We cant disable it as dto src during dpms off if we want the luxury of performing tg dependant operation during dpms off and before dpms on. Reviewed-by: Yihan Zhu <yihan.zhu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>