summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/panthor
AgeCommit message (Collapse)Author
2025-04-29drm/panthor: Fix build warning when DEBUG_FS is disabledAdrián Larumbe
Commit a3707f53eb3f ("drm/panthor: show device-wide list of DRM GEM objects over DebugFS") causes a build warning and linking error when built without support for DebugFS, because of a non-inline non-static function declaration in a header file. On top of that, the function is only being used inside a single compilation unit, so there is no point in exposing it as a global symbol. This is a follow-up from Arnd Bergmann's first fix. Also move panthor_gem_debugfs_set_usage_flags() into panthor_gem.c and declare it static. Fixes: a3707f53eb3f ("drm/panthor: show device-wide list of DRM GEM objects over DebugFS") Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/dri-devel/20250424142419.47b9d457@collabora.com/T/#t Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://lore.kernel.org/r/20250424184041.356191-1-adrian.larumbe@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-23drm/panthor: show device-wide list of DRM GEM objects over DebugFSAdrián Larumbe
Add a device DebugFS file that displays a complete list of all the DRM GEM objects that are exposed to UM through a DRM handle. Since leaking object identifiers that might belong to a different NS is inadmissible, this functionality is only made available in debug builds with DEBUGFS support enabled. File format is that of a table, with each entry displaying a variety of fields with information about each GEM object. Each GEM object entry in the file displays the following information fields: Client PID, BO's global name, reference count, BO virtual size, BO resize size, VM address in its DRM-managed range, BO label and a GEM state flags. There's also a usage flags field for the type of BO, which tells us whether it's a kernel BO and/or mapped onto the FW's address space. GEM state and usage flag meanings are printed in the file prelude, so that UM parsing tools can interpret the numerical values in the table. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250423021238.1639175-5-adrian.larumbe@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-23drm/panthor: Label all kernel BO'sAdrián Larumbe
Kernel BO's aren't exposed to UM, so labelling them is the responsibility of the driver itself. This kind of tagging will prove useful in further commits when want to expose these objects through DebugFS. Expand panthor_kernel_bo_create() interface to take a NUL-terminated string. No bounds checking is done because all label strings are given as statically-allocated literals, but if a more complex kernel BO naming scheme with explicit memory allocation and formatting was desired in the future, this would have to change. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250423021238.1639175-4-adrian.larumbe@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-23drm/panthor: Add driver IOCTL for setting BO labelsAdrián Larumbe
Allow UM to label a BO for which it possesses a DRM handle. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250423021238.1639175-3-adrian.larumbe@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-23drm/panthor: Introduce BO labelingAdrián Larumbe
Add a new character string Panthor BO field, and a function that allows setting it from within the driver. Driver takes care of freeing the string when it's replaced or no longer needed at object destruction time, but allocating it is the responsibility of callers. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250423021238.1639175-2-adrian.larumbe@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-22drm/panthor: Don't create a file offset for NO_MMAP BOsBoris Brezillon
Right now the DRM_PANTHOR_BO_NO_MMAP flag is ignored by panthor_ioctl_bo_mmap_offset(), meaning BOs with this flag set can have a file offset but can't be mapped anyway, because panthor_gem_mmap() will filter them out. If we error out at mmap_offset creation time, we can get rid of panthor_gem_mmap() and call drm_gem_shmem_object_mmap directly, and we get rid of this inconsistency of having an mmap offset for a BO that can never be mmap-ed. Changes in v2: - Get rid of panthor_gem_mmap() - Get rid of the Fixes tag and adjust the commit message accordingly - Return ENOPERM instead of EINVAL Changes in v3: - Don't leak the BO ref - Add R-bs Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com> Link: https://lore.kernel.org/r/20250417121942.3574111-1-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-17drm/panthor: Fix the panthor_gpu_coherency_init() error pathBoris Brezillon
The panthor_gpu_coherency_init() call has been moved around, but the error path hasn't been adjusted accordingly. Make sure we undo what has been done before this call in case of failure. Fixes: 7d5a3b22f5b5 ("drm/panthor: Call panthor_gpu_coherency_init() after PM resume()") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/4da470aa-4f84-460e-aff8-dabc8cc4da15@stanley.mountain/T/#t Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://lore.kernel.org/r/20250414130120.581274-1-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-14drm/panthor: Test for imported buffers with drm_gem_is_imported()Thomas Zimmermann
Instead of testing import_attach for imported GEM buffers, invoke drm_gem_is_imported() to do the test. The helper tests the dma_buf itself while import_attach is just an artifact of the import. Prepares to make import_attach optional. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250317131923.238374-10-tzimmermann@suse.de
2025-04-10drm/panthor: Don't update MMU_INT_MASK in panthor_mmu_irq_handler()Boris Brezillon
Interrupts are automatically unmasked in panthor_mmu_irq_threaded_handler() when the handler returns. Unmasking prematurely might generate spurious interrupts if the IRQ line is shared. Changes in v2: - New patch Changes in v3: - Add R-bs Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250404080933.2912674-6-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-10drm/panthor: Let IRQ handlers clear the interrupts themselvesBoris Brezillon
MMU handler needs to be in control of the job interrupt clears because clearing the interrupt also unblocks the writer/reader that triggered the fault, and we don't want it to be unblocked until we've had a chance to process the IRQ. Since clearing the clearing is just one line, let's make it explicit instead of doing it in the generic code path. Note that this commit changes the existing behavior in that the MMU COMPLETED irqs are no longer cleared, which is fine because they are masked, so we're not risking an interrupt flood. Changes in v3: - Mention the fact we no longer clear MMU COMPLETED irqs - Add Liviu's R-b Changes in v2: - Move the MMU_INT_CLEAR around Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250404080933.2912674-5-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-10drm/panthor: Update panthor_mmu::irq::mask when neededBoris Brezillon
When we clear the faulty bits in the AS mask, we also need to update the panthor_mmu::irq::mask field otherwise our IRQ handler won't get called again until the GPU is reset. Changes in v2: - Add Liviu's R-b Changes in v3: - Add Steve's R-b Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250404080933.2912674-4-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-10drm/panthor: Call panthor_gpu_coherency_init() after PM resume()Boris Brezillon
When the device is coherent, panthor_gpu_coherency_init() will read GPU_COHERENCY_FEATURES to make sure the GPU supports the ACE-Lite coherency protocol, which will fail if the clocks/power-domains are not enabled when the read is done. Move the panthor_gpu_coherency_init() call after the device has been resumed to prevent that. Changes in v2: - Add Liviu's R-b Changes in v3: - Add Steve's R-b Fixes: dd7db8d911a1 ("drm/panthor: Explicitly set the coherency mode") Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250404080933.2912674-3-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-04-10drm/panthor: Fix GPU_COHERENCY_ACE[_LITE] definitionsBoris Brezillon
GPU_COHERENCY_ACE and GPU_COHERENCY_ACE_LITE definitions have been swapped. Changes in v2: - New patch Changes in v3: - Add Steve's R-b Reported-by: Liviu Dudau <liviu.dudau@arm.com> Fixes: 546b366600ef ("drm/panthor: Add GPU register definitions") Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://lore.kernel.org/r/20250404080933.2912674-2-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2025-03-26drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function namesDmitry Osipenko
Make drm/gem API function names consistent by having locked function use the _locked postfix in the name, while the unlocked variants don't use the _unlocked postfix. Rename drm_gem_v/unmap() function names to make them consistent with the rest of the API functions. Acked-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Suggested-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.d> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250322212608.40511-2-dmitry.osipenko@collabora.com
2025-03-05drm/panthor: Clean up FW version information displaySteven Price
Assigning a string to an array which is too small to include the NUL byte at the end causes a warning on some compilers. But this function also has some other oddities like the 'header' array which is only ever used within sizeof(). Tidy up the function by removing the 'header' array, allow the NUL byte to be present in git_sha_header, and calculate the length directly from git_sha_header. Reported-by: Will Deacon <will@kernel.org> Closes: https://lore.kernel.org/all/20250213154237.GA11897@willie-the-truck/ Fixes: 9d443deb0441 ("drm/panthor: Display FW version information") Signed-off-by: Steven Price <steven.price@arm.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213161248.1642392-1-steven.price@arm.com
2025-03-05drm/panthor: Avoid sleep locking in the internal BO size pathAdrián Larumbe
Commit 434e5ca5b5d7 ("drm/panthor: Expose size of driver internal BO's over fdinfo") locks the VMS xarray, to avoid UAF errors when the same VM is being concurrently destroyed by another thread. However, that puts the current thread in atomic context, which means taking the VMS' heap locks will trigger a warning as the thread is no longer allowed to sleep. Because in this case replacing the heap mutex with a spinlock isn't feasible, the fdinfo handler no longer traverses the list of heaps for every single VM associated with an open DRM file. Instead, when a new heap chunk is allocated, its size is accumulated into a pool-wide tally, which also makes the atomic context code path somewhat faster. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: 434e5ca5b5d7 ("drm/panthor: Expose size of driver internal BO's over fdinfo") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250303190923.1639985-2-adrian.larumbe@collabora.com
2025-03-05drm/panthor: Replace sleep locks with spinlocks in fdinfo pathAdrián Larumbe
Commit 0590c94c3596 ("drm/panthor: Fix race condition when gathering fdinfo group samples") introduced an xarray lock to deal with potential use-after-free errors when accessing groups fdinfo figures. However, this toggles the kernel's atomic context status, so the next nested mutex lock will raise a warning when the kernel is compiled with mutex debug options: CONFIG_DEBUG_RT_MUTEXES=y CONFIG_DEBUG_MUTEXES=y Replace Panthor's group fdinfo data mutex with a guarded spinlock. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: 0590c94c3596 ("drm/panthor: Fix race condition when gathering fdinfo group samples") Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250303190923.1639985-1-adrian.larumbe@collabora.com
2025-03-05drm/panthor: Update CS_STATUS_ defines to correct valuesAshley Smith
Values for SC_STATUS_BLOCKED_REASON_ are documented in the G610 "Odin" GPU specification (CS_STATUS_BLOCKED_REASON register). This change updates the defines to the correct values. Fixes: 2718d91816ee ("drm/panthor: Add the FW logical block") Signed-off-by: Ashley Smith <ashley.smith@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250303180444.3768993-1-ashley.smith@collabora.com
2025-02-25Merge tag 'v6.14-rc4' into drm-nextDave Airlie
Backmerge Linux 6.14-rc4 at the request of tzimmermann so misc-next can base on rc4. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-02-12drm/sched: Use struct for drm_sched_init() paramsPhilipp Stanner
drm_sched_init() has a great many parameters and upcoming new functionality for the scheduler might add even more. Generally, the great number of parameters reduces readability and has already caused one missnaming, addressed in: commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in nouveau_sched_init()"). Introduce a new struct for the scheduler init parameters and port all users. Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
2025-02-07drm/panthor: avoid garbage value in panthor_ioctl_dev_query()Su Hui
'priorities_info' is uninitialized, and the uninitialized value is copied to user object when calling PANTHOR_UOBJ_SET(). Using memset to initialize 'priorities_info' to avoid this garbage value problem. Fixes: f70000ef2352 ("drm/panthor: Add DEV_QUERY_GROUP_PRIORITIES_INFO dev query") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250119025828.1168419-1-suhui@nfschina.com
2025-02-07drm/panthor: Fix race condition when gathering fdinfo group samplesAdrián Larumbe
Commit e16635d88fa0 ("drm/panthor: add DRM fdinfo support") failed to protect access to groups with an xarray lock, which could lead to use-after-free errors. Fixes: e16635d88fa0 ("drm/panthor: add DRM fdinfo support") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250130172851.941597-6-adrian.larumbe@collabora.com Link: https://patchwork.freedesktop.org/patch/msgid/20250107173310.88329-1-florent.tomasin@arm.com
2025-02-07drm/panthor: Expose size of driver internal BO's over fdinfoAdrián Larumbe
This will display the sizes of kenrel BO's bound to an open file, which are otherwise not exposed to UM through a handle. The sizes recorded are as follows: - Per group: suspend buffer, protm-suspend buffer, syncobjcs - Per queue: ringbuffer, profiling slots, firmware interface - For all heaps in all heap pools across all VM's bound to an open file, record size of all heap chuks, and for each pool the gpu_context BO too. This does not record the size of FW regions, as these aren't bound to a specific open file and remain active through the whole life of the driver. Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Mihail Atanassov <mihail.atanassov@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250130172851.941597-4-adrian.larumbe@collabora.com
2025-01-13drm/panthor: Fix a race between the reset and suspend pathBoris Brezillon
If a reset is scheduled when the suspend happens, we drop the reset-pending info on the floor assuming the resume will fix things, but the resume logic might try a fast reset. If we're lucky, the fast reset fails and we fallback to a slow reset, but if the FW was corrupted in a way that makes it partially functional (it boots but doesn't quite do what it's expected to do), we won't notice immediately that things are not working correctly, leading to a new reset further down the road. Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241217092457.1582053-1-boris.brezillon@collabora.com
2025-01-13drm/panthor: fix all mmu kernel-doc commentsRandy Dunlap
Use the correct format for all kernel-doc comments. Use structname.membername for named structs. Don't precede function names in kernel-doc with '@' sign. Use the correct function parameter names in kernel-doc comments. This fixes around 80 kernel-doc warnings. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250111062832.910495-1-rdunlap@infradead.org
2025-01-13drm/panthor: Remove dead codeFlorent Tomasin
Remove unused function declaration in panthor_gem.h: - `panthor_gem_prime_import_sg_table()` Remove duplicate macro definitions: - `MAX_CSG_PRIO` - `MIN_CS_PER_CSG` - `MIN_CSGS` Signed-off-by: Florent Tomasin <florent.tomasin@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250107173310.88329-1-florent.tomasin@arm.com
2024-12-17drm/panthor: Report innocent group killBoris Brezillon
Groups can be killed during a reset even though they did nothing wrong. That usually happens when the FW is put in a bad state by other groups, resulting in group suspension failures when the reset happens. If we end up in that situation, flag the group innocent and report innocence through a new DRM_PANTHOR_GROUP_STATE flag. Bump the minor driver version to reflect the uAPI change. Changes in v4: - Add an entry to the driver version changelog - Add R-bs Changes in v3: - Actually report innocence to userspace Changes in v2: - New patch Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211080500.2349505-1-boris.brezillon@collabora.com
2024-12-11drm/panthor: Fix the fast-reset logicBoris Brezillon
If we do a GPU soft-reset, that's no longer fast reset. This also means the slow reset fallback doesn't work because the MCU state is only reset after a GPU soft-reset. Let's move the retry logic to panthor_device_resume() to issue a soft-reset between the fast and slow attempts, and patch panthor_gpu_suspend() to only power-off the L2 when a fast reset is requested. v3: - No changes v2: - Add R-b Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-6-boris.brezillon@collabora.com
2024-12-11drm/panthor: Be robust against resume failuresBoris Brezillon
When the runtime PM resume callback returns an error, it puts the device in a state where it can't be resumed anymore. Make sure we can recover from such transient failures by calling pm_runtime_set_suspended() explicitly after a pm_runtime_resume_and_get() failure. v3: - Add R-b/A-b v2: - Add a comment explaining potential races in panthor_device_resume_and_get() Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Acked-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-5-boris.brezillon@collabora.com
2024-12-11drm/panthor: Ignore devfreq_{suspend, resume}_device() failuresBoris Brezillon
devfreq_{resume,suspend}_device() don't bother undoing the suspend_count modifications if something fails, so either it assumes failures are harmless, or it's super fragile/buggy. In either case it's not something we can address at the driver level, so let's just assume failures are harmless for now, like is done in panfrost. v3: - Add R-b v2: - Add R-b Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-4-boris.brezillon@collabora.com
2024-12-11drm/panthor: Be robust against runtime PM resume failures in the suspend pathBoris Brezillon
The runtime PM resume operation is not guaranteed to succeed, but if it fails, the device should be in a suspended state. Make sure we're robust to resume failures in the unplug path. v3: - Fix typo - Add R-bs v2: - Move the bit that belonged in the next commit - Drop the panthor_device_unplug() changes Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-3-boris.brezillon@collabora.com
2024-12-11drm/panthor: Preserve the result returned by panthor_fw_resume()Boris Brezillon
WARN() will return true if the condition is true, false otherwise. If we store the return of drm_WARN_ON() in ret, we lose the actual error code. v3: - Add R-b v2: - Add R-b Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Adrian Larumbe <adrian.larumbe@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211075419.2333731-2-boris.brezillon@collabora.com
2024-12-09Merge remote-tracking branch 'drm/drm-next' into drm-misc-nextMaarten Lankhorst
The v6.13-rc2 release included a bunch of breaking changes, specifically the MODULE_IMPORT_NS commit. Backmerge in order to fix them before the next pull-request. Include the fix from Stephen Roswell. Caused by commit 25c3fd1183c0 ("drm/virtio: Add a helper to map and note the dma addrs and lengths") Interacting with commit cdd30ebb1b9f ("module: Convert symbol namespace to string literal") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://patchwork.freedesktop.org/patch/msgid/20241209121717.2abe8026@canb.auug.org.au Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2024-12-05drm: remove driver date from struct drm_driver and all driversJani Nikula
We stopped using the driver initialized date in commit 7fb8af6798e8 ("drm: deprecate driver date") and (eventually) started returning "0" for drm_version ioctl instead. Finish the job, and remove the unused date member from struct drm_driver, its initialization from drivers, along with the common DRIVER_DATE macros. v2: Also update drivers/accel (kernel test robot) Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Simon Ser <contact@emersion.fr> Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/1f2bf2543aed270a06f6c707fd6ed1b78bf16712.1733322525.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-12-02Merge drm/drm-next into drm-misc-nextMaxime Ripard
Kickstart 6.14 cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-28drm/panthor: Fix a typo in the FW iface flag definitionsBoris Brezillon
Drop the _RD_ in the flag names. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241113160257.2002333-1-boris.brezillon@collabora.com
2024-11-21Merge tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm updates from Dave Airlie: "There's a lot of rework, the panic helper support is being added to more drivers, v3d gets support for HW superpages, scheduler documentation, drm client and video aperture reworks, some new MAINTAINERS added, amdgpu has the usual lots of IP refactors, Intel has some Pantherlake enablement and xe is getting some SRIOV bits, but just lots of stuff everywhere. core: - split DSC helpers from DP helpers - clang build fixes for drm/mm test - drop simple pipeline support for gem vram - document submission error signaling - move drm_rect to drm core module from kms helper - add default client setup to most drivers - move to video aperture helpers instead of drm ones tests: - new framebuffer tests ttm: - remove swapped and pinned BOs from TTM lru panic: - fix uninit spinlock - add ABGR2101010 support bridge: - add TI TDP158 support - use standard PM OPS dma-fence: - use read_trylock instead of read_lock to help lockdep scheduler: - add errno to sched start to report different errors - add locking to drm_sched_entity_modify_sched - improve documentation xe: - add drm_line_printer - lots of refactoring - Enable Xe2 + PES disaggregation - add new ARL PCI ID - SRIOV development work - fix exec unnecessary implicit fence - define and parse OA sync props - forcewake refactoring i915: - Enable BMG/LNL ultra joiner - Enable 10bpx + CCS scanout on ICL+, fp16/CCS on TGL+ - use DSB for plane/color mgmt - Arrow lake PCI IDs - lots of i915/xe display refactoring - enable PXP GuC autoteardown - Pantherlake (PTL) Xe3 LPD display enablement - Allow fastset HDR infoframe changes - write DP source OUI for non-eDP sinks - share PCI IDs between i915 and xe amdgpu: - SDMA queue reset support - SMU 13.0.6, JPEG 4.0.3 updates - Initial runtime repartitioning support - rework IP structs for multiple IP instances - Fetch EDID from _DDC if available - SMU13 zero rpm user control - lots of fixes/cleanups amdkfd: - Increase event FIFO size - add topology cap flag for per queue reset msm: - DPU: - SA8775P support - (disabled by default) MSM8917, MSM8937, MSM8953 and MSM8996 support - Enable large framebuffer support - Drop MSM8998 and SDM845 - DP: - SA8775P support - GPU: - a7xx preemption support - Adreno A663 support ast: - warn about unsupported TX chips ivpu: - add coredump - add pantherlake support rockchip: - 4K@60Hz display enablement - generate pll programming tables panthor: - add timestamp query API - add realtime group priority - add fdinfo support etnaviv: - improve handling of DMA address limits - improve GPU hangcheck exynos: - Decon Exynos7870 support mediatek: - add OF graph support omap: - locking fixes bochs: - convert to gem/shmem from simpledrm v3d: - support big/super pages - add gemfs vc4: - BCM2712 support refactoring - add YUV444 format support udmabuf: - folio related fixes nouveau: - add panic support on nv50+" * tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel: (1583 commits) drm/xe/guc: Fix dereference before NULL check drm/amd: Fix initialization mistake for NBIO 7.7.0 Revert "drm/amd/display: parse umc_info or vram_info based on ASIC" drm/amd/display: Fix failure to read vram info due to static BP_RESULT drm/amdgpu: enable GTT fallback handling for dGPUs only drm/amd/amdgpu: limit single process inside MES drm/fourcc: add AMD_FMT_MOD_TILE_GFX9_4K_D_X drm/amdgpu/mes12: correct kiq unmap latency drm/amdgpu: Support vcn and jpeg error info parsing drm/amd : Update MES API header file for v11 & v12 drm/amd/amdkfd: add/remove kfd queues on start/stop KFD scheduling drm/amdkfd: change kfd process kref count at creation drm/amdgpu: Cleanup shift coding style drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data drm/amdgpu: Implement virt req_ras_err_count drm/amdgpu: VF Query RAS Caps from Host if supported drm/amdgpu: Add msg handlers for SRIOV RAS Telemetry drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support drm/amd/display: 3.2.309 drm/amd/display: Adjust VSDB parser for replay feature ...
2024-11-20drm/panthor: Fix compilation failure on panthor_fw.cLiviu Dudau
Commit 498893bd596e ("drm/panthor: Simplify FW fast reset path") forgot to copy the definition of glb_iface when it move one line of code. Fixes: 498893bd596e ("drm/panthor: Simplify FW fast reset path") Link: https://lore.kernel.org/dri-devel/20241119164455.572771-1-liviu.dudau@arm.com/ Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Karunika Choo <karunika.choo@arm.com> Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2024-11-19drm/panthor: Simplify FW fast reset pathKarunika Choo
Stop checking the FW halt_status as MCU_STATUS should be sufficient. This should make the check for successful FW halt and subsequently setting fast_reset to true more robust. We should also clear GLB_REQ.GLB_HALT bit only on post-reset prior to starting the FW and only if we're doing a fast reset, because the slow reset will re-initialize all FW sections, including the global interface. Signed-off-by: Karunika Choo <karunika.choo@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/r/20241119135030.3352939-1-karunika.choo@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2024-11-19drm/panthor: Explicitly set the coherency modeAkash Goel
This commit fixes the potential misalignment between the value of device tree property "dma-coherent" and default value of COHERENCY_ENABLE register. Panthor driver didn't explicitly program the COHERENCY_ENABLE register with the desired coherency mode. The default value of COHERENCY_ENABLE register is implementation defined, so it may not be always aligned with the "dma-coherent" property value. The commit also checks the COHERENCY_FEATURES register to confirm that the coherency protocol is actually supported or not. v2: - Added R-b tags Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20241030225407.4077513-3-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2024-11-19drm/panthor: Update memattr programing to align with GPU specAkash Goel
Mali GPU Arch spec forbids the GPU PTEs to indicate Inner or Outer shareability when no_coherency protocol is selected. Doing so results in unexpected or undesired snooping of the CPU caches on some platforms, such as Juno FPGA, causing functional issues. For example the boot of MCU firmware fails as GPU ends up reading stale data for the FW memory pages from the CPU's cache. The FW memory pages are initialized with uncached mapping when the device is not reported to be dma-coherent. The shareability bits are set to inner-shareable when IOMMU_CACHE flag is passed to map_pages() callback and IOMMU_CACHE flag is passed by Panthor driver when memory needs to be mapped as cached on the GPU side. IOMMU_CACHE seems to imply cache coherent and is probably not fit for purpose for the memory that is mapped as cached on GPU side but doesn't need to remain coherent with the CPU. This commit updates the programming of MEMATTR register to use MIDGARD_INNER instead of CPU_INNER when coherency is disabled. That way the inner-shareability specified in the GPU PTEs would map to Mali's internal-shareable mode, which is always supported by the GPU regardless of the coherency protocal and is required by the Userspace driver to ensure coherency between the shader cores. v2: - Added R-b tags Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20241030225407.4077513-2-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2024-11-13drm/panthor: Fix handling of partial GPU mapping of BOsAkash Goel
This commit fixes the bug in the handling of partial mapping of the buffer objects to the GPU, which caused kernel warnings. Panthor didn't correctly handle the case where the partial mapping spanned multiple scatterlists and the mapping offset didn't point to the 1st page of starting scatterlist. The offset variable was not cleared after reaching the starting scatterlist. Following warning messages were seen. WARNING: CPU: 1 PID: 650 at drivers/iommu/io-pgtable-arm.c:659 __arm_lpae_unmap+0x254/0x5a0 <snip> pc : __arm_lpae_unmap+0x254/0x5a0 lr : __arm_lpae_unmap+0x2cc/0x5a0 <snip> Call trace: __arm_lpae_unmap+0x254/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 arm_lpae_unmap_pages+0x80/0xa0 panthor_vm_unmap_pages+0xac/0x1c8 [panthor] panthor_gpuva_sm_step_unmap+0x4c/0xc8 [panthor] op_unmap_cb.isra.23.constprop.30+0x54/0x80 __drm_gpuvm_sm_unmap+0x184/0x1c8 drm_gpuvm_sm_unmap+0x40/0x60 panthor_vm_exec_op+0xa8/0x120 [panthor] panthor_vm_bind_exec_sync_op+0xc4/0xe8 [panthor] panthor_ioctl_vm_bind+0x10c/0x170 [panthor] drm_ioctl_kernel+0xbc/0x138 drm_ioctl+0x210/0x4b0 __arm64_sys_ioctl+0xb0/0xf8 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.1+0x98/0xf8 do_el0_svc+0x24/0x38 el0_svc+0x34/0xc8 el0t_64_sync_handler+0xa0/0xc8 el0t_64_sync+0x174/0x178 <snip> panthor : [drm] drm_WARN_ON(unmapped_sz != pgsize * pgcount) WARNING: CPU: 1 PID: 650 at drivers/gpu/drm/panthor/panthor_mmu.c:922 panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> pc : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] lr : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> panthor : [drm] *ERROR* failed to unmap range ffffa388f000-ffffa3890000 (requested range ffffa388c000-ffffa3890000) Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111134720.780403-1-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2024-11-07drm/panthor: Be stricter about IO mapping flagsJann Horn
The current panthor_device_mmap_io() implementation has two issues: 1. For mapping DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET, panthor_device_mmap_io() bails if VM_WRITE is set, but does not clear VM_MAYWRITE. That means userspace can use mprotect() to make the mapping writable later on. This is a classic Linux driver gotcha. I don't think this actually has any impact in practice: When the GPU is powered, writes to the FLUSH_ID seem to be ignored; and when the GPU is not powered, the dummy_latest_flush page provided by the driver is deliberately designed to not do any flushes, so the only thing writing to the dummy_latest_flush could achieve would be to make *more* flushes happen. 2. panthor_device_mmap_io() does not block MAP_PRIVATE mappings (which are mappings without the VM_SHARED flag). MAP_PRIVATE in combination with VM_MAYWRITE indicates that the VMA has copy-on-write semantics, which for VM_PFNMAP are semi-supported but fairly cursed. In particular, in such a mapping, the driver can only install PTEs during mmap() by calling remap_pfn_range() (because remap_pfn_range() wants to **store the physical address of the mapped physical memory into the vm_pgoff of the VMA**); installing PTEs later on with a fault handler (as panthor does) is not supported in private mappings, and so if you try to fault in such a mapping, vmf_insert_pfn_prot() splats when it hits a BUG() check. Fix it by clearing the VM_MAYWRITE flag (userspace writing to the FLUSH_ID doesn't make sense) and requiring VM_SHARED (copy-on-write semantics for the FLUSH_ID don't make sense). Reproducers for both scenarios are in the notes of my patch on the mailing list; I tested that these bugs exist on a Rock 5B machine. Note that I only compile-tested the patch, I haven't tested it; I don't have a working kernel build setup for the test machine yet. Please test it before applying it. Cc: stable@vger.kernel.org Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105-panthor-flush-page-fixes-v1-1-829aaf37db93@google.com
2024-11-07drm/panthor: Lock XArray when getting entries for the VMLiviu Dudau
Similar to commit cac075706f29 ("drm/panthor: Fix race when converting group handle to group object") we need to use the XArray's internal locking when retrieving a vm pointer from there. v2: Removed part of the patch that was trying to protect fetching the heap pointer from XArray, as that operation is protected by the @pool->lock. Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Reported-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106185806.389089-1-liviu.dudau@arm.com
2024-11-06drm/panthor: Fix OPP refcnt leaks in devfreq initialisationAdrián Larumbe
Rearrange lookup of recommended OPP for the Mali GPU device and its refcnt decremental to make sure no OPP object leaks happen in the error path. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: fac9b22df4b1 ("drm/panthor: Add the devfreq logical block") Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105205458.1318989-2-adrian.larumbe@collabora.com
2024-11-04Backmerge v6.12-rc6 of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next Backmerge Linus tree for some drm-fixes needed for msm and xe merges. Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-10-30drm/panthor: Report group as timedout when we fail to properly suspendBoris Brezillon
If we don't do that, the group is considered usable by userspace, but all further GROUP_SUBMIT will fail with -EINVAL. Changes in v3: - Add R-bs Changes in v2: - New patch Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-3-boris.brezillon@collabora.com
2024-10-30drm/panthor: Fail job creation when the group is deadBoris Brezillon
Userspace can use GROUP_SUBMIT errors as a trigger to check the group state and recreate the group if it became unusable. Make sure we report an error when the group became unusable. Changes in v3: - None Changes in v2: - Add R-bs Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-2-boris.brezillon@collabora.com
2024-10-30drm/panthor: Fix firmware initialization on systems with a page size > 4kBoris Brezillon
The system and GPU MMU page size might differ, which becomes a problem for FW sections that need to be mapped at explicit addresses since our PAGE_SIZE alignment might cover a VA range that's expected to be used for another section. Make sure we never map more than we need. Changes in v3: - Add R-bs Changes in v2: - Plan for per-VM page sizes so the MCU VM and user VM can have different pages sizes Fixes: 2718d91816ee ("drm/panthor: Add the FW logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030150231.768949-1-boris.brezillon@collabora.com