summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
6 daysdrm/gem: Fix race in drm_gem_handle_create_tail()Simona Vetter
Object creation is a careful dance where we must guarantee that the object is fully constructed before it is visible to other threads, and GEM buffer objects are no difference. Final publishing happens by calling drm_gem_handle_create(). After that the only allowed thing to do is call drm_gem_object_put() because a concurrent call to the GEM_CLOSE ioctl with a correctly guessed id (which is trivial since we have a linear allocator) can already tear down the object again. Luckily most drivers get this right, the very few exceptions I've pinged the relevant maintainers for. Unfortunately we also need drm_gem_handle_create() when creating additional handles for an already existing object (e.g. GETFB ioctl or the various bo import ioctl), and hence we cannot have a drm_gem_handle_create_and_put() as the only exported function to stop these issues from happening. Now unfortunately the implementation of drm_gem_handle_create() isn't living up to standards: It does correctly finishe object initialization at the global level, and hence is safe against a concurrent tear down. But it also sets up the file-private aspects of the handle, and that part goes wrong: We fully register the object in the drm_file.object_idr before calling drm_vma_node_allow() or obj->funcs->open, which opens up races against concurrent removal of that handle in drm_gem_handle_delete(). Fix this with the usual two-stage approach of first reserving the handle id, and then only registering the object after we've completed the file-private setup. Jacek reported this with a testcase of concurrently calling GEM_CLOSE on a freshly-created object (which also destroys the object), but it should be possible to hit this with just additional handles created through import or GETFB without completed destroying the underlying object with the concurrent GEM_CLOSE ioctl calls. Note that the close-side of this race was fixed in f6cd7daecff5 ("drm: Release driver references to handle before making it available again"), which means a cool 9 years have passed until someone noticed that we need to make this symmetry or there's still gaps left :-/ Without the 2-stage close approach we'd still have a race, therefore that's an integral part of this bugfix. More importantly, this means we can have NULL pointers behind allocated id in our drm_file.object_idr. We need to check for that now: - drm_gem_handle_delete() checks for ERR_OR_NULL already - drm_gem.c:object_lookup() also chekcs for NULL - drm_gem_release() should never be called if there's another thread still existing that could call into an IOCTL that creates a new handle, so cannot race. For paranoia I added a NULL check to drm_gem_object_release_handle() though. - most drivers (etnaviv, i915, msm) are find because they use idr_find(), which maps both ENOENT and NULL to NULL. - drivers using idr_for_each_entry() should also be fine, because idr_get_next does filter out NULL entries and continues the iteration. - The same holds for drm_show_memory_stats(). v2: Use drm_WARN_ON (Thomas) Reported-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Tested-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: stable@vger.kernel.org Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Signed-off-by: Simona Vetter <simona.vetter@intel.com> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250707151814.603897-1-simona.vetter@ffwll.ch
6 daysdrm/framebuffer: Acquire internal references on GEM handlesThomas Zimmermann
Acquire GEM handles in drm_framebuffer_init() and release them in the corresponding drm_framebuffer_cleanup(). Ties the handle's lifetime to the framebuffer. Not all GEM buffer objects have GEM handles. If not set, no refcounting takes place. This is the case for some fbdev emulation. This is not a problem as these GEM objects do not use dma-bufs and drivers will not release them while fbdev emulation is running. Framebuffer flags keep a bit per color plane of which the framebuffer holds a GEM handle reference. As all drivers use drm_framebuffer_init(), they will now all hold dma-buf references as fixed in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers"). In the GEM framebuffer helpers, restore the original ref counting on buffer objects. As the helpers for handle refcounting are now no longer called from outside the DRM core, unexport the symbols. v3: - don't mix internal flags with mode flags (Christian) v2: - track framebuffer handle refs by flag - drop gma500 cleanup (Christian) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") Reported-by: Bert Karwatzki <spasswolf@web.de> Closes: https://lore.kernel.org/dri-devel/20250703115915.3096-1-spasswolf@web.de/ Tested-by: Bert Karwatzki <spasswolf@web.de> Tested-by: Mario Limonciello <superm1@kernel.org> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Anusha Srivatsa <asrivats@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: <stable@vger.kernel.org> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250707131224.249496-1-tzimmermann@suse.de
6 daysagp/amd64: Check AGP Capability before binding to unsupported devicesLukas Wunner
Since commit 172efbb40333 ("AGP: Try unsupported AGP chipsets on x86-64 by default"), the AGP driver for AMD Opteron/Athlon64 CPUs has attempted to bind to any PCI device possessing an AGP Capability. Commit 6fd024893911 ("amd64-agp: Probe unknown AGP devices the right way") subsequently reworked the driver to perform a bind attempt to any PCI device (regardless of AGP Capability) and reject a device in the driver's ->probe() hook if it lacks the AGP Capability. On modern CPUs exposing an AMD IOMMU, this subtle change results in an annoying message with KERN_CRIT severity: pci 0000:00:00.2: Resources present before probing The message is emitted by the driver core prior to invoking a driver's ->probe() hook. The check for an AGP Capability in the ->probe() hook happens too late to prevent the message. The message has appeared only recently with commit 3be5fa236649 (Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"). Prior to the commit, no driver could bind to AMD IOMMUs. The reason for the message is that an MSI is requested early on for the AMD IOMMU, which results in a call from msi_sysfs_create_group() to devm_device_add_group(). A devres resource is thus attached to the driver-less AMD IOMMU, which is normally not allowed, but presumably cannot be avoided because requesting the MSI from a regular PCI driver might be too late. Avoid the message by once again checking for an AGP Capability *before* binding to an unsupported device. Achieve that by way of the PCI core's dynid functionality. pci_add_dynid() can fail only with -ENOMEM (on allocation failure) or -EINVAL (on bus_to_subsys() failure). It doesn't seem worth the extra code to propagate those error codes out of the for_each_pci_dev() loop, so simply error out with -ENODEV if there was no successful bind attempt. In the -ENOMEM case, a splat is emitted anyway, and the -EINVAL case can never happen because it requires failure of bus_register(&pci_bus_type), in which case there's no driver probing of PCI devices. Hans has voiced a preference to no longer probe unsupported devices by default (i.e. set agp_try_unsupported = 0). In fact, the help text for CONFIG_AGP_AMD64 pretends this to be the default. Alternatively, he proposes probing only devices with PCI_CLASS_BRIDGE_HOST. However these approaches risk regressing users who depend on the existing behavior. Fixes: 3be5fa236649 (Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices") Reported-by: Fedor Pchelkin <pchelkin@ispras.ru> Closes: https://lore.kernel.org/r/wpoivftgshz5b5aovxbkxl6ivvquinukqfvb5z6yi4mv7d25ew@edtzr2p74ckg/ Reported-by: Hans de Goede <hansg@kernel.org> Closes: https://lore.kernel.org/r/20250625112411.4123-1-hansg@kernel.org/ Tested-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Hans de Goede <hansg@kernel.org> Link: https://lore.kernel.org/r/b29e7fbfc6d146f947603d0ebaef44cbd2f0d754.1751468802.git.lukas@wunner.de
8 daysdrm/nouveau/gsp: fix potential leak of memory used during acpi initBen Skeggs
If any of the ACPI calls fail, memory allocated for the input buffer would be leaked. Fix failure paths to free allocated memory. Also add checks to ensure the allocations succeeded in the first place. Reported-by: Danilo Krummrich <dakr@kernel.org> Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Ben Skeggs <bskeggs@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250617040036.2932-1-bskeggs@nvidia.com
10 daysrust: drm: remove unnecessary importsTamir Duberstein
`kernel::str::CStr` is included in the prelude. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250704-cstr-include-drm-v1-1-a279dfc4d753@gmail.com
10 daysMAINTAINERS: Change habanalabs maintainerOfir Bitton
I will be leaving Intel soon, Yaron Avizrat will take the role of habanalabs driver maintainer. Signed-off-by: Ofir Bitton <obitton@habana.ai> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Yaron Avizrat <yaron.avizrat@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Oded Gabbay <ogabbay@kernel.org> Link: https://lore.kernel.org/r/20240729121718.540489-2-obitton@habana.ai
11 daysdrm/imagination: Fix kernel crash when hard resetting the GPUAlessio Belle
The GPU hard reset sequence calls pm_runtime_force_suspend() and pm_runtime_force_resume(), which according to their documentation should only be used during system-wide PM transitions to sleep states. The main issue though is that depending on some internal runtime PM state as seen by pm_runtime_force_suspend() (whether the usage count is <= 1), pm_runtime_force_resume() might not resume the device unless needed. If that happens, the runtime PM resume callback pvr_power_device_resume() is not called, the GPU clocks are not re-enabled, and the kernel crashes on the next attempt to access GPU registers as part of the power-on sequence. Replace calls to pm_runtime_force_suspend() and pm_runtime_force_resume() with direct calls to the driver's runtime PM callbacks, pvr_power_device_suspend() and pvr_power_device_resume(), to ensure clocks are re-enabled and avoid the kernel crash. Fixes: cc1aeedb98ad ("drm/imagination: Implement firmware infrastructure and META FW support") Signed-off-by: Alessio Belle <alessio.belle@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250624-fix-kernel-crash-gpu-hard-reset-v1-1-6d24810d72a6@imgtec.com Cc: stable@vger.kernel.org Signed-off-by: Matt Coster <matt.coster@imgtec.com>
11 daysdrm/tegra: nvdec: Fix dma_alloc_coherent error checkMikko Perttunen
Check for NULL return value with dma_alloc_coherent, in line with Robin's fix for vic.c in 'drm/tegra: vic: Fix DMA API misuse'. Fixes: 46f226c93d35 ("drm/tegra: Add NVDEC driver") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20250702-nvdec-dma-error-check-v1-1-c388b402c53a@nvidia.com
11 daysrust: drm: device: drop_in_place() the drm::Device in release()Danilo Krummrich
In drm::Device::new() we allocate with __drm_dev_alloc() and return an ARef<drm::Device>. When the reference count of the drm::Device falls to zero, the C code automatically calls drm_dev_release(), which eventually frees the memory allocated in drm::Device::new(). However, due to that, drm::Device::drop() is never called. As a result the destructor of the user's private data, i.e. drm::Device::data is never called. Hence, fix this by calling drop_in_place() from the DRM device's release callback. Fixes: 1e4b8896c0f3 ("rust: drm: add device abstraction") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250629153747.72536-1-dakr@kernel.org
11 daysnouveau/gsp: add a 50ms delay between fbsr and driver unload rpcsDave Airlie
This fixes a bunch of command hangs after runtime suspend/resume. This fixes a regression caused by code movement in the commit below, the commit seems to just change timings enough to cause this to happen now, and adding the sleep seems to avoid it. I've spent some time trying to root cause it to no great avail, it seems like a bug on the firmware side, but it could be a bug in our rpc handling that I can't find. Either way, we should land the workaround to fix the problem, while we continue to work out the root cause. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@nvidia.com> Cc: Danilo Krummrich <dakr@kernel.org> Fixes: c21b039715ce ("drm/nouveau/gsp: add hals for fbsr.suspend/resume()") Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250702232707.175679-1-airlied@gmail.com
11 daysdrm/nouveau: Do not fail module init on debugfs errorsAaron Thompson
If CONFIG_DEBUG_FS is enabled, nouveau_drm_init() returns an error if it fails to create the "nouveau" directory in debugfs. One case where that will happen is when debugfs access is restricted by CONFIG_DEBUG_FS_ALLOW_NONE or by the boot parameter debugfs=off, which cause the debugfs APIs to return -EPERM. So just ignore errors from debugfs. Note that nouveau_debugfs_root may be an error now, but that is a standard pattern for debugfs. From include/linux/debugfs.h: "NOTE: it's expected that most callers should _ignore_ the errors returned by this function. Other debugfs functions handle the fact that the "dentry" passed to them could be an error and they don't crash in that case. Drivers should generally work fine even if debugfs fails to init anyway." Fixes: 97118a1816d2 ("drm/nouveau: create module debugfs root") Cc: stable@vger.kernel.org Signed-off-by: Aaron Thompson <dev@aaront.org> Acked-by: Timur Tabi <ttabi@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250703211949.9916-1-dev@aaront.org
12 daysdrm/v3d: Disable interrupts before resetting the GPUMaíra Canal
Currently, an interrupt can be triggered during a GPU reset, which can lead to GPU hangs and NULL pointer dereference in an interrupt context as shown in the following trace: [ 314.035040] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0 [ 314.043822] Mem abort info: [ 314.046606] ESR = 0x0000000096000005 [ 314.050347] EC = 0x25: DABT (current EL), IL = 32 bits [ 314.055651] SET = 0, FnV = 0 [ 314.058695] EA = 0, S1PTW = 0 [ 314.061826] FSC = 0x05: level 1 translation fault [ 314.066694] Data abort info: [ 314.069564] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 314.075039] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 314.080080] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 314.085382] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000102728000 [ 314.091814] [00000000000000c0] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 314.100511] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP [ 314.106770] Modules linked in: v3d i2c_brcmstb vc4 snd_soc_hdmi_codec gpu_sched drm_shmem_helper drm_display_helper cec drm_dma_helper drm_kms_helper drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd backlight [ 314.129654] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.25+rpt-rpi-v8 #1 Debian 1:6.12.25-1+rpt1 [ 314.139388] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT) [ 314.145211] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 314.152165] pc : v3d_irq+0xec/0x2e0 [v3d] [ 314.156187] lr : v3d_irq+0xe0/0x2e0 [v3d] [ 314.160198] sp : ffffffc080003ea0 [ 314.163502] x29: ffffffc080003ea0 x28: ffffffec1f184980 x27: 021202b000000000 [ 314.170633] x26: ffffffec1f17f630 x25: ffffff8101372000 x24: ffffffec1f17d9f0 [ 314.177764] x23: 000000000000002a x22: 000000000000002a x21: ffffff8103252000 [ 314.184895] x20: 0000000000000001 x19: 00000000deadbeef x18: 0000000000000000 [ 314.192026] x17: ffffff94e51d2000 x16: ffffffec1dac3cb0 x15: c306000000000000 [ 314.199156] x14: 0000000000000000 x13: b2fc982e03cc5168 x12: 0000000000000001 [ 314.206286] x11: ffffff8103f8bcc0 x10: ffffffec1f196868 x9 : ffffffec1dac3874 [ 314.213416] x8 : 0000000000000000 x7 : 0000000000042a3a x6 : ffffff810017a180 [ 314.220547] x5 : ffffffec1ebad400 x4 : ffffffec1ebad320 x3 : 00000000000bebeb [ 314.227677] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 [ 314.234807] Call trace: [ 314.237243] v3d_irq+0xec/0x2e0 [v3d] [ 314.240906] __handle_irq_event_percpu+0x58/0x218 [ 314.245609] handle_irq_event+0x54/0xb8 [ 314.249439] handle_fasteoi_irq+0xac/0x240 [ 314.253527] handle_irq_desc+0x48/0x68 [ 314.257269] generic_handle_domain_irq+0x24/0x38 [ 314.261879] gic_handle_irq+0x48/0xd8 [ 314.265533] call_on_irq_stack+0x24/0x58 [ 314.269448] do_interrupt_handler+0x88/0x98 [ 314.273624] el1_interrupt+0x34/0x68 [ 314.277193] el1h_64_irq_handler+0x18/0x28 [ 314.281281] el1h_64_irq+0x64/0x68 [ 314.284673] default_idle_call+0x3c/0x168 [ 314.288675] do_idle+0x1fc/0x230 [ 314.291895] cpu_startup_entry+0x3c/0x50 [ 314.295810] rest_init+0xe4/0xf0 [ 314.299030] start_kernel+0x5e8/0x790 [ 314.302684] __primary_switched+0x80/0x90 [ 314.306691] Code: 940029eb 360ffc13 f9442ea0 52800001 (f9406017) [ 314.312775] ---[ end trace 0000000000000000 ]--- [ 314.317384] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 314.324249] SMP: stopping secondary CPUs [ 314.328167] Kernel Offset: 0x2b9da00000 from 0xffffffc080000000 [ 314.334076] PHYS_OFFSET: 0x0 [ 314.336946] CPU features: 0x08,00002013,c0200000,0200421b [ 314.342337] Memory Limit: none [ 314.345382] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- Before resetting the GPU, it's necessary to disable all interrupts and deal with any interrupt handler still in-flight. Otherwise, the GPU might reset with jobs still running, or yet, an interrupt could be handled during the reset. Cc: stable@vger.kernel.org Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+") Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://lore.kernel.org/r/20250628224243.47599-1-mcanal@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
13 daysdrm/gem: Acquire references on GEM handles for framebuffersThomas Zimmermann
A GEM handle can be released while the GEM buffer object is attached to a DRM framebuffer. This leads to the release of the dma-buf backing the buffer object, if any. [1] Trying to use the framebuffer in further mode-setting operations leads to a segmentation fault. Most easily happens with driver that use shadow planes for vmap-ing the dma-buf during a page flip. An example is shown below. [ 156.791968] ------------[ cut here ]------------ [ 156.796830] WARNING: CPU: 2 PID: 2255 at drivers/dma-buf/dma-buf.c:1527 dma_buf_vmap+0x224/0x430 [...] [ 156.942028] RIP: 0010:dma_buf_vmap+0x224/0x430 [ 157.043420] Call Trace: [ 157.045898] <TASK> [ 157.048030] ? show_trace_log_lvl+0x1af/0x2c0 [ 157.052436] ? show_trace_log_lvl+0x1af/0x2c0 [ 157.056836] ? show_trace_log_lvl+0x1af/0x2c0 [ 157.061253] ? drm_gem_shmem_vmap+0x74/0x710 [ 157.065567] ? dma_buf_vmap+0x224/0x430 [ 157.069446] ? __warn.cold+0x58/0xe4 [ 157.073061] ? dma_buf_vmap+0x224/0x430 [ 157.077111] ? report_bug+0x1dd/0x390 [ 157.080842] ? handle_bug+0x5e/0xa0 [ 157.084389] ? exc_invalid_op+0x14/0x50 [ 157.088291] ? asm_exc_invalid_op+0x16/0x20 [ 157.092548] ? dma_buf_vmap+0x224/0x430 [ 157.096663] ? dma_resv_get_singleton+0x6d/0x230 [ 157.101341] ? __pfx_dma_buf_vmap+0x10/0x10 [ 157.105588] ? __pfx_dma_resv_get_singleton+0x10/0x10 [ 157.110697] drm_gem_shmem_vmap+0x74/0x710 [ 157.114866] drm_gem_vmap+0xa9/0x1b0 [ 157.118763] drm_gem_vmap_unlocked+0x46/0xa0 [ 157.123086] drm_gem_fb_vmap+0xab/0x300 [ 157.126979] drm_atomic_helper_prepare_planes.part.0+0x487/0xb10 [ 157.133032] ? lockdep_init_map_type+0x19d/0x880 [ 157.137701] drm_atomic_helper_commit+0x13d/0x2e0 [ 157.142671] ? drm_atomic_nonblocking_commit+0xa0/0x180 [ 157.147988] drm_mode_atomic_ioctl+0x766/0xe40 [...] [ 157.346424] ---[ end trace 0000000000000000 ]--- Acquiring GEM handles for the framebuffer's GEM buffer objects prevents this from happening. The framebuffer's cleanup later puts the handle references. Commit 1a148af06000 ("drm/gem-shmem: Use dma_buf from GEM object instance") triggers the segmentation fault easily by using the dma-buf field more widely. The underlying issue with reference counting has been present before. v2: - acquire the handle instead of the BO (Christian) - fix comment style (Christian) - drop the Fixes tag (Christian) - rename err_ gotos - add missing Link tag Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://elixir.bootlin.com/linux/v6.15/source/drivers/gpu/drm/drm_gem.c#L241 # [1] Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Anusha Srivatsa <asrivats@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: <stable@vger.kernel.org> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250630084001.293053-1-tzimmermann@suse.de
13 daysdrm/sched: Increment job count before swapping tail spsc queueMatthew Brost
A small race exists between spsc_queue_push and the run-job worker, in which spsc_queue_push may return not-first while the run-job worker has already idled due to the job count being zero. If this race occurs, job scheduling stops, leading to hangs while waiting on the job’s DMA fences. Seal this race by incrementing the job count before appending to the SPSC queue. This race was observed on a drm-tip 6.16-rc1 build with the Xe driver in an SVM test case. Fixes: 1b1f42d8fde4 ("drm: move amd_gpu_scheduler into common location") Fixes: 27105db6c63a ("drm/amdgpu: Add SPSC queue to scheduler.") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250613212013.719312-1-matthew.brost@intel.com
2025-06-30drm/vmwgfx: Fix guests running with TDX/SEVMarko Kiiskila
Commit 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") changed the default behavior of memremap(MEMREMAP_WB) and started mapping memory as encrypted. The driver requires the fifo memory to be decrypted to communicate with the host but was relaying on the old default behavior of memremap(MEMREMAP_WB) and thus broke. Fix it by explicitly specifying the desired behavior and passing MEMREMAP_DEC to memremap. Fixes: 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") Signed-off-by: Marko Kiiskila <marko.kiiskila@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20250618192926.1092450-1-zack.rusin@broadcom.com
2025-06-30drm/bridge: aux-hpd-bridge: fix assignment of the of_nodeDmitry Baryshkov
Perform fix similar to the one in the commit 85e444a68126 ("drm/bridge: Fix assignment of the of_node of the parent to aux bridge"). The assignment of the of_node to the aux HPD bridge needs to mark the of_node as reused, otherwise driver core will attempt to bind resources like pinctrl, which is going to fail as corresponding pins are already marked as used by the parent device. Fix that by using the device_set_of_node_from_dev() helper instead of assigning it directly. Fixes: e560518a6c2e ("drm/bridge: implement generic DP HPD bridge") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250608-fix-aud-hpd-bridge-v1-1-4641a6f8e381@oss.qualcomm.com
2025-06-30drm/bridge: panel: move prepare_prev_first handling to ↵Dmitry Baryshkov
drm_panel_bridge_add_typed The commit 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") and commit 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") added handling of panel's prepare_prev_first to devm_panel_bridge_add() and drmm_panel_bridge_add(). However if the driver calls drm_panel_bridge_add_typed() directly, then the flag won't be handled and thus the drm_bridge.pre_enable_prev_first will not be set. Move prepare_prev_first handling to the drm_panel_bridge_add_typed() so that there is no way to miss the flag. Fixes: 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") Fixes: 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") Reported-by: Svyatoslav Ryhel <clamor95@gmail.com> Closes: https://lore.kernel.org/dri-devel/CAPVz0n3YZass3Bns1m0XrFxtAC0DKbEPiW6vXimQx97G243sXw@mail.gmail.com/ Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250220-panel_prev_first-v1-1-b9e787825a1a@linaro.org
2025-06-30drm/ttm: fix error handling in ttm_buffer_object_transferChristian König
Unlocking the resv object was missing in the error path, additionally to that we should move over the resource only after the fence slot was reserved. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Fixes: c8d4c18bfbc4a ("dma-buf/drivers: make reserving a shared slot mandatory v4") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250616130726.22863-3-christian.koenig@amd.com
2025-06-30dma-buf: fix timeout handling in dma_resv_wait_timeout v2Christian König
Even the kerneldoc says that with a zero timeout the function should not wait for anything, but still return 1 to indicate that the fences are signaled now. Unfortunately that isn't what was implemented, instead of only returning 1 we also waited for at least one jiffies. Fix that by adjusting the handling to what the function is actually documented to do. v2: improve code readability Reported-by: Marek Olšák <marek.olsak@amd.com> Reported-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250129105841.1806-1-christian.koenig@amd.com
2025-06-27drm/vesadrm: Avoid NULL-ptr deref in vesadrm_pmi_cmap_write()Thomas Zimmermann
Only set PMI fields if the screen_info's Vesa PM segment has been set. Vesa PMI is the power-management interface. It also provides means to set the color palette. The interface is optional, so not all VESA graphics cards support it. Print vesafb's warning [1] if the hardware palette cannot be set at all. If unsupported the field PrimaryPalette in struct vesadrm.pmi is NULL, which results in a segmentation fault. Happens with qemu's Cirrus emulation. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 814d270b31d2 ("drm/sysfb: vesadrm: Add gamma correction") Link: https://elixir.bootlin.com/linux/v6.15/source/drivers/video/fbdev/vesafb.c#L375 # 1 Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Acked-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/r/20250617140944.142392-1-tzimmermann@suse.de
2025-06-27drm/panel: panel-simple: get rid of panel_dpi hackMaxime Ripard
The empty panel_dpi struct was only ever used as a discriminant, but it's kind of a hack, and with the reworks done in the previous patches, we shouldn't need it anymore. Let's get rid of it. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-5-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/panel: panel-simple: Add function to look panel data upMaxime Ripard
Commit de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") moved the call to drm_panel_init into the devm_drm_panel_alloc(), which needs a connector type to initialize properly. In the panel-dpi compatible case, the passed panel_desc structure is an empty one used as a discriminant, and the connector type it contains isn't actually initialized. It is initialized through a call to panel_dpi_probe() later in the function, which used to be before the call to drm_panel_init() that got merged into devm_drm_panel_alloc(). So, we do need a proper panel_desc pointer before the call to devm_drm_panel_alloc() now. All cases associate their panel_desc with the panel compatible and use of_device_get_match_data, except for the panel-dpi compatible. In that case, we're expected to call panel_dpi_probe, which will allocate and initialize the panel_desc for us. Let's create such a helper function that would be called first in the driver and will lookup the desc by compatible, or allocate one if relevant. Reported-by: Francesco Dolcini <francesco@dolcini.it> Closes: https://lore.kernel.org/all/20250612081834.GA248237@francesco-nb/ Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-4-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/panel: panel-simple: Make panel_simple_probe return its panelMaxime Ripard
In order to fix the regession introduced by commit de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()"), we need to move the panel_desc lookup into the common panel_simple_probe() function. There's two callers for that function, the probe implementations of the platform and MIPI-DSI drivers panel-simple implements. The MIPI-DSI driver's probe will need to access the current panel_desc to initialize properly, which won't be possible anymore if we make that lookup in panel_simple_probe(). However, we can make panel_simple_probe() return the initialized panel_simple structure it allocated, which will contain a pointer to the associated panel_desc in its desc field. This doesn't fix de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") still, but makes progress towards that goal. Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-3-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/panel: panel-simple: make panel_dpi_probe return a panel_descMaxime Ripard
If the panel-simple driver is probed from a panel-dpi compatible, the driver will use an empty panel_desc structure as a descriminant. It will then allocate and fill another panel_desc as part of its probe. However, that allocation needs to happen after the panel_simple structure has been allocated, since panel_dpi_probe(), the function doing the panel_desc allocation and initialization, takes a panel_simple pointer as an argument. This pointer is used to fill the panel_simple->desc pointer that is still initialized with the empty panel_desc when panel_dpi_probe() is called. Since commit de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()"), we will need the panel connector type found in panel_desc to allocate panel_simple. This creates a circular dependency where we need panel_desc to create panel_simple, and need panel_simple to create panel_desc. Let's break that dependency by making panel_dpi_probe simply return the panel_desc it initialized and move the panel_simple->desc assignment to the caller. This will not fix the breaking commit entirely, but will move us towards the right direction. Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-2-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/mipi-dsi: Add dev_is_mipi_dsi functionMaxime Ripard
This will be especially useful for generic panels (like panel-simple) which can take different code path depending on if they are MIPI-DSI devices or platform devices. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-1-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-25drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector typeJayesh Choudhary
By default, HPD was disabled on SN65DSI86 bridge. When the driver was added (commit "a095f15c00e27"), the HPD_DISABLE bit was set in pre-enable call which was moved to other function calls subsequently. Later on, commit "c312b0df3b13" added detect utility for DP mode. But with HPD_DISABLE bit set, all the HPD events are disabled[0] and the debounced state always return 1 (always connected state). Set HPD_DISABLE bit conditionally based on display sink's connector type. Since the HPD_STATE is reflected correctly only after waiting for debounce time (~100-400ms) and adding this delay in detect() is not feasible owing to the performace impact (glitches and frame drop), remove runtime calls in detect() and add hpd_enable()/disable() bridge hooks with runtime calls, to detect hpd properly without any delay. [0]: <https://www.ti.com/lit/gpn/SN65DSI86> (Pg. 32) Fixes: c312b0df3b13 ("drm/bridge: ti-sn65dsi86: Implement bridge connector operations for DP") Cc: Max Krummenacher <max.krummenacher@toradex.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250624044835.165708-1-j-choudhary@ti.com
2025-06-24drm/bridge-connector: Fix bridge in drm_connector_hdmi_audio_init()Chaoyi Chen
The bridge used in drm_connector_hdmi_audio_init() does not correctly point to the required audio bridge, which lead to incorrect audio configuration input. Fixes: 231adeda9f67 ("drm/bridge-connector: hook DisplayPort audio support") Signed-off-by: Chaoyi Chen <chaoyi.chen@rock-chips.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Tested-by: Stephan Gerhold <stephan.gerhold@linaro.org> Link: https://lore.kernel.org/r/20250620011616.118-1-kernel@airkyi.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-06-23drm: writeback: Fix drm_writeback_connector_cleanup signatureLouis Chauvet
The drm_writeback_connector_cleanup have the signature: static void drm_writeback_connector_cleanup( struct drm_device *dev, struct drm_writeback_connector *wb_connector) But it is stored and used as a drmres_release_t typedef void (*drmres_release_t)(struct drm_device *dev, void *res); While the current code is valid and does not produce any warning, the CFI runtime check (CONFIG_CFI_CLANG) can fail because the function signature is not the same as drmres_release_t. In order to fix this, change the function signature to match what is expected by drmres_release_t. Fixes: 1914ba2b91ea ("drm: writeback: Create drmm variants for drm_writeback_connector initialization") Suggested-by: Mark Yacoub <markyacoub@google.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Link: https://lore.kernel.org/r/20250429-drm-fix-writeback-cleanup-v2-1-548ff3a4e284@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-06-16drm/etnaviv: Protect the scheduler's pending list with its lockMaíra Canal
Commit 704d3d60fec4 ("drm/etnaviv: don't block scheduler when GPU is still active") ensured that active jobs are returned to the pending list when extending the timeout. However, it didn't use the pending list's lock to manipulate the list, which causes a race condition as the scheduler's workqueues are running. Hold the lock while manipulating the scheduler's pending list to prevent a race. Cc: stable@vger.kernel.org Fixes: 704d3d60fec4 ("drm/etnaviv: don't block scheduler when GPU is still active") Reported-by: Philipp Stanner <phasta@kernel.org> Closes: https://lore.kernel.org/dri-devel/964e59ba1539083ef29b06d3c78f5e2e9b138ab8.camel@mailbox.org/ Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250602132240.93314-1-mcanal@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-06-16drm/v3d: Avoid NULL pointer dereference in `v3d_job_update_stats()`Maíra Canal
The following kernel Oops was recently reported by Mesa CI: [ 800.139824] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000588 [ 800.148619] Mem abort info: [ 800.151402] ESR = 0x0000000096000005 [ 800.155141] EC = 0x25: DABT (current EL), IL = 32 bits [ 800.160444] SET = 0, FnV = 0 [ 800.163488] EA = 0, S1PTW = 0 [ 800.166619] FSC = 0x05: level 1 translation fault [ 800.171487] Data abort info: [ 800.174357] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 800.179832] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 800.184873] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 800.190176] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001014c2000 [ 800.196607] [0000000000000588] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 800.205305] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP [ 800.211564] Modules linked in: vc4 snd_soc_hdmi_codec drm_display_helper v3d cec gpu_sched drm_dma_helper drm_shmem_helper drm_kms_helper drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm i2c_brcmstb snd_timer snd backlight [ 800.234448] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.25+rpt-rpi-v8 #1 Debian 1:6.12.25-1+rpt1 [ 800.244182] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT) [ 800.250005] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 800.256959] pc : v3d_job_update_stats+0x60/0x130 [v3d] [ 800.262112] lr : v3d_job_update_stats+0x48/0x130 [v3d] [ 800.267251] sp : ffffffc080003e60 [ 800.270555] x29: ffffffc080003e60 x28: ffffffd842784980 x27: 0224012000000000 [ 800.277687] x26: ffffffd84277f630 x25: ffffff81012fd800 x24: 0000000000000020 [ 800.284818] x23: ffffff8040238b08 x22: 0000000000000570 x21: 0000000000000158 [ 800.291948] x20: 0000000000000000 x19: ffffff8040238000 x18: 0000000000000000 [ 800.299078] x17: ffffffa8c1bd2000 x16: ffffffc080000000 x15: 0000000000000000 [ 800.306208] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 800.313338] x11: 0000000000000040 x10: 0000000000001a40 x9 : ffffffd83b39757c [ 800.320468] x8 : ffffffd842786420 x7 : 7fffffffffffffff x6 : 0000000000ef32b0 [ 800.327598] x5 : 00ffffffffffffff x4 : 0000000000000015 x3 : ffffffd842784980 [ 800.334728] x2 : 0000000000000004 x1 : 0000000000010002 x0 : 000000ba4c0ca382 [ 800.341859] Call trace: [ 800.344294] v3d_job_update_stats+0x60/0x130 [v3d] [ 800.349086] v3d_irq+0x124/0x2e0 [v3d] [ 800.352835] __handle_irq_event_percpu+0x58/0x218 [ 800.357539] handle_irq_event+0x54/0xb8 [ 800.361369] handle_fasteoi_irq+0xac/0x240 [ 800.365458] handle_irq_desc+0x48/0x68 [ 800.369200] generic_handle_domain_irq+0x24/0x38 [ 800.373810] gic_handle_irq+0x48/0xd8 [ 800.377464] call_on_irq_stack+0x24/0x58 [ 800.381379] do_interrupt_handler+0x88/0x98 [ 800.385554] el1_interrupt+0x34/0x68 [ 800.389123] el1h_64_irq_handler+0x18/0x28 [ 800.393211] el1h_64_irq+0x64/0x68 [ 800.396603] default_idle_call+0x3c/0x168 [ 800.400606] do_idle+0x1fc/0x230 [ 800.403827] cpu_startup_entry+0x40/0x50 [ 800.407742] rest_init+0xe4/0xf0 [ 800.410962] start_kernel+0x5e8/0x790 [ 800.414616] __primary_switched+0x80/0x90 [ 800.418622] Code: 8b170277 8b160296 11000421 b9000861 (b9401ac1) [ 800.424707] ---[ end trace 0000000000000000 ]--- [ 800.457313] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- This issue happens when the file descriptor is closed before the jobs submitted by it are completed. When the job completes, we update the global GPU stats and the per-fd GPU stats, which are exposed through fdinfo. If the file descriptor was closed, then the struct `v3d_file_priv` and its stats were already freed and we can't update the per-fd stats. Therefore, if the file descriptor was already closed, don't update the per-fd GPU stats, only update the global ones. Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://lore.kernel.org/r/20250602151451.10161-1-mcanal@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-06-13Documentation: nouveau: Update GSP message queue kernel-doc referenceBagas Sanjaya
GSP message queue docs has been moved following RPC handling split in commit 8a8b1ec5261f20 ("drm/nouveau/gsp: split rpc handling out on its own"), before GSP-RM implementation is versioned in commit c472d828348caf ("drm/nouveau/gsp: move subdev/engine impls to subdev/gsp/rm/r535/"). However, the kernel-doc reference in nouveau docs is left behind, which triggers htmldocs warnings: ERROR: Cannot find file ./drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c WARNING: No kernel-doc for file ./drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c Update the reference. Fixes: c472d828348c ("drm/nouveau/gsp: move subdev/engine impls to subdev/gsp/rm/r535/") Fixes: 8a8b1ec5261f ("drm/nouveau/gsp: split rpc handling out on its own") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250611020805.22418-2-bagasdotme@gmail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13drm/nouveau/bl: increase buffer size to avoid truncate warningJacob Keller
The nouveau_get_backlight_name() function generates a unique name for the backlight interface, appending an id from 1 to 99 for all backlight devices after the first. GCC 15 (and likely other compilers) produce the following -Wformat-truncation warning: nouveau_backlight.c: In function ‘nouveau_backlight_init’: nouveau_backlight.c:56:69: error: ‘%d’ directive output may be truncated writing between 1 and 10 bytes into a region of size 3 [-Werror=format-truncation=] 56 | snprintf(backlight_name, BL_NAME_SIZE, "nv_backlight%d", nb); | ^~ In function ‘nouveau_get_backlight_name’, inlined from ‘nouveau_backlight_init’ at nouveau_backlight.c:351:7: nouveau_backlight.c:56:56: note: directive argument in the range [1, 2147483647] 56 | snprintf(backlight_name, BL_NAME_SIZE, "nv_backlight%d", nb); | ^~~~~~~~~~~~~~~~ nouveau_backlight.c:56:17: note: ‘snprintf’ output between 14 and 23 bytes into a destination of size 15 56 | snprintf(backlight_name, BL_NAME_SIZE, "nv_backlight%d", nb); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The warning started appearing after commit ab244be47a8f ("drm/nouveau: Fix a potential theorical leak in nouveau_get_backlight_name()") This fix for the ida usage removed the explicit value check for ids larger than 99. The compiler is unable to intuit that the ida_alloc_max() limits the returned value range between 0 and 99. Because the compiler can no longer infer that the number ranges from 0 to 99, it thinks that it could use as many as 11 digits (10 + the potential - sign for negative numbers). The warning has gone unfixed for some time, with at least one kernel test robot report. The code breaks W=1 builds, which is especially frustrating with the introduction of CONFIG_WERROR. The string is stored temporarily on the stack and then copied into the device name. Its not a big deal to use 11 more bytes of stack rounding out to an even 24 bytes. Increase BL_NAME_SIZE to 24 to avoid the truncation warning. This fixes the W=1 builds that include this driver. Compile tested only. Fixes: ab244be47a8f ("drm/nouveau: Fix a potential theorical leak in nouveau_get_backlight_name()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312050324.0kv4PnfZ-lkp@intel.com/ Suggested-by: Timur Tabi <ttabi@nvidia.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20250610-jk-nouveua-drm-bl-snprintf-fix-v2-1-7fdd4b84b48e@intel.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13drm/nouveau: fix a use-after-free in r535_gsp_rpc_push()Zhi Wang
The RPC container is released after being passed to r535_gsp_rpc_send(). When sending the initial fragment of a large RPC and passing the caller's RPC container, the container will be freed prematurely. Subsequent attempts to send remaining fragments will therefore result in a use-after-free. Allocate a temporary RPC container for holding the initial fragment of a large RPC when sending. Free the caller's container when all fragments are successfully sent. Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Zhi Wang <zhiw@nvidia.com> Link: https://lore.kernel.org/r/20250527163712.3444-1-zhiw@nvidia.com [ Rebase onto Blackwell changes. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13drm/nouveau/gsp: Fix potential integer overflow on integer shiftsColin Ian King
The left shift int 32 bit integer constants 1 is evaluated using 32 bit arithmetic and then assigned to a 64 bit unsigned integer. In the case where the shift is 32 or more this can lead to an overflow. Avoid this by shifting using the BIT_ULL macro instead. Fixes: 6c3ac7bcfcff ("drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250522131512.2768310-1-colin.i.king@gmail.com
2025-06-13drm/mgag200: Do not include <linux/export.h>Thomas Zimmermann
Fix the compile-time warning drivers/gpu/drm/mgag200/mgag200_ddc.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250612085308.203861-1-tzimmermann@suse.de
2025-06-13drm/ast: Do not include <linux/export.h>Thomas Zimmermann
Fix the compile-time warning drivers/gpu/drm/ast/ast_mode.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250612084257.200907-1-tzimmermann@suse.de
2025-06-12drm/arm/malidp: Silence informational messageAlexander Stein
When checking for unsupported expect an error is printed every time. This spams the log for platforms where this is expected, e.g. ls1028a having a Vivante (etnaviv) GPU and Mali display processor. Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://lore.kernel.org/r/20250523064042.3275926-1-alexander.stein@ew.tq-group.com
2025-06-12drm/ssd130x: fix ssd132x_clear_screen() columnsJohn Keeping
The number of columns relates to the width, not the height. Use the correct variable. Signed-off-by: John Keeping <jkeeping@inmusicbrands.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Fixes: fdd591e00a9c ("drm/ssd130x: Add support for the SSD132x OLED controller family") Link: https://lore.kernel.org/r/20250611111307.1814876-1-jkeeping@inmusicbrands.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-06-11udmabuf: use sgtable-based scatterlist wrappersMarek Szyprowski
Use common wrappers operating directly on the struct sg_table objects to fix incorrect use of scatterlists sync calls. dma_sync_sg_for_*() functions have to be called with the number of elements originally passed to dma_map_sg_*() function, not the one returned in sgtable's nents. Fixes: 1ffe09590121 ("udmabuf: fix dma-buf cpu access") CC: stable@vger.kernel.org Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250507160913.2084079-3-m.szyprowski@samsung.com
2025-06-11dma-buf: fix compare in WARN_ON_ONCEChristian König
Smatch pointed out this trivial typo: drivers/dma-buf/dma-buf.c:1123 dma_buf_map_attachment() warn: passing positive error code '16' to 'ERR_PTR' drivers/dma-buf/dma-buf.c 1113 dma_resv_assert_held(attach->dmabuf->resv); 1114 1115 if (dma_buf_pin_on_map(attach)) { 1116 ret = attach->dmabuf->ops->pin(attach); 1117 /* 1118 * Catch exporters making buffers inaccessible even when 1119 * attachments preventing that exist. 1120 */ 1121 WARN_ON_ONCE(ret == EBUSY); ^^^^^ This was probably intended to be -EBUSY? 1122 if (ret) --> 1123 return ERR_PTR(ret); ^^^ Otherwise we will eventually crash. 1124 } 1125 1126 sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction); 1127 if (!sg_table) 1128 sg_table = ERR_PTR(-ENOMEM); 1129 if (IS_ERR(sg_table)) 1130 goto error_unpin; 1131 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/r/20250605085336.62156-1-christian.koenig@amd.com
2025-06-11drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERSNathan Chancellor
This driver requires of_get_display_timing() from CONFIG_VIDEOMODE_HELPERS but does not select it. If no other driver selects it, there will be a failure from the linker if the driver is built in or modpost if it is a module. ERROR: modpost: "of_get_display_timing" [drivers/gpu/drm/sitronix/st7571-i2c.ko] undefined! Select CONFIG_VIDEOMODE_HELPERS to resolve the build failure. Fixes: 4b35f0f41ee2 ("drm/st7571-i2c: add support for Sitronix ST7571 LCD controller") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250610-drm-st7571-i2c-select-videomode-helpers-v1-1-d30b50ff6e64@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-10drm/meson: fix more rounding issues with 59.94Hz modesMartin Blumenstingl
Commit 1017560164b6 ("drm/meson: use unsigned long long / Hz for frequency types") attempts to resolve video playback using 59.94Hz. using YUV420 by changing the clock calculation to use Hz instead of kHz (thus yielding more precision). The basic calculation itself is correct, however the comparisions in meson_vclk_vic_supported_freq() and meson_vclk_setup() don't work anymore for 59.94Hz modes (using the freq * 1000 / 1001 logic). For example, drm/edid specifies a 593407kHz clock for 3840x2160@59.94Hz. With the mentioend commit we convert this to Hz. Then meson_vclk tries to find a matchig "params" entry (as the clock setup code currently only supports specific frequencies) by taking the venc_freq from the params and calculating the "alt frequency" (used for the 59.94Hz modes) from it, which is: (594000000Hz * 1000) / 1001 = 593406593Hz Similar calculation is applied to the phy_freq (TMDS clock), which is 10 times the pixel clock. Implement a new meson_vclk_freqs_are_matching_param() function whose purpose is to compare if the requested and calculated frequencies. They may not match exactly (for the reasons mentioned above). Allow the clocks to deviate slightly to make the 59.94Hz modes again. Fixes: 1017560164b6 ("drm/meson: use unsigned long long / Hz for frequency types") Reported-by: Christian Hewitt <christianshewitt@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250609202751.962208-1-martin.blumenstingl@googlemail.com
2025-06-10drm/meson: use vclk_freq instead of pixel_freq in debug printMartin Blumenstingl
meson_vclk_vic_supported_freq() has a debug print which includes the pixel freq. However, within the whole function the pixel freq is irrelevant, other than checking the end of the params array. Switch to printing the vclk_freq which is being compared / matched against the inputs to the function to avoid confusion when analyzing error reports from users. Fixes: e5fab2ec9ca4 ("drm/meson: vclk: add support for YUV420 setup") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250606221031.3419353-1-martin.blumenstingl@googlemail.com
2025-06-10drm/meson: fix debug log statement when setting the HDMI clocksMartin Blumenstingl
The "phy" and "vclk" frequency labels were swapped, making it more difficult to debug driver errors. Swap the label order to make them match with the actual frequencies printed to correct this. Fixes: e5fab2ec9ca4 ("drm/meson: vclk: add support for YUV420 setup") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250606203729.3311592-1-martin.blumenstingl@googlemail.com
2025-06-10drm/vc4: fix infinite EPROBE_DEFER loopGabriel Dalimonte
`vc4_hdmi_audio_init` calls `devm_snd_dmaengine_pcm_register` which may return EPROBE_DEFER. Calling `drm_connector_hdmi_audio_init` adds a child device. The driver model docs[1] state that adding a child device prior to returning EPROBE_DEFER may result in an infinite loop. [1] https://www.kernel.org/doc/html/v6.14/driver-api/driver-model/driver.html Fixes: 9640f1437a88 ("drm/vc4: hdmi: switch to using generic HDMI Codec infrastructure") Signed-off-by: Gabriel Dalimonte <gabriel.dalimonte@gmail.com> Link: https://lore.kernel.org/r/20250601-vc4-audio-inf-probe-v2-1-9ad43c7b6147@gmail.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-09accel/amdxdna: Fix incorrect PSP firmware sizeLizhi Hou
The incorrect PSP firmware size is used for initializing. It may cause error for newer version firmware. Fixes: 8c9ff1b181ba ("accel/amdxdna: Add a new driver for AMD AI Engine") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250604143217.1386272-1-lizhi.hou@amd.com
2025-06-08Linux 6.16-rc1v6.16-rc1Linus Torvalds
2025-06-08Merge tag 'turbostat-2025.06.08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - Add initial DMR support, which required smarter RAPL probe - Fix AMD MSR RAPL energy reporting - Add RAPL power limit configuration output - Minor fixes * tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: version 2025.06.08 tools/power turbostat: Add initial support for BartlettLake tools/power turbostat: Add initial support for DMR tools/power turbostat: Dump RAPL sysfs info tools/power turbostat: Avoid probing the same perf counters tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared tools/power turbostat: Clean up add perf/msr counter logic tools/power turbostat: Introduce add_msr_counter() tools/power turbostat: Remove add_msr_perf_counter_() tools/power turbostat: Remove add_cstate_perf_counter_() tools/power turbostat: Remove add_rapl_perf_counter_() tools/power turbostat: Quit early for unsupported RAPL counters tools/power turbostat: Always check rapl_joules flag tools/power turbostat: Fix AMD package-energy reporting tools/power turbostat: Fix RAPL_GFX_ALL typo tools/power turbostat: Add Android support for MSR device handling tools/power turbostat.8: pm_domain wording fix tools/power turbostat.8: fix typo: idle_pct should be pct_idle
2025-06-08Merge tag 'timers-cleanups-2025-06-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanup from Thomas Gleixner: "The delayed from_timer() API cleanup: The renaming to the timer_*() namespace was delayed due massive conflicts against Linux-next. Now that everything is upstream finish the conversion" * tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide, timers: Rename from_timer() to timer_container_of()
2025-06-08Merge tag 'x86-urgent-2025-06-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of x86 fixes: - Cure IO bitmap inconsistencies A failed fork cleans up all resources of the newly created thread via exit_thread(). exit_thread() invokes io_bitmap_exit() which does the IO bitmap cleanups, which unfortunately assume that the cleanup is related to the current task, which is obviously bogus. Make it work correctly - A lockdep fix in the resctrl code removed the clearing of the command buffer in two places, which keeps stale error messages around. Bring them back. - Remove unused trace events" * tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: fs/resctrl: Restore the rdt_last_cmd_clear() calls after acquiring rdtgroup_mutex x86/iopl: Cure TIF_IO_BITMAP inconsistencies x86/fpu: Remove unused trace events