linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-11-13	drm/xe/forcewake: Improve kerneldoc	Matt Roper
	Improve the kerneldoc for forcewake a bit to give more detail about what the structures represent. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251110232017.1475869-33-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-13	drm/xe/pf: Use migration-friendly GGTT auto-provisioning	Michal Wajdeczko
	Instead of trying very hard to find the largest fair GGTT size that could be allocated for VFs on the current tile, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instance: num VFs \| GGTT space (MiB) --------+----------------- 63..57 \| 56 56..29 \| 64 28..15 \| 128 14..8 \| 256 7..4 \| 512 3..2 \| 1024 1 \| 2048 (regular PF) 1 \| 3584 (admin only PF) Note that due to FW/HW limitations we can't share all 4GiB GGTT address space with VFs, so for the larger (>7) number of the VFs the change in the outcome is happening at different points than we have in case of GuC contexts/doorbells IDs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251112124408.8094-1-michal.wajdeczko@intel.com
2025-11-13	drm/intel/bmg: Allow device ID usage with single-argument macros	Michał Winiarski
	When INTEL_BMG_G21_IDS were added as a subplatform, token concatenation operator usage was omitted, making INTEL_BMG_IDS not usable with single-argument macros. Fix that by adding the missing operator. Fixes: 78de8f876683 ("drm/xe: Handle Wa_22010954014 and Wa_14022085890 as device workarounds") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251112132220.516975-25-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add wait helper for VF FLR	Michał Winiarski
	VF FLR requires additional processing done by PF driver. The processing is done after FLR is already finished from PCIe perspective. In order to avoid a scenario where migration state transitions while PF processing is still in progress, additional synchronization point is needed. Add a helper that will be used as part of VF driver struct pci_error_handlers .reset_done() callback. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-24-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Handle VRAM migration data as part of PF control	Michał Winiarski
	Connect the helpers to allow save and restore of VRAM migration data in stop_copy / resume device state. Co-developed-by: Lukasz Laguna <lukasz.laguna@intel.com> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251112132220.516975-23-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/migrate: Add function to copy of VRAM data in chunks	Lukasz Laguna
	Introduce a new function to copy data between VRAM and sysmem objects. The existing xe_migrate_copy() is tailored for eviction and restore operations, which involves additional logic and operates on entire objects. The xe_migrate_vram_copy_chunk() allows copying chunks of data to or from a dedicated buffer object, which is essential in case of VF migration. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251112132220.516975-22-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add helper to retrieve VF's LMEM object	Lukasz Laguna
	Instead of accessing VF's lmem_obj directly, introduce a helper function to make the access more convenient. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-21-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Handle MMIO migration data as part of PF control	Michał Winiarski
	Implement the helpers and use them for save and restore of MMIO migration data in stop_copy / resume device state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-20-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Handle GGTT migration data as part of PF control	Michał Winiarski
	Connect the helpers to allow save and restore of GGTT migration data in stop_copy / resume device state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-19-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add helpers for VF GGTT migration data handling	Michał Winiarski
	In an upcoming change, the VF GGTT migration data will be handled as part of VF control state machine. Add the necessary helpers to allow the migration data transfer to/from the HW GGTT resource. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-18-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Handle GuC migration data as part of PF control	Michał Winiarski
	Connect the helpers to allow save and restore of GuC migration data in stop_copy / resume device state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-17-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Switch VF migration GuC save/restore to struct migration data	Michał Winiarski
	In upcoming changes, the GuC VF migration data will be handled as part of separate SAVE/RESTORE states in VF control state machine. Now that the data is decoupled from both guc_state debugfs and PAUSE state, we can safely remove the struct xe_gt_sriov_state_snapshot and modify the GuC save/restore functions to operate on struct xe_sriov_migration_data. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-16-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Don't save GuC VF migration data on pause	Michał Winiarski
	In upcoming changes, the GuC VF migration data will be handled as part of separate SAVE/RESTORE states in VF control state machine. Remove it from PAUSE state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-15-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Remove GuC migration data save/restore from GT debugfs	Michał Winiarski
	In upcoming changes, SR-IOV VF migration data will be extended beyond GuC data and exported to userspace using VFIO interface (with a vendor-specific variant driver) and a device-level debugfs interface. Remove the GT-level debugfs. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-14-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration	Michał Winiarski
	Contiguous PF GGTT VMAs can be scarce after creating VFs. Increase the GuC buffer cache size to 8M for PF so that we can fit GuC migration data (which currently maxes out at just over 4M) and use the cache instead of allocating fresh BOs. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-13-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe: Allow the caller to pass guc_buf_cache size	Michał Winiarski
	An upcoming change will use GuC buffer cache as a place where GuC migration data will be stored, and the memory requirement for that is larger than indirect data. Allow the caller to pass the size based on the intended usecase. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-12-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe: Add sa/guc_buf_cache sync interface	Michał Winiarski
	In upcoming changes the cached buffers are going to be used to read data produced by the GuC. Add a counterpart to flush, which synchronizes the CPU-side of suballocation with the GPU data and propagate the interface to GuC Buffer Cache. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-11-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Expose VF migration data size over debugfs	Michał Winiarski
	The size is normally used to make a decision on when to stop the device (mainly when it's in a pre_copy state). Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-10-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add minimalistic migration descriptor	Michał Winiarski
	The descriptor reuses the KLV format used by GuC and contains metadata that can be used to quickly fail migration when source is incompatible with destination. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-9-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add support for encap/decap of bitstream to/from packet	Michał Winiarski
	Add debugfs handlers for migration state and handle bitstream .read()/.write() to convert from bitstream to/from migration data packets. As descriptor/trailer are handled at this layer - add handling for both save and restore side. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-8-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add helpers for migration data packet allocation / free	Michał Winiarski
	Now that it's possible to free the packets - connect the restore handling logic with the ring. The helpers will also be used in upcoming changes that will start producing migration data packets. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-7-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add data structures and handlers for migration rings	Michał Winiarski
	Migration data is queued in a per-GT ptr_ring to decouple the worker responsible for handling the data transfer from the .read() and .write() syscalls. Add the data structures and handlers that will be used in future commits. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-6-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Add save/restore control state stubs and connect to debugfs	Michał Winiarski
	The states will be used by upcoming changes to produce (in case of save) or consume (in case of resume) the VF migration data. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-5-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Convert control state to bitmap	Michał Winiarski
	In upcoming changes, the number of states will increase as a result of introducing SAVE and RESTORE states. This means that using unsigned long as underlying storage won't work on 32-bit architectures, as we'll run out of bits. Use bitmap instead. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510231918.XlOqymLC-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-4-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe: Move migration support to device-level struct	Michał Winiarski
	Upcoming changes will allow users to control VF state and obtain its migration data with a device-level granularity (not tile/gt). Change the data structures to reflect that and move the GT-level migration init to happen after device-level init. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-3-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13	drm/xe/pf: Remove GuC version check for migration support	Michał Winiarski
	Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC version to v70.29.2"), the minimum GuC version required by the driver is v70.29.2, which should already include everything that we need for migration. Remove the version check. Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-2-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-12	drm/xe/guc: Eliminate RPa frequency caching	Sk Anirban
	Remove the cached pc->rpa_freq field and refactor RPA frequency handling to fetch values directly from hardware registers on each request. v2: Check graphics version instead of platform (Rodrigo) v3: Fix graphics version check (Badal) Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Suggested-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251112185153.3593145-6-sk.anirban@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-12	drm/xe/guc: Eliminate RPe caching for SLPC parameter handling	Sk Anirban
	RPe is runtime-determined by PCODE and caching it caused stale values, leading to incorrect GuC SLPC parameter settings. Drop the cached rpe_freq field and query fresh values from hardware on each use to ensure GuC SLPC parameters reflect current RPe. v2: Remove cached RPe frequency field (Rodrigo) v3: Remove extra variable (Vinay) Modify function name (Vinay) v4: Maintain a separate function for PVC (Rodrigo) v5: Avoid RPn update while fetching RPe frequency (Rodrigo) v6: Split platform-specific RPe comments (Vinay) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5166 Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patch.msgid.link/20251112185153.3593145-5-sk.anirban@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-12	drm/xe/pf: Allow to lockdown the PF using custom guard	Michal Wajdeczko
	Some driver components, like eudebug or ccs-mode, can't be used when VFs are enabled. Add functions to allow those components to block the PF from enabling VFs for the requested duration. Introduce trivial counter to allow lockdown or exclusive access that can be used in the scenarios where we can't follow the strict owner semantics as required by the rw_semaphore implementation. Before enabling VFs, the PF will try to arm the "vfs_enabling" guard for the exclusive access. This will fail if there are some lockdown requests already initiated by the other components. For testing purposes, add debugfs file which will call these new functions from the file's open/close hooks. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Christoph Manszewski <christoph.manszewski@intel.com> Reviewed-by: Christoph Manszewski <christoph.manszewski@intel.com> Link: https://patch.msgid.link/20251109162451.4779-1-michal.wajdeczko@intel.com
2025-11-12	drm/xe/pcode: Rework error mapping	Lucas De Marchi
	The sparse array used for error decoding from is unnecessarily big. It should be better handled by a switch statement that will also allow us to more easily improve this code. Add a CASE_ERR() macro to keep the table compact and use it instead of the 256-entries array, which saves some space: $ bloat-o-meter xe_pcode.o.old xe_pcode.o add/remove: 0/1 grow/shrink: 2/0 up/down: 190/-4096 (-3906) Function old new delta __pcode_mailbox_rw 363 465 +102 __pcode_mailbox_rw.cold 58 146 +88 err_decode 4096 - -4096 Total: Before=7890, After=3984, chg -49.51% Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251110-pcode-errmap-v2-1-cb18c8f54238@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-12	drm/xe: fix kernel-doc function name mismatch in xe_pm.c	Kriish Sharma
	Documentation build reported: WARNING: ./drivers/gpu/drm/xe/xe_pm.c:131 expecting prototype for xe_pm_might_block_on_suspend(). Prototype was for xe_pm_block_on_suspend() instead The kernel-doc comment for xe_pm_block_on_suspend() incorrectly used the function name xe_pm_might_block_on_suspend(). Fix the header to match the actual function prototype. No functional changes. Fixes: f73f6dd312a5 ("drm/xe/pm: Add lockdep annotation for the pm_block completion") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511061736.CiuroL7H-lkp@intel.com/ Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251110184206.2113830-1-kriish.sharma2006@gmail.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-10	drm/xe/pf: Add runtime registers for GFX ver >= 35	Piotr Piórkowski
	Add a dedicated runtime register list for GFX ver >= 35. Compared to the list for GFX >= 30, this variant drops HUC_KERNEL_LOAD_INFO, MIRROR_FUSE1 and adds SERVICE_COPY_ENABLE. v2: - drop MIRROR_FUSE1 register - update commit message Fixes: 5e0de2dfbc1b ("drm/xe/cri: Add CRI platform definition") Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251107211845.3633633-1-piotr.piorkowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10	drm/xe/vram: Move forcewake down to get_flat_ccs_offset()	Lucas De Marchi
	With SG_TILE_ADDR_RANGE use, the only thing requiring GT forcewake while probing for vram size is the get_flat_ccs_offset(). Move the forcewake down where it's needed. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251107-tile-addr-v1-2-a3014aadc2e7@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10	drm/xe: Use SG_TILE_ADDR_RANGE instead of TILE_ADDR_RANGE	Fei Yang
	The TILE_ADDR_RANGE register is not available on all platforms going forward as it was deprecated and is being replaced by equivalent registers within SoC MMIO space. While that doesn't happen, the SG_TILE_ADDR_RANGE (base 0x1083a0) is still valid for all platforms supported by xe. Use that instead. BSpec: 59353, 54991 Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251107-tile-addr-v1-1-a3014aadc2e7@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10	drm/xe: Fix MTL vm_max_level	Rodrigo Vivi
	MTL was broken after the vm_max_level movement. Get it back to a working value. [ 37.722413] xe 0000:00:02.0: [drm] Tile0: GT0: VM job timed out on non-killed execqueue [ 37.722465] WARNING: CPU: 0 PID: 12 at drivers/gpu/drm/xe/xe_guc_submit.c:1379 guc_exec_queue_timedout_job+0x2f3/0xe00 [xe] [ 37.722559] Modules linked in: xt_REDIRECT nft_compat nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr sunrpc bnep snd_ctl_led snd_soc_s\ of_sdw snd_soc_intel_hda_dsp_common snd_soc_sdw_utils snd_sof_probes snd_soc_rt712_sdca regmap_sdw_mbq snd_hda_codec_intelhdmi regmap_sdw snd_soc_dmic snd_hda_intel snd_sof_pci_intel_mtl iwlmvm snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink\ snd_sof_intel_hda snd_hda_codec_hdmi soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp binfmt_misc snd_sof mac80211 vfat snd_sof_utils fat snd_hda_ext_core snd_hda_codec snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_acpi snd_hwdep \ crc8 soundwire_bus libarc4 snd_soc_sdca snd_soc_core [ 37.722584] snd_compress ac97_bus uvcvideo snd_pcm_dmaengine iwlwifi snd_seq uvc videobuf2_vmalloc snd_seq_device videobuf2_memops videobuf2_v4l2 snd_pcm processor_thermal_device_pci videobuf2_common processor_thermal_device btusb intel_uncore_frequency processor_thermal_wt_hint intel_uncore_frequency_common platform_temp\ erature_control videodev btmtk spi_nor processor_thermal_soc_slider x86_pkg_temp_thermal btrtl snd_timer iTCO_wdt processor_thermal_rfim intel_powerclamp btbcm intel_pmc_bxt snd intel_rapl_msr processor_thermal_rapl coretemp iTCO_vendor_support mei_gsc_proxy btintel intel_rapl_common rapl intel_cstate cfg80211 bluetooth mc in\ tel_pmc_core mtd soundcore acer_wmi mei_me intel_uncore processor_thermal_wt_req i2c_i801 spi_intel_pci pmt_telemetry platform_profile mei processor_thermal_power_floor spi_intel i2c_smbus pmt_discovery igen6_edac pcspkr rfkill wmi_bmof idma64 processor_thermal_mbox intel_hid pmt_class int3403_thermal int3400_thermal joydev i\ nt340x_thermal_zone acpi_pad sparse_keymap [ 37.722611] intel_pmc_ssram_telemetry acpi_thermal_rel acer_wireless loop nfnetlink zram lz4hc_compress lz4_compress dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 nvme i2c_algo_bit nvme_core drm_buddy ucsi_acpi ttm typec_ucsi typec nvme_keyring nvme_auth hkdf drm_displa\ y_helper hid_multitouch polyval_clmulni thunderbolt intel_vpu ghash_clmulni_intel cec vmd i2c_hid_acpi video intel_vsec i2c_hid wmi pinctrl_meteorlake serio_raw i2c_dev fuse [ 37.722638] CPU: 0 UID: 0 PID: 12 Comm: kworker/u88:0 Not tainted 6.18.0-rc2+ #37 PREEMPT(voluntary) [ 37.722641] Hardware name: Acer Swift SFG14-72/Coral_MTH, BIOS V1.01 11/06/2023 [ 37.722643] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched] [ 37.722649] RIP: 0010:guc_exec_queue_timedout_job+0x2f3/0xe00 [xe] [ 37.722722] Code: 4c 24 10 44 89 44 24 08 e8 5a 95 f1 d4 44 8b 44 24 08 8b 4c 24 10 48 c7 c7 00 b7 25 c1 48 8b 54 24 18 48 89 c6 e8 4d 59 37 d4 <0f> 0b 80 3c 24 00 0f 85 55 03 00 00 49 8b 47 58 a8 01 75 1a 49 8b [ 37.722723] RSP: 0018:ffffd468000f7d80 EFLAGS: 00010246 [ 37.722725] RAX: 0000000000000000 RBX: ffff8e3d4e215c00 RCX: 0000000000000027 [ 37.722726] RDX: ffff8e40ae61cfc8 RSI: 0000000000000001 RDI: ffff8e40ae61cfc0 [ 37.722727] RBP: 00000000fffffffb R08: 0000000000000000 R09: ffffd468000f7c20 [ 37.722727] R10: ffff8e40c09fffa8 R11: 00000000fffbffff R12: ffff8e3d44c00028 [ 37.722728] R13: ffff8e3d807d4000 R14: ffff8e3d807d4018 R15: ffff8e3d95c9d600 [ 37.722729] FS: 0000000000000000(0000) GS:ffff8e4116110000(0000) knlGS:0000000000000000 [ 37.722729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 37.722730] CR2: 00007ff1f3e02720 CR3: 0000000113c8d005 CR4: 0000000000f70ef0 [ 37.722731] PKRU: 55555554 [ 37.722731] Call Trace: [ 37.722734] <TASK> [ 37.722735] ? __pfx_autoremove_wake_function+0x10/0x10 [ 37.722740] drm_sched_job_timedout+0x81/0x170 [gpu_sched] Fixes: 50292f9af8ec ("drm/xe: Move 'vm_max_level' flag back to platform descriptor") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251108040634.6376-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-10	drm/xe/vf: Enable VF resource fixup unconditionally	Michał Winiarski
	All the feature enabling code is in place - drop the debug flag requirement for VF resource fixup. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251107161000.1938186-1-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-07	drm/xe/tests: Add KUnit tests for PF fair provisioning	Michal Wajdeczko
	Add test cases to check outcome of fair GuC context or doorbells IDs allocations for regular and admin-only PF mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251106165932.2143-1-michal.wajdeczko@intel.com
2025-11-07	drm/xe/pf: Use migration-friendly doorbells auto-provisioning	Michal Wajdeczko
	Instead of trying very hard to find the largest fair number of GuC doorbell IDs that could be allocated for VFs on the current GT, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instance: num VFs \| num doorbells --------+-------------- 63..32 \| 4 31..16 \| 8 15..8 \| 16 7..4 \| 32 3..2 \| 64 1 \| 128 (regular PF) 1 \| 240 (admin only PF) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251105183253.863-3-michal.wajdeczko@intel.com
2025-11-07	drm/xe/pf: Use migration-friendly context IDs auto-provisioning	Michal Wajdeczko
	Instead of trying very hard to find the largest fair number of GuC context IDs that could be allocated for VFs on the current GT, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instance: num VFs \| num contexts --------+------------- 63..32 \| 1024 31..16 \| 2048 15..8 \| 4096 7..4 \| 8192 3..2 \| 16384 1 \| 32768 (regular PF) 1 \| 64512 (admin only PF) Add also helper function to determine if the PF is admin-only, and for now use .probe_display flag for that. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251105183253.863-2-michal.wajdeczko@intel.com
2025-11-06	drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpg	Nitin Gote
	Wa_15016589081 applies to Xe3_LPG renderCS Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20251106100516.318863-2-nitin.r.gote@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-05	drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons	Lucas De Marchi
	It's currently not possible to safely monitor if there's throttling happening and what are the reasons. The approach of reading the status and then reading the reasons is not reliable as by the time sysadmin reads the reason, the throttling could not be happening anymore. Previous tentative to fix that[1] was breaking the ABI and potentially sysadmin's scripts. This takes a different approach of adding and documenting the additional attribute. It's still valuable, though redundant, to provide the simpler 0/1 interface. In order to avoid userspace knowledge on the bitmask meaning and to be able to maintain the kernel side in sync with possible changes in future, just walk the attribute group and check what are the masks that match the value read. [1] https://lore.kernel.org/intel-xe/20241025092238.167042-1-raag.jadav@intel.com/ Cc: Raag Jadav <raag.jadav@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251104-gt-throttle-cri-v5-1-4948b060bbfd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-05	drm/xe: Remove never used code in xe_vm_create()	Gwan-gyeong Mun
	Clang is not happy with set but unused variable (this is visible with `make LLVM=1` build: drivers/gpu/drm/xe/xe_vm.c:1462:11: error: variable 'number_tiles' set but not used [-Werror,-Wunused-but-set-variable] The use of this variable was removed in the commit mentioned below as "Fixes:" but only its declaration and update remain. It seems like the variable is not used along with the assignment that does not have side effects as far as I can see. Remove those altogether. Fixes: cb99e12ba8cb ("drm/xe: Decouple bind queue last fence from TLB invalidations") Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patch.msgid.link/20251105011311.3177875-1-gwan-gyeong.mun@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-04	drm/xe: Remove unused GT page fault code	Matthew Brost
	With the Xe page fault layer and GuC page layer in place, this is now dead code and can be removed. ACC code is also removed, but this was dead code. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-8-matthew.brost@intel.com
2025-11-04	drm/xe: Add xe_guc_pagefault layer	Matthew Brost
	Add xe_guc_pagefault layer (producer) which parses G2H fault messages messages into struct xe_pagefault, forwards them to the page fault layer (consumer) for servicing, and provides a vfunc to acknowledge faults to the GuC upon completion. Replace the old (and incorrect) GT page fault layer with this new layer throughout the driver. As part of this change, the ACC handling code has been removed, as it is dead code that is currently unused. v2: - Include engine instance (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-7-matthew.brost@intel.com
2025-11-04	drm/xe: Implement xe_pagefault_queue_work	Matthew Brost
	Implement a worker that services page faults, using the same implementation as in xe_gt_pagefault.c. v2: - Rebase on exhaustive eviction changes - Include engine instance in debug prints (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-6-matthew.brost@intel.com
2025-11-04	drm/xe: Implement xe_pagefault_handler	Matthew Brost
	Enqueue (copy) the input struct xe_pagefault into a queue (i.e., into a memory buffer) and schedule a worker to service it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-5-matthew.brost@intel.com
2025-11-04	drm/xe: Implement xe_pagefault_reset	Matthew Brost
	Squash any pending faults on the GT being reset by setting the GT field in struct xe_pagefault to NULL. v4: - Only do reset it page faults queues initialized (CI) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-4-matthew.brost@intel.com
2025-11-04	drm/xe: Implement xe_pagefault_init	Matthew Brost
	Create pagefault queues and initialize them. v2: - Fix kernel doc + add comment for number PF queue (Francois) v4: - Move init after GT init (CI, Francois) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-3-matthew.brost@intel.com
2025-11-04	drm/xe: Stub out new pagefault layer	Matthew Brost
	Stub out the new page fault layer and add kernel documentation. This is intended as a replacement for the GT page fault layer, enabling multiple producers to hook into a shared page fault consumer interface. v2: - Fix kernel doc typo (checkpatch) - Remove comment around GT (Stuart) - Add explaination around reclaim (Francois) - Add comment around u8 vs enum (Francois) - Include engine instance (Stuart) v3: - Fix XE_PAGEFAULT_TYPE_ATOMIC_ACCESS_VIOLATION kernel doc (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-2-matthew.brost@intel.com
2025-11-04	drm/xe: Remove last fence dependency check from binds and execs	Matthew Brost
	Eliminate redundant last fence dependency checks in exec and bind jobs, as they are now equivalent to xe_exec_queue_is_idle. Simplify the code by removing this dead logic. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-7-matthew.brost@intel.com