Age | Commit message (Collapse) | Author |
|
The pKVM FF-A proxy rejects FF-A requests other than FFA_VERSION until
version negotiation is complete, which is signalled by setting the
global 'has_version_negotiated' variable.
To avoid excessive locking, this variable is checked directly from
kvm_host_ffa_handler() in response to an FF-A call, but this can race
against another CPU performing the negotiation and potentially lead to
reading a torn value (incredibly unlikely for a 'bool') or problematic
re-ordering of the accesses to 'has_version_negotiated' and
'hyp_ffa_version' whereby a stale version number could be read by
__do_ffa_mem_xfer().
Use acquire/release primitives when writing 'has_version_negotiated'
with the version lock held and when reading without the lock held.
Cc: Sebastian Ene <sebastianene@google.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: c9c012625e12 ("KVM: arm64: Trap FFA_VERSION host call in pKVM")
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250407152755.1041-1-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Commit b4b05e53b550 ("drm/xe/guc_pc: Retry and wait longer for GuC PC
start"), leads to the following Smatch static checker warning:
drivers/gpu/drm/xe/xe_guc_pc.c:1073 xe_guc_pc_start()
warn: missing error code here? '_dev_err()' failed. 'ret' = '0'
Fixes: c605acb53f44 ("drm/xe/guc_pc: Retry and wait longer for GuC PC start")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/intel-xe/1454a5f1-ee18-4df1-a6b2-a4a3dddcd1cb@stanley.mountain/
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250328181752.26677-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 3f2bdccbccdcb53b0d316474eafff2e3462a51ad)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Historically, the Vertex Fetcher unit has not been an L3 client. That
meant that, when a buffer containing vertex data was written to, it was
necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any
VF L2 cachelines associated with that buffer, so the new value would be
properly read from memory.
Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER
have included an "L3 Bypass Enable" bit which userspace drivers can set
to request that the vertex fetcher unit snoop L3. However, unlike most
true L3 clients, the "VF Cache Invalidate" bit continues to only
invalidate the VF L2 cache - and not any associated L3 lines.
To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation
Bit", which according to the docs, "controls the invalidation of the
Geometry streams cached in L3 cache at the top of the pipe." In other
words, the vertex and index buffer data that gets cached in L3 when
"L3 Bypass Disable" is set.
Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and
whenever it issues a VF Cache Invalidate, it also issues a L3 Read Only
Cache Invalidate so that both L2 and L3 vertex data is invalidated.
xe is issuing VF cache invalidates too (which handles cases like CPU
writes to a buffer between GPU batches). Because userspace may enable
L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well.
Fixes significant flickering in Firefox on Meteorlake, which was writing
to vertex buffers via the CPU between batches; the missing L3 Read Only
invalidates were causing the vertex fetcher to read stale data from L3.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4460
Fixes: 6ef3bb60557d ("drm/xe: enable lite restore")
Cc: stable@vger.kernel.org # v6.13+
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250330165923.56410-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 61672806b579dd5a150a042ec9383be2bbc2ae7e)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Building the xe driver for i386 results in a link time warning:
x86_64-linux-ld: drivers/gpu/drm/xe/xe_migrate.o: in function `xe_migrate_vram':
xe_migrate.c:(.text+0x1e15): undefined reference to `__udivdi3'
Avoid this by using DIV_U64_ROUND_UP() instead of DIV_ROUND_UP(). The driver
is unlikely to be used on 32=bit hardware, so the extra cost here is not
too important.
Fixes: 9c44fd5f6e8a ("drm/xe: Add migrate layer functions for SVM support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250324210612.2927194-1-arnd@kernel.org
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit c9092257506af4985c085103714c403812a5bdcb)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Sysfs_ops needs to be defined on all directories which
can have attr files with set/get method. Add sysfs_ops
to even those directories which is currently empty but
would have attr files with set/get method in future.
Leave .default with default sysfs_ops as it will never
have setter method.
V2(Himal/Rodrigo):
- use single sysfs_ops for all dir and attr with set/get
- add default ops as ./default does not need runtime pm at all
Fixes: 3f0e14651ab0 ("drm/xe: Runtime PM wake on every sysfs call")
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250327122647.886637-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 40780b9760b561e093508d07b8b9b06c94ab201e)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The intent of the error path in xe_migrate_clear is to wait on locally
generated fence and then return. The code is waiting on m->fence which
could be the local fence but this is only stable under the job mutex
leading to a possible UAF. Fix code to wait on local fence.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250311182915.3606291-1-matthew.brost@intel.com
(cherry picked from commit 762b7e95362170b3e13a8704f38d5e47eca4ba74)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Extend Wa_14022293748, Wa_22019794406 to Xe3_LPG
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250325224310.1455499-1-julia.filipchuk@intel.com
(cherry picked from commit 32af900f2c6b1846fd3ede8ad36dd180d7e4ae70)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The RCU_MODE_FIXED_SLICE_CCS_MODE setting is not getting invoked
in the gt reset path after the ccs_mode setting by the user.
Add it to engine register update list (in hw_engine_setup_default_state())
which ensures it gets set in the gt reset and engine reset paths.
v2: Add register update to engine list to ensure it gets updated
after engine reset also.
Fixes: 0d97ecce16bd ("drm/xe: Enable Fixed CCS mode setting")
Cc: stable@vger.kernel.org
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250327185604.18230-1-niranjana.vishwanathapura@intel.com
(cherry picked from commit 12468e519f98e4d93370712e3607fab61df9dae9)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
When the size of the range invalidated is larger than
rounddown_pow_of_two(ULONG_MAX),
The function macro roundup_pow_of_two(length) will hit an out-of-bounds
shift [1].
Use a full TLB invalidation for such cases.
v2:
- Use a define for the range size limit over which we use a full
TLB invalidation. (Lucas)
- Use a better calculation of the limit.
[1]:
[ 39.202421] ------------[ cut here ]------------
[ 39.202657] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[ 39.202673] shift exponent 64 is too large for 64-bit type 'long unsigned int'
[ 39.202688] CPU: 8 UID: 0 PID: 3129 Comm: xe_exec_system_ Tainted: G U 6.14.0+ #10
[ 39.202690] Tainted: [U]=USER
[ 39.202690] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023
[ 39.202691] Call Trace:
[ 39.202692] <TASK>
[ 39.202695] dump_stack_lvl+0x6e/0xa0
[ 39.202699] ubsan_epilogue+0x5/0x30
[ 39.202701] __ubsan_handle_shift_out_of_bounds.cold+0x61/0xe6
[ 39.202705] xe_gt_tlb_invalidation_range.cold+0x1d/0x3a [xe]
[ 39.202800] ? find_held_lock+0x2b/0x80
[ 39.202803] ? mark_held_locks+0x40/0x70
[ 39.202806] xe_svm_invalidate+0x459/0x700 [xe]
[ 39.202897] drm_gpusvm_notifier_invalidate+0x4d/0x70 [drm_gpusvm]
[ 39.202900] __mmu_notifier_release+0x1f5/0x270
[ 39.202905] exit_mmap+0x40e/0x450
[ 39.202912] __mmput+0x45/0x110
[ 39.202914] exit_mm+0xc5/0x130
[ 39.202916] do_exit+0x21c/0x500
[ 39.202918] ? lockdep_hardirqs_on_prepare+0xdb/0x190
[ 39.202920] do_group_exit+0x36/0xa0
[ 39.202922] get_signal+0x8f8/0x900
[ 39.202926] arch_do_signal_or_restart+0x35/0x100
[ 39.202930] syscall_exit_to_user_mode+0x1fc/0x290
[ 39.202932] do_syscall_64+0xa1/0x180
[ 39.202934] ? do_user_addr_fault+0x59f/0x8a0
[ 39.202937] ? lock_release+0xd2/0x2a0
[ 39.202939] ? do_user_addr_fault+0x5a9/0x8a0
[ 39.202942] ? trace_hardirqs_off+0x4b/0xc0
[ 39.202944] ? clear_bhb_loop+0x25/0x80
[ 39.202946] ? clear_bhb_loop+0x25/0x80
[ 39.202947] ? clear_bhb_loop+0x25/0x80
[ 39.202950] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 39.202952] RIP: 0033:0x7fa945e543e1
[ 39.202961] Code: Unable to access opcode bytes at 0x7fa945e543b7.
[ 39.202962] RSP: 002b:00007ffca8fb4170 EFLAGS: 00000293
[ 39.202963] RAX: 000000000000003d RBX: 0000000000000000 RCX: 00007fa945e543e3
[ 39.202964] RDX: 0000000000000000 RSI: 00007ffca8fb41ac RDI: 00000000ffffffff
[ 39.202964] RBP: 00007ffca8fb4190 R08: 0000000000000000 R09: 00007fa945f600a0
[ 39.202965] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[ 39.202966] R13: 00007fa9460dd310 R14: 00007ffca8fb41ac R15: 0000000000000000
[ 39.202970] </TASK>
[ 39.202970] ---[ end trace ]---
Fixes: 332dd0116c82 ("drm/xe: Add range based TLB invalidations")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> #v1
Link: https://lore.kernel.org/r/20250326151634.36916-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit b88f48f86500bc0b44b4f73ac66d500a40d320ad)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
If drm_gpusvm_migrate_to_devmem() succeeds, if a cpu access happens to the
range the bo may be freed before xe_bo_unlock(), causing a UAF.
Since the reference is transferred, use xe_svm_devmem_release() to
release the reference on drm_gpusvm_migrate_to_devmem() failure,
and hold a local reference to protect the UAF.
Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250326080551.40201-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit c9db07cab766b665c8fa1184649cef452f448dc8)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
One additional BMG PCI ID has been added to the spec; make sure our
driver recognizes devices with this ID properly.
Bspec: 68090
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250325224709.4073080-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit cca9734ebe55f6af11ce8d57ca1afdc4d158c808)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Pak Nin Lui <pak.lui@amd.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Similar to xe, enable some simple management of VRAM only.
Reviewed-by: Christian König <christian.koenig@amd.com>
Co-developed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation on the invalidate_tlbs path.
v2: Poll once before msleep
v3: Fix return value
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
ASICs post GFX 9 are being flagged as SDMA per queue reset supported
in the KGD but KFD and scheduler FW currently have no support.
Limit SDMA queue reset capabilities to GFX 9.
Fixes: ceb7114c961b ("drm/amdkfd: flag per-sdma queue reset supported to user space")
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
[Why]
HP Elitebook 645 has DP0 and DP1 swapped.
[How]
Add HP Elitebook 645 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3701
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
HP Probook 445 and 465 has DP0 and DP1 swapped.
[How]
Add HP Probook 445 and 465 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3995
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Anson Tsao <anson.tsao@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_validate()/dml21_validate() are not protected from their
callers, causing such errors:
do_fpu invoked from kernel context![#1]:
CPU: 10 UID: 0 PID: 331 Comm: kworker/10:1H Not tainted 6.14.0-rc6+ #4
Workqueue: events_highpri dm_irq_work_func [amdgpu]
pc ffff800003191eb0 ra ffff800003191e60 tp 9000000107a94000 sp 9000000107a975b0
a0 9000000140ce4910 a1 0000000000000000 a2 9000000140ce49b0 a3 9000000140ce49a8
a4 9000000140ce49a8 a5 0000000100000000 a6 0000000000000001 a7 9000000107a97660
t0 ffff800003790000 t1 9000000140ce5000 t2 0000000000000001 t3 0000000000000000
t4 0000000000000004 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
t8 0000000100000000 u0 ffff8000031a3b9c s9 9000000130bc0000 s0 9000000132400000
s1 9000000140ec0000 s2 9000000132400000 s3 9000000140ce0000 s4 90000000057f8b88
s5 9000000140ec0000 s6 9000000140ce4910 s7 0000000000000001 s8 9000000130d45010
ra: ffff800003191e60 dml21_map_dc_state_into_dml_display_cfg+0x40/0x1140 [amdgpu]
ERA: ffff800003191eb0 dml21_map_dc_state_into_dml_display_cfg+0x90/0x1140 [amdgpu]
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 00000004 (PPLV0 +PIE -PWE)
EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
Process kworker/10:1H (pid: 331, threadinfo=000000007bf9ddb0, task=00000000cc4ab9f3)
Stack : 0000000100000000 0000043800000780 0000000100000001 0000000100000001
0000000000000000 0000078000000000 0000000000000438 0000078000000000
0000000000000438 0000078000000000 0000000000000438 0000000100000000
0000000100000000 0000000100000000 0000000100000000 0000000100000000
0000000000000001 9000000140ec0000 9000000132400000 9000000132400000
ffff800003408000 ffff800003408000 9000000132400000 9000000140ce0000
9000000140ce0000 ffff800003193850 0000000000000001 9000000140ec0000
9000000132400000 9000000140ec0860 9000000140ec0738 0000000000000001
90000001405e8000 9000000130bc0000 9000000140ec02a8 ffff8000031b5db8
0000000000000000 0000043800000780 0000000000000003 ffff8000031b79cc
...
Call Trace:
[<ffff800003191eb0>] dml21_map_dc_state_into_dml_display_cfg+0x90/0x1140 [amdgpu]
[<ffff80000319384c>] dml21_validate+0xcc/0x520 [amdgpu]
[<ffff8000031b8948>] dc_validate_global_state+0x2e8/0x460 [amdgpu]
[<ffff800002e94034>] create_validate_stream_for_sink+0x3d4/0x420 [amdgpu]
[<ffff800002e940e4>] amdgpu_dm_connector_mode_valid+0x64/0x240 [amdgpu]
[<900000000441d6b8>] drm_connector_mode_valid+0x38/0x80
[<900000000441d824>] __drm_helper_update_and_validate+0x124/0x3e0
[<900000000441ddc0>] drm_helper_probe_single_connector_modes+0x2e0/0x620
[<90000000044050dc>] drm_client_modeset_probe+0x23c/0x1780
[<9000000004420384>] __drm_fb_helper_initial_config_and_unlock+0x44/0x5a0
[<9000000004403acc>] drm_client_dev_hotplug+0xcc/0x140
[<ffff800002e9ab50>] handle_hpd_irq_helper+0x1b0/0x1e0 [amdgpu]
[<90000000038f5da0>] process_one_work+0x160/0x300
[<90000000038f6718>] worker_thread+0x318/0x440
[<9000000003901b8c>] kthread+0x12c/0x220
[<90000000038b1484>] ret_from_kernel_thread+0x8/0xa4
Unfortunately, protecting dml2_validate()/dml21_validate() out of DML2
causes "sleeping function called from invalid context", so protect them
with DC_FP_START() and DC_FP_END() inside.
Fixes: 7da55c27e767 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Dongyan Qian <qiandongyan@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_init()/dml21_init() are not protected from their callers,
causing such errors:
do_fpu invoked from kernel context![#1]:
CPU: 0 UID: 0 PID: 239 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #2
Workqueue: events work_for_cpu_fn
pc ffff80000319de80 ra ffff80000319de5c tp 900000010575c000 sp 900000010575f840
a0 0000000000000000 a1 900000012f210130 a2 900000012f000000 a3 ffff80000357e268
a4 ffff80000357e260 a5 900000012ea52cf0 a6 0000000400000004 a7 0000012c00001388
t0 00001900000015e0 t1 ffff80000379d000 t2 0000000010624dd3 t3 0000006400000014
t4 00000000000003e8 t5 0000005000000018 t6 0000000000000020 t7 0000000f00000064
t8 000000000000002f u0 5f5e9200f8901912 s9 900000012d380010 s0 900000012ea51fd8
s1 900000012f000000 s2 9000000109296000 s3 0000000000000001 s4 0000000000001fd8
s5 0000000000000001 s6 ffff800003415000 s7 900000012d390000 s8 ffff800003211f80
ra: ffff80000319de5c dml21_apply_soc_bb_overrides+0x3c/0x960 [amdgpu]
ERA: ffff80000319de80 dml21_apply_soc_bb_overrides+0x60/0x960 [amdgpu]
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 00000004 (PPLV0 +PIE -PWE)
EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
Process kworker/0:5 (pid: 239, threadinfo=00000000927eadc6, task=000000008fd31682)
Stack : 00040dc000003164 0000000000000001 900000012f210130 900000012eabeeb8
900000012f000000 ffff80000319fe48 900000012f210000 900000012f210130
900000012f000000 900000012eabeeb8 0000000000000001 ffff8000031a0064
900000010575f9f0 900000012f210130 900000012eac0000 900000012ea80000
900000012f000000 ffff8000031cefc4 900000010575f9f0 ffff8000035859c0
ffff800003414000 900000010575fa78 900000012f000000 ffff8000031b4c50
0000000000000000 9000000101c9d700 9000000109c40000 5f5e9200f8901912
900000012d3c4bd0 900000012d3c5000 ffff8000034aed18 900000012d380010
900000012d3c4bd0 ffff800003414000 900000012d380000 ffff800002ea49dc
0000000000000001 900000012d3c6000 00000000ffffe423 0000000000010000
...
Call Trace:
[<ffff80000319de80>] dml21_apply_soc_bb_overrides+0x60/0x960 [amdgpu]
[<ffff80000319fe44>] dml21_init+0xa4/0x280 [amdgpu]
[<ffff8000031a0060>] dml21_create+0x40/0x80 [amdgpu]
[<ffff8000031cefc0>] dc_state_create+0x100/0x160 [amdgpu]
[<ffff8000031b4c4c>] dc_create+0x44c/0x640 [amdgpu]
[<ffff800002ea49d8>] amdgpu_dm_init+0x3f8/0x2060 [amdgpu]
[<ffff800002ea6658>] dm_hw_init+0x18/0x60 [amdgpu]
[<ffff800002b16738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
[<ffff800002b18e80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
[<ffff800002b0c8f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
[<900000000448eae4>] local_pci_probe+0x44/0xc0
[<9000000003b02b18>] work_for_cpu_fn+0x18/0x40
[<9000000003b05da0>] process_one_work+0x160/0x300
[<9000000003b06718>] worker_thread+0x318/0x440
[<9000000003b11b8c>] kthread+0x12c/0x220
[<9000000003ac1484>] ret_from_kernel_thread+0x8/0xa4
Unfortunately, protecting dml2_init()/dml21_init() out of DML2 causes
"sleeping function called from invalid context", so protect them with
DC_FP_START() and DC_FP_END() inside.
Fixes: 7da55c27e767 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml21_copy() are not protected from their callers, causing such
errors:
do_fpu invoked from kernel context![#1]:
CPU: 0 UID: 0 PID: 240 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #1
Workqueue: events work_for_cpu_fn
pc ffff80000318bd2c ra ffff80000315750c tp 9000000105910000 sp 9000000105913810
a0 0000000000000000 a1 0000000000000002 a2 900000013140d728 a3 900000013140d720
a4 0000000000000000 a5 9000000131592d98 a6 0000000000017ae8 a7 00000000001312d0
t0 9000000130751ff0 t1 ffff800003790000 t2 ffff800003790000 t3 9000000131592e28
t4 000000000004c6a8 t5 00000000001b7740 t6 0000000000023e38 t7 0000000000249f00
t8 0000000000000002 u0 0000000000000000 s9 900000012b010000 s0 9000000131400000
s1 9000000130751fd8 s2 ffff800003408000 s3 9000000130752c78 s4 9000000131592da8
s5 9000000131592120 s6 9000000130751ff0 s7 9000000131592e28 s8 9000000131400008
ra: ffff80000315750c dml2_top_soc15_initialize_instance+0x20c/0x300 [amdgpu]
ERA: ffff80000318bd2c mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 00000004 (PPLV0 +PIE -PWE)
EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
Process kworker/0:5 (pid: 240, threadinfo=00000000f1700428, task=0000000020d2e962)
Stack : 0000000000000000 0000000000000000 0000000000000000 9000000130751fd8
9000000131400000 ffff8000031574e0 9000000130751ff0 0000000000000000
9000000131592e28 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 f9175936df5d7fd2
900000012b00ff08 900000012b000000 ffff800003409000 ffff8000034a1780
90000001019634c0 900000012b000010 90000001307beeb8 90000001306b0000
0000000000000001 ffff8000031942b4 9000000130780000 90000001306c0000
9000000130780000 ffff8000031c276c 900000012b044bd0 ffff800003408000
...
Call Trace:
[<ffff80000318bd2c>] mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
[<ffff800003157508>] dml2_top_soc15_initialize_instance+0x208/0x300 [amdgpu]
[<ffff8000031942b0>] dml21_create_copy+0x30/0x60 [amdgpu]
[<ffff8000031c2768>] dc_state_create_copy+0x68/0xe0 [amdgpu]
[<ffff800002e98ea0>] amdgpu_dm_init+0x8c0/0x2060 [amdgpu]
[<ffff800002e9a658>] dm_hw_init+0x18/0x60 [amdgpu]
[<ffff800002b0a738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
[<ffff800002b0ce80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
[<ffff800002b008f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
[<9000000003c7eae4>] local_pci_probe+0x44/0xc0
[<90000000032f2b18>] work_for_cpu_fn+0x18/0x40
[<90000000032f5da0>] process_one_work+0x160/0x300
[<90000000032f6718>] worker_thread+0x318/0x440
[<9000000003301b8c>] kthread+0x12c/0x220
[<90000000032b1484>] ret_from_kernel_thread+0x8/0xa4
Unfortunately, protecting dml21_copy() out of DML2 causes "sleeping
function called from invalid context", so protect them with DC_FP_START()
and DC_FP_END() inside.
Fixes: 7da55c27e767 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_dm_commit_planes()
[Why]
Replay and PSR will cause some video corruption while VRR is enabled.
[How]
Do not enable the Replay and PSR while VRR is active in
amdgpu_dm_enable_self_refresh().
Fixes: 67edb81d6e9a ("drm/amd/display: Disable replay and psr while VRR is enabled")
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Disable per queue reset for sriov.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The page_link lower bits of the first sg could contain something like
SG_END, if we are mapping a single VRAM page or contiguous blob which
fits into one sg entry. Rather pull out the struct page, and use that in
our check to know if we mapped struct pages vs VRAM.
Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only use GTT as a fallback if we already have a backing store. This
prevents evictions when an application constantly allocates and frees new
memory.
Partially fixes
https://gitlab.freedesktop.org/drm/amd/-/issues/3844#note_2833985.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM|GTT")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Don't fetch it again if we already have it. It seems the
registers don't reliably have the value at resume in some
cases.
Fixes: 028c3fb37e70 ("drm/amdgpu/mes11: initiate mes v11 support")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4083
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
The end of table checks should be done with the structure size,
but 2 of the 3 similar calls use the pointer size.
Signed-off-by: Jean-Marc Eurin <jmeurin@google.com>
Link: https://patch.msgid.link/20250402001542.2600671-1-jmeurin@google.com
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ri_timer() uprobe timer callback, use raw_write_seqcount_*()
Avoid a false-positive lockdep warning in the CONFIG_PREEMPT_RT=y
configuration when using write_seqcount_begin() in the uprobe timer
callback by using raw_write_* APIs.
Uprobe's use of timer callback is guaranteed to not race with itself
for a given uprobe_task, and as such seqcount's insistence on having
preemption disabled on the writer side is irrelevant. So switch to
raw_ variants of seqcount API instead of disabling preemption unnecessarily.
Also, point out in the comments more explicitly why we use seqcount
despite our reader side being rather simple and never retrying. We favor
well-maintained kernel primitive in favor of open-coding our own memory
barriers.
Fixes: 8622e45b5da1 ("uprobes: Reuse return_instances between multiple uretprobes within task")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Suggested-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20250404194848.2109539-1-andrii@kernel.org
|
|
Cosmin reports an issue with ipv6_add_dev being called from
NETDEV_CHANGE notifier:
[ 3455.008776] ? ipv6_add_dev+0x370/0x620
[ 3455.010097] ipv6_find_idev+0x96/0xe0
[ 3455.010725] addrconf_add_dev+0x1e/0xa0
[ 3455.011382] addrconf_init_auto_addrs+0xb0/0x720
[ 3455.013537] addrconf_notify+0x35f/0x8d0
[ 3455.014214] notifier_call_chain+0x38/0xf0
[ 3455.014903] netdev_state_change+0x65/0x90
[ 3455.015586] linkwatch_do_dev+0x5a/0x70
[ 3455.016238] rtnl_getlink+0x241/0x3e0
[ 3455.019046] rtnetlink_rcv_msg+0x177/0x5e0
Similarly, linkwatch might get to ipv6_add_dev without ops lock:
[ 3456.656261] ? ipv6_add_dev+0x370/0x620
[ 3456.660039] ipv6_find_idev+0x96/0xe0
[ 3456.660445] addrconf_add_dev+0x1e/0xa0
[ 3456.660861] addrconf_init_auto_addrs+0xb0/0x720
[ 3456.661803] addrconf_notify+0x35f/0x8d0
[ 3456.662236] notifier_call_chain+0x38/0xf0
[ 3456.662676] netdev_state_change+0x65/0x90
[ 3456.663112] linkwatch_do_dev+0x5a/0x70
[ 3456.663529] __linkwatch_run_queue+0xeb/0x200
[ 3456.663990] linkwatch_event+0x21/0x30
[ 3456.664399] process_one_work+0x211/0x610
[ 3456.664828] worker_thread+0x1cc/0x380
[ 3456.665691] kthread+0xf4/0x210
Reclassify NETDEV_CHANGE as a notifier that consistently runs under the
instance lock.
Link: https://lore.kernel.org/netdev/aac073de8beec3e531c86c101b274d434741c28e.camel@nvidia.com/
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Tested-by: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250404161122.3907628-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit placed netdev_lock_ops() just after __dev_get_by_index()
in addrconf_add_ifaddr(), where dev could be NULL as reported. [0]
Let's call netdev_lock_ops() only when dev is not NULL.
[0]:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000198: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000cc0-0x0000000000000cc7]
CPU: 3 UID: 0 PID: 12032 Comm: syz.0.15 Not tainted 6.14.0-13408-g9f867ba24d36 #1 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:addrconf_add_ifaddr (./include/net/netdev_lock.h:30 ./include/net/netdev_lock.h:41 net/ipv6/addrconf.c:3157)
Code: 8b b4 24 94 00 00 00 4c 89 ef e8 7e 4c 2f ff 4c 8d b0 c5 0c 00 00 48 89 c3 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <0f> b6 04 02 4c 89 f2 83 e2 07 38 d0 7f 08 80
RSP: 0018:ffffc90015b0faa0 EFLAGS: 00010213
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000198 RSI: ffffffff893162f2 RDI: ffff888078cb0338
RBP: ffffc90015b0fbb0 R08: 0000000000000000 R09: fffffbfff20cbbe2
R10: ffffc90015b0faa0 R11: 0000000000000000 R12: 1ffff92002b61f54
R13: ffff888078cb0000 R14: 0000000000000cc5 R15: ffff888078cb0000
FS: 00007f92559ed640(0000) GS:ffff8882a8659000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f92559ecfc8 CR3: 000000001c39e000 CR4: 00000000000006f0
Call Trace:
<TASK>
inet6_ioctl (net/ipv6/af_inet6.c:580)
sock_do_ioctl (net/socket.c:1196)
sock_ioctl (net/socket.c:1314)
__x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:906 fs/ioctl.c:892 fs/ioctl.c:892)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130
RIP: 0033:0x7f9254b9c62d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff f8
RSP: 002b:00007f92559ecf98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f9254d65f80 RCX: 00007f9254b9c62d
RDX: 0000000020000040 RSI: 0000000000008916 RDI: 0000000000000003
RBP: 00007f9254c264d3 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f9254d65f80 R15: 00007f92559cd000
</TASK>
Modules linked in:
Fixes: 8965c160b8f7 ("net: use netif_disable_lro in ipv6_add_dev")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Reported-by: Hui Guo <guohui.study@gmail.com>
Closes: https://lore.kernel.org/netdev/CAHOo4gK+tdU1B14Kh6tg-tNPqnQ1qGLfinONFVC43vmgEPnXXw@mail.gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250406035755.69238-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Taehee Yoo says:
====================
fix wrong hds-thresh value setting
A hds-thresh value is not set correctly if input value is 0.
The cause is that ethtool_ringparam_get_cfg(), which is a internal
function that returns ringparameters from both ->get_ringparam() and
dev->cfg can't return a correct hds-thresh value.
The first patch fixes ethtool_ringparam_get_cfg() to set hds-thresh
value correcltly.
The second patch adds random test for hds-thresh value.
So that we can test 0 value for a hds-thresh properly.
====================
Link: https://patch.msgid.link/20250404122126.1555648-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
hds.py has been testing 0(set_hds_thresh_zero()),
MAX(set_hds_thresh_max()), GT(set_hds_thresh_gt()) values for hds-thresh.
However if a hds-thresh value was already 0, set_hds_thresh_zero()
can't test properly.
So, it tests random value first and then tests 0, MAX, GT values.
Testing bnxt:
TAP version 13
1..13
ok 1 hds.get_hds
ok 2 hds.get_hds_thresh
ok 3 hds.set_hds_disable # SKIP disabling of HDS not supported by
the device
ok 4 hds.set_hds_enable
ok 5 hds.set_hds_thresh_random
ok 6 hds.set_hds_thresh_zero
ok 7 hds.set_hds_thresh_max
ok 8 hds.set_hds_thresh_gt
ok 9 hds.set_xdp
ok 10 hds.enabled_set_xdp
ok 11 hds.ioctl
ok 12 hds.ioctl_set_xdp
ok 13 hds.ioctl_enabled_set_xdp
# Totals: pass:12 fail:0 xfail:0 xpass:0 skip:1 error:0
Testing lo:
TAP version 13
1..13
ok 1 hds.get_hds # SKIP tcp-data-split not supported by device
ok 2 hds.get_hds_thresh # SKIP hds-thresh not supported by device
ok 3 hds.set_hds_disable # SKIP ring-set not supported by the device
ok 4 hds.set_hds_enable # SKIP ring-set not supported by the device
ok 5 hds.set_hds_thresh_random # SKIP hds-thresh not supported by
device
ok 6 hds.set_hds_thresh_zero # SKIP ring-set not supported by the
device
ok 7 hds.set_hds_thresh_max # SKIP hds-thresh not supported by
device
ok 8 hds.set_hds_thresh_gt # SKIP hds-thresh not supported by device
ok 9 hds.set_xdp # SKIP tcp-data-split not supported by device
ok 10 hds.enabled_set_xdp # SKIP tcp-data-split not supported by
device
ok 11 hds.ioctl # SKIP tcp-data-split not supported by device
ok 12 hds.ioctl_set_xdp # SKIP tcp-data-split not supported by
device
ok 13 hds.ioctl_enabled_set_xdp # SKIP tcp-data-split not supported
by device
# Totals: pass:0 fail:0 xfail:0 xpass:0 skip:13 error:0
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250404122126.1555648-3-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
always as 0.
When hds-thresh is configured, ethnl_set_rings() is called, and it calls
ethtool_ringparam_get_cfg() to get ringparameters from .get_ringparam()
callback and dev->cfg.
Both hds_config and hds_thresh values should be set from dev->cfg, not
from .get_ringparam().
But ethtool_ringparam_get_cfg() sets only hds_config from dev->cfg.
So, ethtool_ringparam_get_cfg() returns always a hds_thresh as 0.
If an input value of hds-thresh is 0, a hds_thresh value from
ethtool_ringparam_get_cfg() are same. So ethnl_set_rings() does
nothing and returns immediately.
It causes a bug that setting a hds-thresh value to 0 is not working.
Reproducer:
modprobe netdevsim
echo 1 > /sys/bus/netdevsim/new_device
ethtool -G eth0 hds-thresh 100
ethtool -G eth0 hds-thresh 0
ethtool -g eth0
#hds-thresh value should be 0, but it shows 100.
The tools/testing/selftests/drivers/net/hds.py can test it too with
applying a following patch for hds.py.
Fixes: 928459bbda19 ("net: ethtool: populate the default HDS params in the core")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250404122126.1555648-2-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When AC adapter is unplugged or plugged in EC wakes from HW sleep but
APU doesn't enter back into HW sleep.
The reason this happens is that, when the APU exits HW sleep, the power
rails controlled by the EC will power up the TCON. The TCON has a GPIO
that will be toggled at this time. The GPIO is not marked as a wakeup
source, but the GPIO controller still has an unserviced interrupt.
Unserviced interrupts will block entering HW sleep again. Clearing the
GPIO doesn't help as the TCON continues to assert it until it's been
initialized by i2c-hid.
Fixing this would require TCON F/W changes and it's already broken in
the wild on production hardware.
To avoid triggering this issue add a quirk to avoid letting EC wake
up system at all. The power button still works properly on this system.
Reported-by: Antheas Kapenekakis <lkml@antheas.dev>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3929
Link: https://github.com/bazzite-org/patchwork/commit/95b93b2852718ee1e808c72e6b1836da4a95fc63
Co-developed-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250401133858.1892077-1-superm1@kernel.org
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit a7e23ec17feec ("ACPI: button: Install notifier for system
events as well") modified the ACPI button behavior to send
`ACPI_BUTTON_NOTIFY_WAKE` events.
This caused a regression on Dell Optiplex 3040 sending `KEY_POWER`
randomly at runtime.
Adjust logic so that the `ACPI_BUTTON_NOTIFY_WAKE` event will never
send `KEY_POWER`.
Fixes: a7e23ec17feec ("ACPI: button: Install notifier for system events as well")
Reported-by: Ian Laurie <nixuser@mail.com>
Closes: https://lore.kernel.org/linux-acpi/CAJZ5v0hbA6bqxHupTh4NZR-GVSb9M5RL7JSb2yQgvYYJg+z2aQ@mail.gmail.com/T/#md8071e480212201f23e4929607386750d3b6bc13
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2357044
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Ian Laurie <nixuser@mail.com>
Link: https://patch.msgid.link/20250404145034.2608574-1-superm1@kernel.org
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
e820__register_nosave_regions()
While debugging kexec/hibernation hangs and crashes, it turned out that
the current implementation of e820__register_nosave_regions() suffers from
multiple serious issues:
- The end of last region is tracked by PFN, causing it to find holes
that aren't there if two consecutive subpage regions are present
- The nosave PFN ranges derived from holes are rounded out (instead of
rounded in) which makes it inconsistent with how explicitly reserved
regions are handled
Fix this by:
- Treating reserved regions as if they were holes, to ensure consistent
handling (rounding out nosave PFN ranges is more correct as the
kernel does not use partial pages)
- Tracking the end of the last RAM region by address instead of pages
to detect holes more precisely
These bugs appear to have been introduced about ~18 years ago with the very
first version of e820_mark_nosave_regions(), and its flawed assumptions were
carried forward uninterrupted through various waves of rewrites and renames.
[ mingo: Added Git archeology details, for kicks and giggles. ]
Fixes: e8eff5ac294e ("[PATCH] Make swsusp avoid memory holes and reserved memory regions on x86_64")
Reported-by: Roberto Ricci <io@r-ricci.it>
Tested-by: Roberto Ricci <io@r-ricci.it>
Signed-off-by: Myrrh Periwinkle <myrrhperiwinkle@qtmlabs.xyz>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250406-fix-e820-nosave-v3-1-f3787bc1ee1d@qtmlabs.xyz
Closes: https://lore.kernel.org/all/Z4WFjBVHpndct7br@desktop0a/
|
|
The function get_vm_area() is not defined for non-MMU builds and causes a
build error if it is used. Hide the map_pages() function around a:
#ifdef CONFIG_MMU
to keep it from being compiled when CONFIG_MMU is not set.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250407120111.2ccc9319@gandalf.local.home
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/4f8ece8b-8862-4f7c-8ede-febd28f8a9fe@roeck-us.net/
Fixes: 394f3f02de531 ("tracing: Use vmap_page_range() to map memmap ring buffer")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Xen disables ACPI for PV guests in DomU, which causes acpi_mps_check() to
return 1 when CONFIG_X86_MPPARSE is not set. As a result, the local APIC is
disabled and the guest is later limited to a single vCPU, despite being
configured with more.
This regression was introduced in version 6.9 in commit 7c0edad3643f
("x86/cpu/topology: Rework possible CPU management"), which added an
early check that limits CPUs to 1 if apic_is_disabled.
Update the acpi_mps_check() logic to return 0 early when running as a Xen
PV guest in DomU, preventing APIC from being disabled in this specific case
and restoring correct multi-vCPU behaviour.
Fixes: 7c0edad3643f ("x86/cpu/topology: Rework possible CPU management")
Signed-off-by: Petr Vaněk <arkamar@atlas.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250407132445.6732-2-arkamar@atlas.cz
|
|
The recently introduced msi_parent_ops::chip_flags sets irq_eoi()/irq_ack()
conditionally, but MIP driver has not been updated. Populate chip_flags
with EOI | ACK flags.
Fixes: 32c6c054661a ("irqchip: Add Broadcom BCM2712 MSI-X interrupt controller")
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250407125918.3021454-1-svarbanov@suse.de
|
|
Scanning for namespaces can take some time, so if the target is
reconfigured while the scan is running we may miss a Attached Namespace
Attribute Changed AEN.
Check if the NVME_AER_NOTICE_NS_CHANGED bit is set once the scan has
finished, and requeue scanning to pick up any missed change.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
This isn't fixing a real issue, but there's also zero point in going
through group and buffer setup, when the buffers are going to be
rejected once attempted to get used.
Cc: stable@vger.kernel.org
Reported-by: syzbot+58928048fd1416f1457c@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A preparation patch that separates the number of pages / folios from
the number of niovs. They will not match in the future to support huge
pages, improved dma mapping and/or larger chunk sizes.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0780ac966ee84200385737f45bb0f2ada052392b.1743848231.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Refill queue lock and other bits are only used from the allocation path
on the rx softirq side, but it shares the cache line with other fields
like ctx that are used also in the "syscall" path, which causes cache
bouncing when softirq runs on a different CPU.
Separate them into different cache lines. The first one now contains
constant fields used by both contextx, followed by a line responsible
for refill queue data.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6d1f598e27d623c07fc49d6baee13089a9b1216c.1743848241.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The .round_waveform_tohw() is supposed to return 1 if the requested
waveform cannot be implemented by rounding down all parameters. Also
adapt the corresponding comment to better describe why the implemented
procedure is right.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Trevor Gamblin <tgamblin@baylibre.com>
Link: https://lore.kernel.org/r/ba451573f0218d76645f068cec78bd97802cf010.1743844730.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
If another channel is already enabled period must not be modified. If
the requested period is smaller than this unchangable period the driver
is still supposed to search a duty_cycle according to the usual rounding
rules.
So don't set the duty_cycle to 0 but continue to determine an
appropriate value for ccr.
Fixes: deaba9cff809 ("pwm: stm32: Implementation of the waveform callbacks")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/f0c50df31daa3d6069bfa8d7fb3e71fae241b026.1743844730.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
Waveform parameters are supposed to be rounded down to the next value
possible for the hardware. However when a requested value is too small,
.round_waveform_tohw() is supposed to pick the next bigger value and
return 1. Let pwm_set_waveform() behave in the same way.
This creates consistency between pwm_set_waveform_might_sleep() with
exact=false and pwm_round_waveform_might_sleep() +
pwm_set_waveform_might_sleep() with exact=true.
The PWM_DEBUG rounding check has to be adapted to only trigger if no
uprounding happend.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Trevor Gamblin <tgamblin@baylibre.com>
Link: https://lore.kernel.org/r/353dc6ae31be815e41fd3df89c257127ca0d1a09.1743844730.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
PVH dom0 re-uses logic from PV dom0, in which RAM ranges not assigned to
dom0 are re-used as scratch memory to map foreign and grant pages. Such
logic relies on reporting those unpopulated ranges as RAM to Linux, and
mark them as reserved. This way Linux creates the underlying page
structures required for metadata management.
Such approach works fine on PV because the initial balloon target is
calculated using specific Xen data, that doesn't take into account the
memory type changes described above. However on HVM and PVH the initial
balloon target is calculated using get_num_physpages(), and that function
does take into account the unpopulated RAM regions used as scratch space
for remote domain mappings.
This leads to PVH dom0 having an incorrect initial balloon target, which
causes malfunction (excessive memory freeing) of the balloon driver if the
dom0 memory target is later adjusted from the toolstack.
Fix this by using xen_released_pages to account for any pages that are part
of the memory map, but are already unpopulated when the balloon driver is
initialized. This accounts for any regions used for scratch remote
mappings. Note on x86 xen_released_pages definition is moved to
enlighten.c so it's uniformly available for all Xen-enabled builds.
Take the opportunity to unify PV with PVH/HVM guests regarding the usage of
get_num_physpages(), as that avoids having to add different logic for PV vs
PVH in both balloon_add_regions() and arch_xen_unpopulated_init().
Much like a6aa4eb994ee, the code in this changeset should have been part of
38620fc4e893.
Fixes: a6aa4eb994ee ('xen/x86: add extra pages to unpopulated-alloc if available')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250407082838.65495-1-roger.pau@citrix.com>
|
|
xen-acpi-processor functions under a PVH dom0 with only a
xen_initial_domain() runtime check. Change the Kconfig dependency from
PV dom0 to generic dom0 to reflect that.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250331172913.51240-1-jason.andryuk@amd.com>
|
|
Some SKUs of Xe2_HPD platforms (such as BMG) have GDDR memory type
with ECC enabled. We need to identify this scenario and add a new
case in xelpdp_get_dram_info() to handle it. In addition, the
derating value needs to be adjusted accordingly to compensate for
the limited bandwidth.
Bspec: 64602
Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 3adcf970dc7e ("drm/xe/bmg: Drop force_probe requirement")
Cc: stable@vger.kernel.org
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250324-tip-v2-1-38397de319f8@intel.com
(cherry picked from commit 327e30123cafcb45c0fc5843da0367b90332999d)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
According to the DP spec TPS4 is mandatory for HBR3. We have
however seen some broken eDP sinks that violate this and
declare support for HBR3 without TPS4 support.
At least in the case of the icl Dell XPS 13 7390 this results
in an unstable output.
Reject HBR3 when TPS4 supports is unavailable on the sink.
v2: Leave breadcrumbs in dmesg to avoid head scratching (Jani)
Cc: stable@vger.kernel.org
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5969
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306210740.11886-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 38188a7f575dacba1120a59fd5d62c7f3313c0fa)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|