Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-host-fixes for v6.15-rc6
- omap: use correct function to read from device tree
- MAINTAINERS: remove Seth from ISMT maintainership
|
|
Pull smb client fixes from Steve French:
- Fix dentry leak which can cause umount crash
- Add warning for parse contexts error on compounded operation
* tag '6.15-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: Avoid race in open_cached_dir with lease breaks
smb3 client: warn when parse contexts returns error on compounded operation
|
|
propagate_mnt() does not attach anything to mounts created during
propagate_mnt() itself. What's more, anything on ->mnt_slave_list
of such new mount must also be new, so we don't need to even look
there.
When move_mount() had been introduced, we've got an additional
class of mounts to skip - if we are moving from anon namespace,
we do not want to propagate to mounts we are moving (i.e. all
mounts in that anon namespace).
Unfortunately, the part about "everything on their ->mnt_slave_list
will also be ignorable" is not true - if we have propagation graph
A -> B -> C
and do OPEN_TREE_CLONE open_tree() of B, we get
A -> [B <-> B'] -> C
as propagation graph, where B' is a clone of B in our detached tree.
Making B private will result in
A -> B' -> C
C still gets propagation from A, as it would after making B private
if we hadn't done that open_tree(), but now the propagation goes
through B'. Trying to move_mount() our detached tree on subdirectory
in A should have
* moved B' on that subdirectory in A
* skipped the corresponding subdirectory in B' itself
* copied B' on the corresponding subdirectory in C.
As it is, the logics in propagation_next() and friends ends up
skipping propagation into C, since it doesn't consider anything
downstream of B'.
IOW, walking the propagation graph should only skip the ->mnt_slave_list
of new mounts; the only places where the check for "in that one
anon namespace" are applicable are propagate_one() (where we should
treat that as the same kind of thing as "mountpoint we are looking
at is not visible in the mount we are looking at") and
propagation_would_overmount(). The latter is better dealt with
in the caller (can_move_mount_beneath()); on the first call of
propagation_would_overmount() the test is always false, on the
second it is always true in "move from anon namespace" case and
always false in "move within our namespace" one, so it's easier
to just use check_mnt() before bothering with the second call and
be done with that.
Fixes: 064fe6e233e8 ("mount: handle mount propagation for detached mount trees")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
as it is, a failed move_mount(2) from anon namespace breaks
all further propagation into that namespace, including normal
mounts in non-anon namespaces that would otherwise propagate
there.
Fixes: 064fe6e233e8 ("mount: handle mount propagation for detached mount trees")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
do_umount() analogue of the race fixed in 119e1ef80ecf "fix
__legitimize_mnt()/mntput() race". Here we want to make sure that
if __legitimize_mnt() doesn't notice our lock_mount_hash(), we will
notice their refcount increment. Harder to hit than mntput_no_expire()
one, fortunately, and consequences are milder (sync umount acting
like umount -l on a rare race with RCU pathwalk hitting at just the
wrong time instead of use-after-free galore mntput_no_expire()
counterpart used to be hit). Still a bug...
Fixes: 48a066e72d97 ("RCU'd vfsmounts")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... or we risk stealing final mntput from sync umount - raising mnt_count
after umount(2) has verified that victim is not busy, but before it
has set MNT_SYNC_UMOUNT; in that case __legitimize_mnt() doesn't see
that it's safe to quietly undo mnt_count increment and leaves dropping
the reference to caller, where it'll be a full-blown mntput().
Check under mount_lock is needed; leaving the current one done before
taking that makes no sense - it's nowhere near common enough to bother
with.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rust fixes from Miguel Ojeda:
- Make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88.0
- Clean Rust (and Clippy) lints for the upcoming Rust 1.87.0 and 1.88.0
releases
- Clean objtool warning for the upcoming Rust 1.87.0 release by adding
one more noreturn function
* tag 'rust-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88
rust: clean Rust 1.88.0's `clippy::uninlined_format_args` lint
rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration
rust: clean Rust 1.88.0's `unnecessary_transmutes` lint
rust: allow Rust 1.87.0's `clippy::ptr_eq` lint
objtool/rust: add one more `noreturn` Rust function for Rust 1.87.0
|
|
Pull drm fixes from Dave Airlie:
"Weekly drm fixes, bit bigger than last week, but overall amdgpu/xe
with some ivpu bits and a random few fixes, and dropping the
ttm_backup struct which wrapped struct file and was recently
frowned at.
drm:
- Fix overflow when generating wedged event
ttm:
- Fix documentation
- Remove struct ttm_backup
panel:
- simple: Fix timings for AUO G101EVN010
amdgpu:
- DC FP fixes
- Freesync fix
- DMUB AUX fixes
- VCN fix
- Hibernation fixes
- HDP fixes
xe:
- Prevent PF queue overflow
- Hold all forcewake during mocs test
- Remove GSC flush on reset path
- Fix forcewake put on error path
- Fix runtime warning when building without svm
i915:
- Fix oops on resume after disconnecting DP MST sinks during suspend
- Fix SPLC num_waiters refcounting
ivpu:
- Increase timeouts
- Fix deadlock in cmdq ioctl
- Unlock mutices in correct order
v3d:
- Avoid memory leak in job handling"
* tag 'drm-fixes-2025-05-10' of https://gitlab.freedesktop.org/drm/kernel: (32 commits)
drm/i915/dp: Fix determining SST/MST mode during MTP TU state computation
drm/xe: Add config control for svm flush work
drm/xe: Release force wake first then runtime power
drm/xe/gsc: do not flush the GSC worker from the reset path
drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs
drm/xe: Add page queue multiplier
drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush
drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush
drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
drm/amdgpu: fix pm notifier handling
Revert "drm/amd: Stop evicting resources on APUs in suspend"
drm/amdgpu/vcn: using separate VCN1_AON_SOC offset
drm/amd/display: Fix wrong handling for AUX_DEFER case
drm/amd/display: Copy AUX read reply data whenever length > 0
drm/amd/display: Remove incorrect checking in dmub aux handler
drm/amd/display: Fix the checking condition in dmub aux handling
drm/amd/display: Shift DMUB AUX reply command if necessary
drm/amd/display: Call FP Protect Before Mode Programming/Mode Support
...
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
drm/i915 fixes for v6.15-rc6:
- Fix oops on resume after disconnecting DP MST sinks during suspend
- Fix SPLC num_waiters refcounting
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/87tt5umeaw.fsf@intel.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Prevent PF queue overflow
- Hold all forcewake during mocs test
- Remove GSC flush on reset path
- Fix forcewake put on error path
- Fix runtime warning when building without svm
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/jffqa56f2zp4i5ztz677cdspgxhnw7qfop3dd3l2epykfpfvza@q2nw6wapsphz
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Move the arm64_use_ng_mappings variable from the .bss to the .data
section as it is accessed very early during boot with the MMU off and
before the .bss has been initialised.
This could lead to incorrect idmap page table"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpufeature: Move arm64_use_ng_mappings to the .data section to prevent wrong idmap generation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- The compressed half-word misaligned access instructions (c.lhu, c.lh,
and c.sh) from the Zcb extension are now properly emulated
- A series of fixes to properly emulate permissions while handling
userspace misaligned accesses
- A pair of fixes for PR_GET_TAGGED_ADDR_CTRL to avoid accessing the
envcfg CSR on systems that don't support that CSR, and to report
those failures up to userspace
- The .rela.dyn section is no longer stripped from vmlinux, as it is
necessary to relocate the kernel under some conditions (including
kexec)
* tag 'riscv-for-linus-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
scripts: Do not strip .rela.dyn section
riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
riscv: misaligned: use get_user() instead of __get_user()
riscv: misaligned: enable IRQs while handling misaligned accesses
riscv: misaligned: factorize trap handling
riscv: misaligned: Add handling for ZCB instructions
|
|
Pull block fixes from Jens Axboe:
- Fix for a regression in this series for loop and read/write iterator
handling
- zone append block update tweak
- remove a broken IO priority test
- NVMe pull request via Christoph:
- unblock ctrl state transition for firmware update (Daniel
Wagner)
* tag 'block-6.15-20250509' of git://git.kernel.dk/linux:
block: remove test of incorrect io priority level
nvme: unblock ctrl state transition for firmware update
block: only update request sector if needed
loop: Add sanity check for read/write_iter
|
|
Pull io_uring fixes from Jens Axboe:
- Fix for linked timeouts arming and firing wrt prep and issue of the
request being managed by the linked timeout
- Fix for a CQE ordering issue between requests with multishot and
using the same buffer group. This is a dumbed down version for this
release and for stable, it'll get improved for v6.16
- Tweak the SQPOLL submit batch size. A previous commit made SQPOLL
manage its own task_work and chose a tiny batch size, bump it from 8
to 32 to fix a performance regression due to that
* tag 'io_uring-6.15-20250509' of git://git.kernel.dk/linux:
io_uring/sqpoll: Increase task_work submission batch size
io_uring: ensure deferred completions are flushed for multishot
io_uring: always arm linked timeouts prior to issue
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules fix from Petr Pavlu:
"A single fix to prevent use of an uninitialized completion pointer
when releasing a module_kobject in specific situations.
This addresses a latent bug exposed by commit f95bbfe18512 ("drivers:
base: handle module_kobject creation")"
* tag 'modules-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
module: ensure that kobject_put() is safe for module type kobjects
|
|
tl;dr: There is a window in the mm switching code where the new CR3 is
set and the CPU should be getting TLB flushes for the new mm. But
should_flush_tlb() has a bug and suppresses the flush. Fix it by
widening the window where should_flush_tlb() sends an IPI.
Long Version:
=== History ===
There were a few things leading up to this.
First, updating mm_cpumask() was observed to be too expensive, so it was
made lazier. But being lazy caused too many unnecessary IPIs to CPUs
due to the now-lazy mm_cpumask(). So code was added to cull
mm_cpumask() periodically[2]. But that culling was a bit too aggressive
and skipped sending TLB flushes to CPUs that need them. So here we are
again.
=== Problem ===
The too-aggressive code in should_flush_tlb() strikes in this window:
// Turn on IPIs for this CPU/mm combination, but only
// if should_flush_tlb() agrees:
cpumask_set_cpu(cpu, mm_cpumask(next));
next_tlb_gen = atomic64_read(&next->context.tlb_gen);
choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
load_new_mm_cr3(need_flush);
// ^ After 'need_flush' is set to false, IPIs *MUST*
// be sent to this CPU and not be ignored.
this_cpu_write(cpu_tlbstate.loaded_mm, next);
// ^ Not until this point does should_flush_tlb()
// become true!
should_flush_tlb() will suppress TLB flushes between load_new_mm_cr3()
and writing to 'loaded_mm', which is a window where they should not be
suppressed. Whoops.
=== Solution ===
Thankfully, the fuzzy "just about to write CR3" window is already marked
with loaded_mm==LOADED_MM_SWITCHING. Simply checking for that state in
should_flush_tlb() is sufficient to ensure that the CPU is targeted with
an IPI.
This will cause more TLB flush IPIs. But the window is relatively small
and I do not expect this to cause any kind of measurable performance
impact.
Update the comment where LOADED_MM_SWITCHING is written since it grew
yet another user.
Peter Z also raised a concern that should_flush_tlb() might not observe
'loaded_mm' and 'is_lazy' in the same order that switch_mm_irqs_off()
writes them. Add a barrier to ensure that they are observed in the
order they are written.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/oe-lkp/202411282207.6bd28eae-lkp@intel.com/ [1]
Fixes: 6db2526c1d69 ("x86/mm/tlb: Only trim the mm_cpumask once a second") [2]
Reported-by: Stephen Dolan <sdolan@janestreet.com>
Cc: stable@vger.kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Our QA team reported a 10%-23%, throughput reduction on an io_uring
sqpoll testcase doing IO to a null_blk, that I traced back to a
reduction of the device submission queue depth utilization. It turns out
that, after commit af5d68f8892f ("io_uring/sqpoll: manage task_work
privately"), we capped the number of task_work entries that can be
completed from a single spin of sqpoll to only 8 entries, before the
sqpoll goes around to (potentially) sleep. While this cap doesn't drive
the submission side directly, it impacts the completion behavior, which
affects the number of IO queued by fio per sqpoll cycle on the
submission side, and io_uring ends up seeing less ios per sqpoll cycle.
As a result, block layer plugging is less effective, and we see more
time spent inside the block layer in profilings charts, and increased
submission latency measured by fio.
There are other places that have increased overhead once sqpoll sleeps
more often, such as the sqpoll utilization calculation. But, in this
microbenchmark, those were not representative enough in perf charts, and
their removal didn't yield measurable changes in throughput. The major
overhead comes from the fact we plug less, and less often, when submitting
to the block layer.
My benchmark is:
fio --ioengine=io_uring --direct=1 --iodepth=128 --runtime=300 --bs=4k \
--invalidate=1 --time_based --ramp_time=10 --group_reporting=1 \
--filename=/dev/nullb0 --name=RandomReads-direct-nullb-sqpoll-4k-1 \
--rw=randread --numjobs=1 --sqthread_poll
In one machine, tested on top of Linux 6.15-rc1, we have the following
baseline:
READ: bw=4994MiB/s (5236MB/s), 4994MiB/s-4994MiB/s (5236MB/s-5236MB/s), io=439GiB (471GB), run=90001-90001msec
With this patch:
READ: bw=5762MiB/s (6042MB/s), 5762MiB/s-5762MiB/s (6042MB/s-6042MB/s), io=506GiB (544GB), run=90001-90001msec
which is a 15% improvement in measured bandwidth. The average
submission latency is noticeably lowered too. As measured by
fio:
Baseline:
lat (usec): min=20, max=241, avg=99.81, stdev=3.38
Patched:
lat (usec): min=26, max=226, avg=86.48, stdev=4.82
If we look at blktrace, we can also see the plugging behavior is
improved. In the baseline, we end up limited to plugging 8 requests in
the block layer regardless of the device queue depth size, while after
patching we can drive more io, and we manage to utilize the full device
queue.
In the baseline, after a stabilization phase, an ordinary submission
looks like:
254,0 1 49942 0.016028795 5977 U N [iou-sqp-5976] 7
After patching, I see consistently more requests per unplug.
254,0 1 4996 0.001432872 3145 U N [iou-sqp-3144] 32
Ideally, the cap size would at least be the deep enough to fill the
device queue, but we can't predict that behavior, or assume all IO goes
to a single device, and thus can't guess the ideal batch size. We also
don't want to let the tw run unbounded, though I'm not sure it would
really be a problem. Instead, let's just give it a more sensible value
that will allow for more efficient batching. I've tested with different
cap values, and initially proposed to increase the cap to 1024. Jens
argued it is too big of a bump and I observed that, with 32, I'm no
longer able to observe this bottleneck in any of my machines.
Fixes: af5d68f8892f ("io_uring/sqpoll: manage task_work privately")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20250508181203.3785544-1-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Determining the SST/MST mode during state computation must be done based
on the output type stored in the CRTC state, which in turn is set once
based on the modeset connector's SST vs. MST type and will not change as
long as the connector is using the CRTC. OTOH the MST mode indicated by
the given connector's intel_dp::is_mst flag can change independently of
the above output type, based on what sink is at any moment plugged to
the connector.
Fix the state computation accordingly.
Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: f6971d7427c2 ("drm/i915/mst: adapt intel_dp_mtp_tu_compute_config() for 128b/132b SST")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4607
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/20250507151953.251846-1-imre.deak@intel.com
(cherry picked from commit 0f45696ddb2b901fbf15cb8d2e89767be481d59f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
When increasing the array size in memblock_double_array() and the slab
is not yet available, a call to memblock_find_in_range() is used to
reserve/allocate memory. However, the range returned may not have been
accepted, which can result in a crash when booting an SNP guest:
RIP: 0010:memcpy_orig+0x68/0x130
Code: ...
RSP: 0000:ffffffff9cc03ce8 EFLAGS: 00010006
RAX: ff11001ff83e5000 RBX: 0000000000000000 RCX: fffffffffffff000
RDX: 0000000000000bc0 RSI: ffffffff9dba8860 RDI: ff11001ff83e5c00
RBP: 0000000000002000 R08: 0000000000000000 R09: 0000000000002000
R10: 000000207fffe000 R11: 0000040000000000 R12: ffffffff9d06ef78
R13: ff11001ff83e5000 R14: ffffffff9dba7c60 R15: 0000000000000c00
memblock_double_array+0xff/0x310
memblock_add_range+0x1fb/0x2f0
memblock_reserve+0x4f/0xa0
memblock_alloc_range_nid+0xac/0x130
memblock_alloc_internal+0x53/0xc0
memblock_alloc_try_nid+0x3d/0xa0
swiotlb_init_remap+0x149/0x2f0
mem_init+0xb/0xb0
mm_core_init+0x8f/0x350
start_kernel+0x17e/0x5d0
x86_64_start_reservations+0x14/0x30
x86_64_start_kernel+0x92/0xa0
secondary_startup_64_no_verify+0x194/0x19b
Mitigate this by calling accept_memory() on the memory range returned
before the slab is available.
Prior to v6.12, the accept_memory() interface used a 'start' and 'end'
parameter instead of 'start' and 'size', therefore the accept_memory()
call must be adjusted to specify 'start + size' for 'end' when applying
to kernels prior to v6.12.
Cc: stable@vger.kernel.org # see patch description, needs adjustments for <= 6.11
Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/da1ac73bf4ded761e21b4e4bb5178382a580cd73.1746725050.git.thomas.lendacky@amd.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.15-2025-05-08:
amdgpu:
- DC FP fixes
- Freesync fix
- DMUB AUX fixes
- VCN fix
- Hibernation fixes
- HDP fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250508194102.3242372-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
drm:
- Fix overflow when generating wedged event
ivpu:
- Increate timeouts
- Fix deadlock in cmdq ioctl
- Unlock mutices in correct order
panel:
- simple: Fix timings for AUO G101EVN010
ttm:
- Fix documentation
- Remove struct ttm_backup
v3d:
- Avoid memory leak in job handling
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250508104939.GA76697@2a02-2454-fd5e-fd00-c110-cbf2-6528-c5be.dyn6.pyur.net
|
|
Pull bcachefs fixes from Kent Overstreet:
- Some fixes to help with filesystem analysis: ensure superblock
error count gets written if we go ERO, don't discard the journal
aggressively (so it's available for list_journal -a)
- Fix lost wakeup on arm causing us to get stuck when reading btree
nodes
- Fix fsck failing to exit on ctrl-c
- An additional fix for filesystems with misaligned bucket sizes: we
now ensure that allocations are properly aligned
- Setting background target but not promote target will now leave that
data cached on the foreground target, as it used to
- Revert a change to when we allocate the VFS superblock, this was done
for implementing blk_holder_ops but ended up not being needed, and
allocating a superblock and not setting SB_BORN while we do recovery
caused sync() calls and other things to hang
- Assorted fixes for harmless error messages that caused concern to
users
* tag 'bcachefs-2025-05-08' of git://evilpiepirate.org/bcachefs:
bcachefs: Don't aggressively discard the journal
bcachefs: Ensure superblock gets written when we go ERO
bcachefs: Filter out harmless EROFS error messages
bcachefs: journal_shutdown is EROFS, not EIO
bcachefs: Call bch2_fs_start before getting vfs superblock
bcachefs: fix hung task timeout in journal read
bcachefs: Add missing barriers before wake_up_bit()
bcachefs: Ensure proper write alignment
bcachefs: Improve want_cached_ptr()
bcachefs: thread_with_stdio: fix spinning instead of exiting
|
|
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below
warning pops. Refine the flush work code to be controlled by the config
to avoid below warning:
"
[ 453.132028] ------------[ cut here ]------------
[ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0
[ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec
[ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full)
[ 453.135405] Tainted: [U]=USER, [W]=WARN
...
[ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0
[ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe
[ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246
[ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000
[ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8
[ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000
[ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001
[ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00
[ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000
[ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0
[ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 453.147061] Call Trace:
[ 453.147336] <TASK>
[ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30
[ 453.148067] ? xas_load+0x9/0xb0
[ 453.148435] ? xa_load+0x6f/0xb0
[ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe]
[ 453.149338] ? dev_printk_emit+0x48/0x70
[ 453.149762] ? _dev_printk+0x57/0x80
[ 453.150148] ? drm_ioctl+0x17c/0x440
[ 453.150544] ? __drm_dev_vprintk+0x36/0x90
[ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0
[ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.152560] drm_ioctl_kernel+0x9f/0xf0
[ 453.152968] drm_ioctl+0x20f/0x440
[ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100
[ 453.154489] ? memory_bm_test_bit+0x5/0x60
[ 453.154935] xe_drm_ioctl+0x47/0x70 [xe]
[ 453.155419] __x64_sys_ioctl+0x8d/0xc0
[ 453.155824] do_syscall_64+0x47/0x110
[ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e
"
v2 (Matt):
refine commit message to have more details
add Fixes tag
move the code to xe_svm.h which already have the config
remove a blank line per codestyle suggestion
Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
(cherry picked from commit 9d80698bcd97a5ad1088bcbb055e73fd068895e2)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for
the release path, xe_force_wake_put() should be called first then
xe_pm_runtime_put().
Combine the error path and normal path together with goto.
Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 432cd94efdca06296cc5e76d673546f58aa90ee1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.
The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:
1) GSC proxy init: this only happens once immediately after GSC FW load
and does not support being interrupted. The only way to recover from
an interruption of the proxy init is to do an FLR and re-load the GSC.
2) GSC proxy request: this can happen in response to a request that
the driver sends to the GSC. If this is interrupted, the GSC FW will
timeout and the driver request will be failed, but overall the GSC
will keep working fine.
Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).
Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.
v2: fix spelling in commit msg, rename waiter function (Julia)
Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
LNCF registers report wrong values when XE_FORCEWAKE_GT
only is held. Holding XE_FORCEWAKE_ALL ensures correct
operations on LNCF regs.
V2(Himal):
- Use xe_force_wake_ref_has_domain
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999
Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit")
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 70a2585e582058e94fe4381a337be42dec800337)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
For an unknown reason the math to determine the PF queue size does is
not correct - compute UMD applications are overflowing the PF queue
which is fatal. A multippier of 8 fixes the problem.
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com
(cherry picked from commit 29582e0ea75c95668d168b12406e3c56cf5a73c4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Pull vfio fix from Alex Williamson:
- Fix an issue in vfio-pci huge_fault handling by aligning faults to
the order, resulting in deterministic use of huge pages. This
avoids a race where simultaneous aligned and unaligned faults to
the same PMD can result in a VM_FAULT_OOM and subsequent VM crash.
(Alex Williamson)
* tag 'vfio-v6.15-rc6' of https://github.com/awilliam/linux-vfio:
vfio/pci: Align huge faults to order
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into fixes
riscv fixes for 6.15-rc6
- A fix to handle compressed halfword load/store instructions misaligned accesses
- A fix to allow user memory access while handling a misaligned access
- 2 fixes to return an error if the pointer masking extension is not implemented on the platform but userspace still tries to access it, which caused oops on some early platforms
- A fix to prevent the stripping of .rela.dyn so that a vmlinux loaded by kexec can successfully boot
* tag 'riscv-fixes-6.15-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux:
riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
scripts: Do not strip .rela.dyn section
riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
riscv: misaligned: use get_user() instead of __get_user()
riscv: misaligned: enable IRQs while handling misaligned accesses
riscv: misaligned: factorize trap handling
riscv: misaligned: Add handling for ZCB instructions
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: 689275140cb8 ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dbc064adfcf9095e7d895bea87b2f75c1ab23236)
Cc: stable@vger.kernel.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: abe1cbaec6cf ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 84141ff615951359c9a99696fd79a36c465ed847)
Cc: stable@vger.kernel.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: f756dbac1ce1 ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4a89b7698e771914b4d5b571600c76e2fdcbe2a9)
Cc: stable@vger.kernel.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: cf424020e040 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a5cb344033c7598762e89255e8ff52827abb57a4)
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN, WiFi and netfilter.
We have still a comple of regressions open due to the recent
drivers locking refactor. The patches are in-flight, but not
ready yet.
Current release - regressions:
- core: lock netdevices during dev_shutdown
- sch_htb: make htb_deactivate() idempotent
- eth: virtio-net: don't re-enable refill work too early
Current release - new code bugs:
- eth: icssg-prueth: fix kernel panic during concurrent Tx queue
access
Previous releases - regressions:
- gre: fix again IPv6 link-local address generation.
- eth: b53: fix learning on VLAN unaware bridges
Previous releases - always broken:
- wifi: fix out-of-bounds access during multi-link element
defragmentation
- can:
- initialize spin lock on device probe
- fix order of unregistration calls
- openvswitch: fix unsafe attribute parsing in output_userspace()
- eth:
- virtio-net: fix total qstat values
- mtk_eth_soc: reset all TX queues on DMA free
- fbnic: firmware IPC mailbox fixes"
* tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
virtio-net: fix total qstat values
net: export a helper for adding up queue stats
fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
fbnic: Cleanup handling of completions
fbnic: Actually flush_tx instead of stalling out
fbnic: Add additional handling of IRQs
fbnic: Gate AXI read/write enabling on FW mailbox
fbnic: Fix initialization of mailbox descriptor rings
net: dsa: b53: do not set learning and unicast/multicast on up
net: dsa: b53: fix learning on VLAN unaware bridges
net: dsa: b53: fix toggling vlan_filtering
net: dsa: b53: do not program vlans when vlan filtering is off
net: dsa: b53: do not allow to configure VLAN 0
net: dsa: b53: always rejoin default untagged VLAN on bridge leave
net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
net: dsa: b53: fix flushing old pvid VLAN on pvid change
net: dsa: b53: fix clearing PVID of a port
net: dsa: b53: keep CPU port always tagged again
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Fix potential use-after-free bug and missing error handling in PCI
code
- Fix dcssblk build error
- Fix last breaking event handling in case of stack corruption to allow
for better error reporting
- Update defconfigs
* tag 's390-6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
s390/pci: Fix missing check for zpci_create_device() error return
s390: Update defconfigs
s390/dcssblk: Fix build error with CONFIG_DAX=m and CONFIG_DCSSBLK=y
s390/entry: Fix last breaking event handling in case of stack corruption
s390/configs: Enable options required for TC flow offload
s390/configs: Enable VDPA on Nvidia ConnectX-6 network card
|
|
Pull smb server fixes from Steve French:
- Fix UAF closing file table (e.g. in tree disconnect)
- Fix potential out of bounds write
- Fix potential memory leak parsing lease state in open
- Fix oops in rename with empty target
* tag 'v6.15-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: Fix UAF in __close_file_table_ids
ksmbd: prevent out-of-bounds stream writes by validating *pos
ksmbd: fix memory leak in parse_lease_state()
ksmbd: prevent rename with empty string
|
|
Pull NVMe fix from Christoph:
"nvme fixes for linux 6.15
- unblock ctrl state transition for firmware update (Daniel Wagner)"
* tag 'nvme-6.15-2025-05-08' of git://git.infradead.org/nvme:
nvme: unblock ctrl state transition for firmware update
|
|
Ever since commit eca2040972b4("scsi: block: ioprio: Clean up interface
definition"), the macro IOPRIO_PRIO_LEVEL() will mask the level value to
something between 0 and 7 so necessarily, level will always be lower than
IOPRIO_NR_LEVELS(8).
Remove this obsolete check.
Reported-by: Kexin Wei <ys.weikexin@h3c.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250508083018.GA769554@bytedance
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and
only if KVM has at least one active VM. Leaving the bit set at all times
unfortunately degrades performance by a wee bit more than expected.
Use a dedicated spinlock and counter instead of hooking virtualization
enablement, as changing the behavior of kvm.enable_virt_at_load based on
SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could
result in performance issues for flows that are sensitive to VM creation
latency.
Defer setting BP_SPEC_REDUCE until VMRUN is imminent to avoid impacting
performance on CPUs that aren't running VMs, e.g. if a setup is using
housekeeping CPUs. Setting BP_SPEC_REDUCE in task context, i.e. without
blasting IPIs to all CPUs, also helps avoid serializing 1<=>N transitions
without incurring a gross amount of complexity (see the Link for details
on how ugly coordinating via IPIs gets).
Link: https://lore.kernel.org/all/aBOnzNCngyS_pQIW@google.com
Fixes: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
Reported-by: Michael Larabel <Michael@michaellarabel.com>
Closes: https://www.phoronix.com/review/linux-615-amd-regression
Cc: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250505180300.973137-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When the prctl() interface for pointer masking was added, it did not
check that the pointer masking ISA extension was supported, only the
individual submodes. Userspace could still attempt to disable pointer
masking and query the pointer masking state. commit 81de1afb2dd1
("riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL") disallowed
the former, as the senvcfg write could crash on older systems.
PR_GET_TAGGED_ADDR_CTRL state does not crash, because it reads only
kernel-internal state and not senvcfg, but it should still be disallowed
for consistency.
Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20250507145230.2272871-1-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
The .rela.dyn section contains runtime relocations and is only emitted
for a relocatable kernel.
riscv uses this section to relocate the kernel at runtime but that section
is stripped from vmlinux. That prevents kexec to successfully load vmlinux
since it does not contain the relocations info needed.
Fixes: 559d1e45a16d ("riscv: Use --emit-relocs in order to move .rela.dyn in init")
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250408072851.90275-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
available, the kernel crashes:
Oops - illegal instruction [#1]
[snip]
epc : set_tagged_addr_ctrl+0x112/0x15a
ra : set_tagged_addr_ctrl+0x74/0x15a
epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
[snip]
status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
set_tagged_addr_ctrl+0x112/0x15a
__riscv_sys_prctl+0x352/0x73c
do_trap_ecall_u+0x17c/0x20c
andle_exception+0x150/0x15c
Fix it by checking if Supm is available.
Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20250504101920.3393053-1-namcao@linutronix.de
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
Now that we can safely handle user memory accesses while in the
misaligned access handlers, use get_user() instead of __get_user() to
have user memory access checks.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-4-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
We can safely reenable IRQs if coming from userspace. This allows to
access user memory that could potentially trigger a page fault.
Fixes: b686ecdeacf6 ("riscv: misaligned: Restrict user access to kernel memory")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
Since both load/store and user/kernel should use almost the same path and
that we are going to add some code around that, factorize it.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
Seth's mails bounce back, remove his maintainership.
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250505231511.3175151-1-andi.shyti@kernel.org
|
|
Jakub Kicinski says:
====================
virtio-net: fix total qstat values
Another small fix discovered after we enabled virtio multi-queue
in netdev CI. The queue stat test fails:
# Exception| Exception: Qstats are lower, fetched later
not ok 3 stats.pkt_byte_sum
The queue stats from disabled queues are supposed to be reported
in the "base" stats.
====================
Link: https://patch.msgid.link/20250507003221.823267-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
NIPA tests report that the interface statistics reported
via qstat are lower than those reported via ip link.
Looks like this is because some tests flip the queue
count up and down, and we end up with some of the traffic
accounted on disabled queues.
Add up counters from disabled queues.
Fixes: d888f04c09bb ("virtio-net: support queue stat")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250507003221.823267-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Older drivers and drivers with lower queue counts often have a static
array of queues, rather than allocating structs for each queue on demand.
Add a helper for adding up qstats from a queue range. Expectation is
that driver will pass a queue range [netdev->real_num_*x_queues, MAX).
It was tempting to always use num_*x_queues as the end, but virtio
seems to clamp its queue count after allocating the netdev. And this
way we can trivaly reuse the helper for [0, real_..).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250507003221.823267-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Alexander Duyck says:
====================
fbnic: FW IPC Mailbox fixes
This series is meant to address a number of issues that have been found in
the FW IPC mailbox over the past several months.
The main issues addressed are:
1. Resolve a potential race between host and FW during initialization that
can cause the FW to only have the lower 32b of an address.
2. Block the FW from issuing DMA requests after we have closed the mailbox
and before we have started issuing requests on it.
3. Fix races in the IRQ handlers that can cause the IRQ to unmask itself if
it is being processed while we are trying to disable it.
4. Cleanup the Tx flush logic so that we actually lock down the Tx path
before we start flushing it instead of letting it free run while we are
shutting it down.
5. Fix several memory leaks that could occur if we failed initialization.
6. Cleanup the mailbox completion if we are flushing Tx since we are no
longer processing Rx.
7. Move several allocations out of a potential IRQ/atomic context.
There have been a few optimizations we also picked up since then. Rather
than split them out I just folded them into these diffs. They mostly
address minor issues such as how long it takes to initialize and/or fail so
I thought they could probably go in with the rest of the patches. They
consist of:
1. Do not sleep more than 20ms waiting on FW to respond as the 200ms value
likely originated from simulation/emulation testing.
2. Use jiffies to determine timeout instead of sleep * attempts for better
accuracy.
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
====================
Link: https://patch.msgid.link/174654659243.499179.11194817277075480209.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|