Age | Commit message (Collapse) | Author |
|
There is a spelling mistake in a literal string and in the function
test_get_inital_dirty. Fix them.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Message-ID: <20250204105647.367743-1-colin.i.king@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.14, take #1
- Correctly clean the BSS to the PoC before allowing EL2 to access it
on nVHE/hVHE/protected configurations
- Propagate ownership of debug registers in protected mode after
the rework that landed in 6.14-rc1
- Stop pretending that we can run the protected mode without a GICv3
being present on the host
- Fix a use-after-free situation that can occur if a vcpu fails to
initialise the NV shadow S2 MMU contexts
- Always evaluate the need to arm a background timer for fully emulated
guest timers
- Fix the emulation of EL1 timers in the absence of FEAT_ECV
- Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
- some selftest fixes
- move some kvm-related functions from mm into kvm
- remove all usage of page->index and page->lru from kvm
- fixes and cleanups for vsie
|
|
SYNTHESIZED_F() generally is used together with setup_force_cpu_cap(),
i.e. when it makes sense to present the feature even if cpuid does not
have it *and* the VM is not able to see the difference. For example,
it can be used when mitigations on the host automatically protect
the guest as well.
The "SYNTHESIZED_F(SRSO_USER_KERNEL_NO)" line came in as a conflict
resolution between the CPUID overhaul from the KVM tree and support
for the feature in the x86 tree. Using it right now does not hurt,
or make a difference for that matter, because there is no
setup_force_cpu_cap(X86_FEATURE_SRSO_USER_KERNEL_NO). However, it
is a little less future proof in case such a setup_force_cpu_cap()
appears later, for a case where the kernel somehow is not vulnerable
but the guest would have to apply the mitigation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The way we deal with the EL2 virtual timer is a bit odd.
We try to cope with E2H being flipped, and adjust which offset
applies to that timer depending on the current E2H value. But that's
a complexity we shouldn't have to worry about.
What we have to deal with is either E2H being RES1, in which case
there is no offset, or E2H being RES0, and the virtual timer simply
does not exist.
Drop the adjusting of the timer offset, which makes things a bit
simpler. At the same time, make sure that accessing the HV timer
when E2H is RES0 results in an UNDEF in the guest.
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250204110050.150560-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Both Wei-Lin Chang and Volodymyr Babchuk report that the way we
handle the emulation of EL1 timers with NV is completely wrong,
specially in the case of HCR_EL2.E2H==0.
There are three problems in about as many lines of code:
- With E2H==0, the EL1 timers are overwritten with the EL1 state,
while they should actually contain the EL2 state (as per the timer
map)
- With E2H==1, we run the full EL1 timer emulation even when ECV
is present, hiding a bug in timer_emulate() (see previous patch)
- The comments are actively misleading, and say all the wrong things.
This is only attributable to the code having been initially written
for FEAT_NV, hacked up to handle FEAT_NV2 *in parallel*, and vaguely
hacked again to be FEAT_NV2 only. Oh, and yours truly being a gold
plated idiot.
The fix is obvious: just delete most of the E2H==0 code, have a unified
handling of the timers (because they really are E2H agnostic), and
make sure we don't execute any of that when FEAT_ECV is present.
Fixes: 4bad3068cfa9f ("KVM: arm64: nv: Sync nested timer state with FEAT_NV2")
Reported-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw>
Reported-by: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Link: https://lore.kernel.org/r/fqiqfjzwpgbzdtouu2pwqlu7llhnf5lmy4hzv5vo6ph4v3vyls@jdcfy3fjjc5k
Link: https://lore.kernel.org/r/87frl51tse.fsf@epam.com
Tested-by: Dmytro Terletskyi <dmytro_terletskyi@epam.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250204110050.150560-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When updating the interrupt state for an emulated timer, we return
early and skip the setup of a soft timer that runs in parallel
with the guest.
While this is OK if we have set the interrupt pending, it is pretty
wrong if the guest moved CVAL into the future. In that case,
no timer is armed and the guest can wait for a very long time
(it will take a full put/load cycle for the situation to resolve).
This is specially visible with EDK2 running at EL2, but still
using the EL1 virtual timer, which in that case is fully emulated.
Any key-press takes ages to be captured, as there is no UART
interrupt and EDK2 relies on polling from a timer...
The fix is simply to drop the early return. If the timer interrupt
is pending, we will still return early, and otherwise arm the soft
timer.
Fixes: 4d74ecfa6458b ("KVM: arm64: Don't arm a hrtimer for an already pending timer")
Cc: stable@vger.kernel.org
Tested-by: Dmytro Terletskyi <dmytro_terletskyi@epam.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250204110050.150560-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
For each vcpu that userspace creates, we allocate a number of
s2_mmu structures that will eventually contain our shadow S2
page tables.
Since this is a dynamically allocated array, we reallocate
the array and initialise the newly allocated elements. Once
everything is correctly initialised, we adjust pointer and size
in the kvm structure, and move on.
But should that initialisation fail *and* the reallocation triggered
a copy to another location, we end-up returning early, with the
kvm structure still containing the (now stale) old pointer. Weeee!
Cure it by assigning the pointer early, and use this to perform
the initialisation. If everything succeeds, we adjust the size.
Otherwise, we just leave the size as it was, no harm done, and the
new memory is as good as the ol' one (we hope...).
Fixes: 4f128f8e1aaac ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures")
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/r/20250204145554.774427-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The rxrpc_connection attend queue is never used because conn::attend_link
is never initialised and so is always NULL'd out and thus always appears to
be busy. This requires the following fix:
(1) Fix this the attend queue problem by initialising conn::attend_link.
And, consequently, two further fixes for things masked by the above bug:
(2) Fix rxrpc_input_conn_event() to handle being invoked with a NULL
sk_buff pointer - something that can now happen with the above change.
(3) Fix the RXRPC_SKB_MARK_SERVICE_CONN_SECURED message to carry a pointer
to the connection and a ref on it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
Fixes: f2cce89a074e ("rxrpc: Implement a mechanism to send an event notification to a connection")
Link: https://patch.msgid.link/20250203110307.7265-3-dhowells@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Protected mode assumes that at minimum vgic-v3 is present, however KVM
fails to actually enforce this at the time of initialization. As such,
when running protected mode in a half-baked state on GICv2 hardware we
see the hyp go belly up at vcpu_load() when it tries to restore the
vgic-v3 cpuif:
$ ./arch_timer_edge_cases
[ 130.599140] kvm [4518]: nVHE hyp panic at: [<ffff800081102b58>] __kvm_nvhe___vgic_v3_restore_vmcr_aprs+0x8/0x84!
[ 130.603685] kvm [4518]: Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE
[ 130.611962] kvm [4518]: Hyp Offset: 0xfffeca95ed000000
[ 130.617053] Kernel panic - not syncing: HYP panic:
[ 130.617053] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000
[ 130.617053] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000
[ 130.617053] VCPU:0000000000000000
[ 130.638013] CPU: 0 UID: 0 PID: 4518 Comm: arch_timer_edge Tainted: G C 6.13.0-rc3-00009-gf7d03fcbf1f4 #1
[ 130.648790] Tainted: [C]=CRAP
[ 130.651721] Hardware name: Libre Computer AML-S905X-CC (DT)
[ 130.657242] Call trace:
[ 130.659656] show_stack+0x18/0x24 (C)
[ 130.663279] dump_stack_lvl+0x38/0x90
[ 130.666900] dump_stack+0x18/0x24
[ 130.670178] panic+0x388/0x3e8
[ 130.673196] nvhe_hyp_panic_handler+0x104/0x208
[ 130.677681] kvm_arch_vcpu_load+0x290/0x548
[ 130.681821] vcpu_load+0x50/0x80
[ 130.685013] kvm_arch_vcpu_ioctl_run+0x30/0x868
[ 130.689498] kvm_vcpu_ioctl+0x2e0/0x974
[ 130.693293] __arm64_sys_ioctl+0xb4/0xec
[ 130.697174] invoke_syscall+0x48/0x110
[ 130.700883] el0_svc_common.constprop.0+0x40/0xe0
[ 130.705540] do_el0_svc+0x1c/0x28
[ 130.708818] el0_svc+0x30/0xd0
[ 130.711837] el0t_64_sync_handler+0x10c/0x138
[ 130.716149] el0t_64_sync+0x198/0x19c
[ 130.719774] SMP: stopping secondary CPUs
[ 130.723660] Kernel Offset: disabled
[ 130.727103] CPU features: 0x000,00000800,02800000,0200421b
[ 130.732537] Memory Limit: none
[ 130.735561] ---[ end Kernel panic - not syncing: HYP panic:
[ 130.735561] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000
[ 130.735561] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000
[ 130.735561] VCPU:0000000000000000 ]---
Fix it by failing KVM initialization if the system doesn't implement
vgic-v3, as protected mode will never do anything useful on such
hardware.
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/kvmarm/5ca7588c-7bf2-4352-8661-e4a56a9cd9aa@sirena.org.uk/
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250203231543.233511-1-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The documentation previously listed the path to download In Field Scan
(IFS) test images as "TBD".
Update the documentation to include the correct image download
location. Also move the download link to the appropriate section within
the documentation.
Reported-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20250131205315.1585663-1-jithu.joseph@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
After the blamed commits below, some UDP tunnel use dstats for
accounting. On the xmit path, all the UDP-base tunnels ends up
using iptunnel_xmit_stats() for stats accounting, and the latter
assumes the relevant (tunnel) network device uses tstats.
The end result is some 'funny' stat report for the mentioned UDP
tunnel, e.g. when no packet is actually dropped and a bunch of
packets are transmitted:
gnv2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue \
state UNKNOWN mode DEFAULT group default qlen 1000
link/ether ee:7d:09:87:90:ea brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped missed mcast
14916 23 0 15 0 0
TX: bytes packets errors dropped carrier collsns
0 1566 0 0 0 0
Address the issue ensuring the same binary layout for the overlapping
fields of dstats and tstats. While this solution is a bit hackish, is
smaller and with no performance pitfall compared to other alternatives
i.e. supporting both dstat and tstat in iptunnel_xmit_stats() or
reverting the blamed commit.
With time we should possibly move all the IP-based tunnel (and virtual
devices) to dstats.
Fixes: c77200c07491 ("bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Fixes: 6fa6de302246 ("geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Fixes: be226352e8dc ("vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/2e1c444cf0f63ae472baff29862c4c869be17031.1738432804.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
ethtool: rss: minor fixes for recent RSS changes
Make sure RSS_GET messages are consistent in do and dump.
Fix up a recently added safety check for RSS + queue offset.
Adjust related tests so that they pass on devices which
don't support RSS + queue offset.
====================
Link: https://patch.msgid.link/20250201013040.725123-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
supported
Vast majority of drivers does not support queue offset.
Simply return if the rss context + queue ntuple fails.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit under Fixes adds ntuple rules but never deletes them.
Fixes: 29a4bc1fe961 ("selftest: extend test_rss_context_queue_reconfigure for action addition")
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The info.flow_type is for RXFH commands, ntuple flow_type is inside
the flow spec. The check currently does nothing, as info.flow_type
is 0 (or even uninitialized by user space) for ETHTOOL_SRXCLSRLINS.
Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit ec6e57beaf8b ("ethtool: rss: don't report key if device
doesn't support it") intended to stop reporting key fields for
additional rss contexts if device has a global hashing key.
Later we added dump support and the filtering wasn't properly
added there. So we end up reporting the key fields in dumps
but not in dos:
# ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --do rss-get \
--json '{"header": {"dev-index":2}, "context": 1 }'
{
"header": { ... },
"context": 1,
"indir": [0, 1, 2, 3, ...]]
}
# ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --dump rss-get
[
... snip context 0 ...
{ "header": { ... },
"context": 1,
"indir": [0, 1, 2, 3, ...],
-> "input_xfrm": 255,
-> "hfunc": 1,
-> "hkey": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
}
]
Hide these fields correctly.
The drivers/net/hw/rss_ctx.py selftest catches this when run on
a device with single key, already:
# Check| At /root/./ksft-net-drv/drivers/net/hw/rss_ctx.py, line 381, in test_rss_context_dump:
# Check| ksft_ne(set(data.get('hkey', [1])), {0}, "key is all zero")
# Check failed {0} == {0} key is all zero
not ok 8 rss_ctx.test_rss_context_dump
Fixes: f6122900f4e2 ("ethtool: rss: support dumping RSS contexts")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
kthread_affine_preferred() incorrectly returns 0 instead of -ENOMEM
when kzalloc() fails. Return 'ret' to ensure the correct error code is
propagated.
Fixes: 4d13f4304fa4 ("kthread: Implement preferred affinity")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501301528.t0cZVbnq-lkp@intel.com/
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
A null dereference or oops exception will eventually occur when qla1280.c
driver is compiled with DEBUG_QLA1280 enabled and ql_debug_level > 2. I
think its clear from the code that the intention here is sg_dma_len(s) not
length of sg_next(s) when printing the debug info.
Signed-off-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://lore.kernel.org/r/20250125095033.26188-1-linmag7@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is currently no mechanism to return error from query responses.
Return the error and print the corresponding error message with it.
Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
Link: https://lore.kernel.org/r/20250118023808.24726-1-sh043.lee@samsung.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In StorVSC, payload->range.len is used to indicate if this SCSI command
carries payload. This data is allocated as part of the private driver data
by the upper layer and may get passed to lower driver uninitialized.
For example, the SCSI error handling mid layer may send TEST_UNIT_READY or
REQUEST_SENSE while reusing the buffer from a failed command. The private
data section may have stale data from the previous command.
If the SCSI command doesn't carry payload, the driver may use this value as
is for communicating with host, resulting in possible corruption.
Fix this by always initializing this value.
Fixes: be0cf6ca301c ("scsi: storvsc: Set the tablesize based on the information given by the host")
Cc: stable@kernel.org
Tested-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
devm_blk_crypto_profile_init() registers a cleanup handler to run when
the associated (platform-) device is being released. For UFS, the
crypto private data and pointers are stored as part of the ufs_hba's
data structure 'struct ufs_hba::crypto_profile'. This structure is
allocated as part of the underlying ufshcd and therefore Scsi_host
allocation.
During driver release or during error handling in ufshcd_pltfrm_init(),
this structure is released as part of ufshcd_dealloc_host() before the
(platform-) device associated with the crypto call above is released.
Once this device is released, the crypto cleanup code will run, using
the just-released 'struct ufs_hba::crypto_profile'. This causes a
use-after-free situation:
Call trace:
kfree+0x60/0x2d8 (P)
kvfree+0x44/0x60
blk_crypto_profile_destroy_callback+0x28/0x70
devm_action_release+0x1c/0x30
release_nodes+0x6c/0x108
devres_release_all+0x98/0x100
device_unbind_cleanup+0x20/0x70
really_probe+0x218/0x2d0
In other words, the initialisation code flow is:
platform-device probe
ufshcd_pltfrm_init()
ufshcd_alloc_host()
scsi_host_alloc()
allocation of struct ufs_hba
creation of scsi-host devices
devm_blk_crypto_profile_init()
devm registration of cleanup handler using platform-device
and during error handling of ufshcd_pltfrm_init() or during driver
removal:
ufshcd_dealloc_host()
scsi_host_put()
put_device(scsi-host)
release of struct ufs_hba
put_device(platform-device)
crypto cleanup handler
To fix this use-after free, change ufshcd_alloc_host() to register a
devres action to automatically cleanup the underlying SCSI device on
ufshcd destruction, without requiring explicit calls to
ufshcd_dealloc_host(). This way:
* the crypto profile and all other ufs_hba-owned resources are
destroyed before SCSI (as they've been registered after)
* a memleak is plugged in tc-dwc-g210-pci.c remove() as a
side-effect
* EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as
it's not needed anymore
* no future drivers using ufshcd_alloc_host() could ever forget
adding the cleanup
Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile")
Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()")
Cc: stable@vger.kernel.org
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250124-ufshcd-fix-v4-1-c5d0144aae59@linaro.org
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fail I/Os instead of retry to prevent user space processes from being
blocked on the I/O completion for several minutes.
Retrying I/Os during "depopulation in progress" or "depopulation restore in
progress" results in a continuous retry loop until the depopulation
completes or until the I/O retry loop is aborted due to a timeout by the
scsi_cmd_runtime_exceeced().
Depopulation is slow and can take 24+ hours to complete on 20+ TB HDDs.
Most I/Os in the depopulation retry loop end up taking several minutes
before returning the failure to user space.
Cc: stable@vger.kernel.org # 4.18.x: 2bbeb8d scsi: core: Handle depopulation and restoration in progress
Cc: stable@vger.kernel.org # 4.18.x
Fixes: e37c7d9a0341 ("scsi: core: sanitize++ in progress")
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Link: https://lore.kernel.org/r/20250131184408.859579-1-ipylypiv@google.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Filesystems can write to disk from page reclaim with __GFP_FS
set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in
page reclaim with GFP_KERNEL, where it could try to take filesystem
locks again, leading to a deadlock.
WARNING: possible circular locking dependency detected
6.13.0 #1 Not tainted
------------------------------------------------------
kswapd0/70 is trying to acquire lock:
ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0
but task is already holding lock:
ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760
The full lockdep splat can be found in Marc's report:
https://lkml.org/lkml/2025/1/24/1101
Avoid the potential deadlock by doing the allocation with GFP_NOIO, which
prevents both filesystem and block layer recursion.
Reported-by: Marc Aurèle La France <tsi@tuyoix.net>
Signed-off-by: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This commit addresses an issue where clk_gating.state is being toggled in
ufshcd_setup_clocks() even if clock gating is not allowed.
The fix is to add a check for hba->clk_gating.is_initialized before toggling
clk_gating.state in ufshcd_setup_clocks().
Since clk_gating.lock is now initialized unconditionally, it can no longer
lead to the spinlock being used before it is properly initialized, but
instead it is mostly for documentation purposes.
Fixes: 1ab27c9cf8b6 ("ufs: Add support for clock gating")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20250128071207.75494-3-avri.altman@wdc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Address a lockdep warning triggered by the use of the clk_gating.lock before
it is properly initialized. The warning is as follows:
[ 4.388838] INFO: trying to register non-static key.
[ 4.395673] The code is fine but needs lockdep annotation, or maybe
[ 4.402118] you didn't initialize this object before use?
[ 4.407673] turning off the locking correctness validator.
[ 4.413334] CPU: 5 UID: 0 PID: 58 Comm: kworker/u32:1 Not tainted 6.12-rc1 #185
[ 4.413343] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[ 4.413362] Call trace:
[ 4.413364] show_stack+0x18/0x24 (C)
[ 4.413374] dump_stack_lvl+0x90/0xd0
[ 4.413384] dump_stack+0x18/0x24
[ 4.413392] register_lock_class+0x498/0x4a8
[ 4.413400] __lock_acquire+0xb4/0x1b90
[ 4.413406] lock_acquire+0x114/0x310
[ 4.413413] _raw_spin_lock_irqsave+0x60/0x88
[ 4.413423] ufshcd_setup_clocks+0x2c0/0x490
[ 4.413433] ufshcd_init+0x198/0x10ec
[ 4.413437] ufshcd_pltfrm_init+0x600/0x7c0
[ 4.413444] ufs_qcom_probe+0x20/0x58
[ 4.413449] platform_probe+0x68/0xd8
[ 4.413459] really_probe+0xbc/0x268
[ 4.413466] __driver_probe_device+0x78/0x12c
[ 4.413473] driver_probe_device+0x40/0x11c
[ 4.413481] __device_attach_driver+0xb8/0xf8
[ 4.413489] bus_for_each_drv+0x84/0xe4
[ 4.413495] __device_attach+0xfc/0x18c
[ 4.413502] device_initial_probe+0x14/0x20
[ 4.413510] bus_probe_device+0xb0/0xb4
[ 4.413517] deferred_probe_work_func+0x8c/0xc8
[ 4.413524] process_scheduled_works+0x250/0x658
[ 4.413534] worker_thread+0x15c/0x2c8
[ 4.413542] kthread+0x134/0x200
[ 4.413550] ret_from_fork+0x10/0x20
To fix this issue, ensure that the spinlock is only used after it has been
properly initialized before using it in ufshcd_setup_clocks(). Do that
unconditionally as initializing a spinlock is a fast operation.
Fixes: 209f4e43b806 ("scsi: ufs: core: Introduce a new clock_gating lock")
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20250128071207.75494-2-avri.altman@wdc.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull d_revalidate fix from Al Viro:
"Fix a braino in d_revalidate series: check ->d_op for NULL"
* tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix braino in "9p: fix ->rename_sem exclusion"
|
|
Pull outstanding fixes bound for this release into 6.14/scsi-fixes.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Jakub Kicinski says:
====================
MAINTAINERS: recognize Kuniyuki Iwashima as a maintainer
Kuniyuki Iwashima has been a prolific contributor and trusted reviewer
for some core portions of the networking stack for a couple of years now.
Formalize some obvious areas of his expertise and list him as a maintainer.
====================
Link: https://patch.msgid.link/20250202014728.1005003-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a MAINTAINERS entry for UNIX socket, Kuniyuki has been
the de-facto maintainer of this code for a while.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create a MAINTAINERS entry for BSD sockets. List the top 3
reviewers as maintainers. The entry is meant to cover core
socket code (of which there isn't much) but also reviews
of any new socket families.
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
List Kuniyuki as an official TCP reviewer.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Submissions to the docs seem to not get properly CCed.
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20250202005024.964262-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
ice: fix Rx data path for heavy 9k MTU traffic
Maciej Fijalkowski says:
This patchset fixes a pretty nasty issue that was reported by RedHat
folks which occurred after ~30 minutes (this value varied, just trying
here to state that it was not observed immediately but rather after a
considerable longer amount of time) when ice driver was tortured with
jumbo frames via mix of iperf traffic executed simultaneously with
wrk/nginx on client/server sides (HTTP and TCP workloads basically).
The reported splats were spanning across all the bad things that can
happen to the state of page - refcount underflow, use-after-free, etc.
One of these looked as follows:
[ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0
[ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0
[ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
[ 2084.057468] page dumped because: nonzero _refcount
[ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd
[ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1
[ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023
[ 2084.154604] Call Trace:
[ 2084.157058] <IRQ>
[ 2084.159080] dump_stack_lvl+0x34/0x48
[ 2084.162752] bad_page.cold+0x63/0x94
[ 2084.166333] check_new_pages+0xb3/0xe0
[ 2084.170083] rmqueue_bulk+0x2d2/0x9e0
[ 2084.173749] ? ktime_get+0x35/0xa0
[ 2084.177159] rmqueue_pcplist+0x13b/0x210
[ 2084.181081] rmqueue+0x7d3/0xd40
[ 2084.184316] ? xas_load+0x9/0xa0
[ 2084.187547] ? xas_find+0x183/0x1d0
[ 2084.191041] ? xa_find_after+0xd0/0x130
[ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0
[ 2084.199759] get_page_from_freelist+0x11f/0x530
[ 2084.204291] __alloc_pages+0xf2/0x250
[ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
[ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice]
[ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice]
[ 2084.221330] __napi_poll+0x27/0x170
[ 2084.224824] net_rx_action+0x233/0x2f0
[ 2084.228575] __do_softirq+0xc7/0x2ac
[ 2084.232155] __irq_exit_rcu+0xa1/0xc0
[ 2084.235821] common_interrupt+0x80/0xa0
[ 2084.239662] </IRQ>
[ 2084.241768] <TASK>
The fix is mostly about reverting what was done in commit 1dc1a7e7f410
("ice: Centrallize Rx buffer recycling") followed by proper timing on
page_count() storage and then removing the ice_rx_buf::act related logic
(which was mostly introduced for purposes from cited commit).
Special thanks to Xu Du for providing reproducer and Jacob Keller for
initial extensive analysis.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: stop storing XDP verdict within ice_rx_buf
ice: gather page_count()'s of each frag right before XDP prog call
ice: put Rx buffers after being done with current frame
====================
Link: https://patch.msgid.link/20250131185415.3741532-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
->d_op can bloody well be NULL
Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 30d61efe118c "9p: fix ->rename_sem exclusion"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa
debug tool") partially reverted some changes to workaround breakage
caused to mesa tools. However, in doing so it also broke fetching the
GuC log via debugfs since xe_print_blob_ascii85() simply bails out.
The fix is to avoid the extra newlines: the devcoredump interface is
line-oriented and adding random newlines in the middle breaks it. If a
tool is able to parse it by looking at the data and checking for chars
that are out of the ascii85 space, it can still do so. A format change
that breaks the line-oriented output on devcoredump however needs better
coordination with existing tools.
v2: Add suffix description comment
v3: Reword explanation of xe_print_blob_ascii85() calling drm_puts()
in a loop
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: stable@vger.kernel.org
Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool")
Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper function")
Link: https://patchwork.freedesktop.org/patch/msgid/20250123202307.95103-2-jose.souza@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 2c95bbf5002776117a69caed3b31c10bf7341bec)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Having the exec queue snapshot inside a "GuC CT" section was always
wrong. Commit c28fd6c358db ("drm/xe/devcoredump: Improve section
headings and add tile info") tried to fix that bug, but with that also
broke the mesa tool that parses the devcoredump, hence it was reverted
in commit a53da2fb25a3 ("drm/xe: Revert some changes that break a mesa
debug tool").
With the mesa tool also fixed, this can propagate as a fix on both
kernel and userspace side to avoid unnecessary headache for a debug
feature.
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: stable@vger.kernel.org
Fixes: a53da2fb25a3 ("drm/xe: Revert some changes that break a mesa debug tool")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250123051112.1938193-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit a37934ea75d331fafa7fe80b6180642ba5193422)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
We rely on stream->pollin to decide whether or not to block during
poll/read calls. However, currently there are blocking read code paths
which don't even set stream->pollin. The best place to consistently set
stream->pollin for all code paths is therefore to set it in
xe_oa_buffer_check_unlocked.
Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250115222029.3002103-1-ashutosh.dixit@intel.com
(cherry picked from commit d3fedff828bb7e4a422c42caeafd5d974e24ee43)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The migration support only needs to be initialized once, but it
was incorrectly called from the xe_gt_sriov_pf_init_hw(), which
is part of the reset flow and may be called multiple times.
Fixes: d86e3737c7ab ("drm/xe/pf: Add functions to save and restore VF GuC state")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250120232443.544-1-michal.wajdeczko@intel.com
(cherry picked from commit 9ebb5846e1a3b1705f8a7cbc528888a1aa0b163e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
UMD's have interest in setting unused bits of the oa_ctrl register "out of
band" for certain experiments. To facilitate this, don't clobber previous
oa_ctrl unused bits, i.e. rmw the values rather than simply write them.
Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250117032155.3048063-1-ashutosh.dixit@intel.com
(cherry picked from commit cfa9d40db8c30d894171010fe765d96e9bc6a47e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
[WHY]
When the system powers up eDP with external monitors in seamless boot
sequence, stutter get enabled before TTU and HUBP registers being
programmed, which resulting in underflow.
[HOW]
Enable TTU in hubp_init.
Change the sequence that do not perpare_bandwidth and optimize_bandwidth
while having seamless boot streams.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Paul Hsieh <paul.hsieh@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHAT & HOW]
hpo_stream_to_link_encoder_mapping has size MAX_HPO_DP2_ENCODERS(=4),
but location can have size up to 6. As a result, it is necessary to
check location against MAX_HPO_DP2_ENCODERS.
Similiarly, disp_cfg_stream_location can be used as an array index which
should be 0..5, so the ASSERT's conditions should be less without equal.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3904
Reviewed-by: Austin Zheng <Austin.Zheng@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Vulkan can't support DCC and Z/S compression on GFX12 without
WRITE_COMPRESS_DISABLE in this commit or a completely different DCC
interface.
AMDGPU_TILING_GFX12_SCANOUT is added because it's already used by userspace.
Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
- Properly cast the input to secs_to_jiffies() to unsigned long as
otherwise the result uses the data type of the input variable, which
causes result range checks to fail if the input data type is signed
and smaller than unsigned long.
- Handle late armed hrtimers gracefully on CPU hotplug
There are legitimate cases where a hrtimer is (re)armed on an
outgoing CPU after the timers have been migrated away. This triggers
warnings and caused people to implement horrible workarounds in RCU.
But those workarounds are incomplete and do not cover e.g. the
scheduler hrtimers.
Stop this by force moving timer which are enqueued on the current CPU
after timer migration to be queued on a remote online CPU.
This allows to undo the workarounds in a seperate step.
- Demote a warning level printk() to info level in the clocksource
watchdog code as there is no point to emit a warning level message
for a purely informational message.
- Mark a helper function __always_inline and move it into the existing
#ifdef block to avoid 'unused function' warnings from CLANG
* tag 'timers-urgent-2025-02-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
jiffies: Cast to unsigned long in secs_to_jiffies() conversion
clocksource: Use pr_info() for "Checking clocksource synchronization" message
hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING
hrtimers: Mark is_migration_base() with __always_inline
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
- Ensure ordering of memory and device I/O for IPIs on RISCV
The RISCV interrupt controllers use writel_relaxed() for generating
an IPI. That's a device I/O write which is not guaranteed to be
ordered against preceding memory writes. As a consequence a IPI
receiving CPU might not be able to observe the actual IPI data which
is required to handle it. Switch to writel() which contains the
necessary memory barriers to enforce ordering.
- Fix up the fallout of the MSI conversion in the MVEVBU ICU driver.
The conversion failed to handle the change of the data storage and
kept the original code which uses the domain::host_data pointer
unchanged. After the conversion domain::host_data points to the new
msi_domain_info structure and not longer to the MVEBU specific MSI
data, which is now stored in a member of msi_domain_info. This leads
to malfunction of the transalate() callback.
- Only handle the PMC in FIQ mode when it is configured that way.
The original check was incorrect as it did not explicitely check for
the proper conditions, which led to malfunctions of the PMU
interrupt.
- Improve Kconfig dependencies for the LAN966x Outband Interrupt
controller to avoid pointless pronmpts.
* tag 'irq-urgent-2025-02-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/apple-aic: Only handle PMC interrupt as FIQ when configured so
irqchip/irq-mvebu-icu: Fix access to msi_data from irq_domain::host_data
irqchip/riscv: Ensure ordering of memory writes and IPI writes
irqchip/lan966x-oic: Make CONFIG_LAN966X_OIC depend on CONFIG_MCHP_LAN966X_PCI
dt-bindings: interrupt-controller: microchip,lan966x-oic: Clarify endpoint use
|
|
Pull xfs bug fixes from Carlos Maiolino:
"A few fixes for XFS, but the most notable one is:
- xfs: remove xfs_buf_cache.bc_lock
which has been hit by different persons including syzbot"
* tag 'xfs-fixes-6.14-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: remove xfs_buf_cache.bc_lock
xfs: Add error handling for xfs_reflink_cancel_cow_range
xfs: Propagate errors from xfs_reflink_cancel_cow_range in xfs_dax_write_iomap_end
xfs: don't call remap_verify_area with sb write protection held
xfs: remove an out of data comment in _xfs_buf_alloc
xfs: fix the entry condition of exact EOF block allocation optimization
|
|
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.14
- Connection fixes for fibre channel transport (Daniel)
- Endian fixes (Keith, Christoph)
- Cleanup fix for host memory buffer (Francis)
- Platform specific power quirks (Georg)
- Target memory leak (Sagi)
- Use appropriate controller state accessor (Daniel)"
* tag 'nvme-6.14-2025-01-31' of git://git.infradead.org/nvme:
nvme-fc: use ctrl state getter
nvme: make nvme_tls_attrs_group static
nvmet: add a missing endianess conversion in nvmet_execute_admin_connect
nvmet: the result field in nvmet_alloc_ctrl_args is little endian
nvmet: fix a memory leak in controller identify
nvme-fc: do not ignore connectivity loss during connecting
nvme: handle connectivity loss in nvme_set_queue_count
nvme-fc: go straight to connecting state when initializing
nvme-pci: Add TUXEDO IBP Gen9 to Samsung sleep quirk
nvme-pci: Add TUXEDO InfinityFlex to Samsung sleep quirk
nvme-pci: remove redundant dma frees in hmb
nvmet: fix rw control endian access
|
|
atomic context
The following bug report happened with a PREEMPT_RT kernel:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
get_random_u32+0x4f/0x110
clocksource_verify_choose_cpus+0xab/0x1a0
clocksource_verify_percpu.part.0+0x6b/0x330
clocksource_watchdog_kthread+0x193/0x1a0
It is due to the fact that clocksource_verify_choose_cpus() is invoked with
preemption disabled. This function invokes get_random_u32() to obtain
random numbers for choosing CPUs. The batched_entropy_32 local lock and/or
the base_crng.lock spinlock in driver/char/random.c will be acquired during
the call. In PREEMPT_RT kernel, they are both sleeping locks and so cannot
be acquired in atomic context.
Fix this problem by using migrate_disable() to allow smp_processor_id() to
be reliably used without introducing atomic context. preempt_disable() is
then called after clocksource_verify_choose_cpus() but before the
clocksource measurement is being run to avoid introducing unexpected
latency.
Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/20250131173323.891943-2-longman@redhat.com
|
|
The scale() functions detects invalid parameters, but continues
its calculations anyway. This causes bad results if negative values
are used for unsigned operations. Worst case, a division by 0 error
will be seen if source_min == source_max.
On top of that, after v6.13, the sequence of WARN_ON() followed by clamp()
may result in a build error with gcc 13.x.
drivers/gpu/drm/i915/display/intel_backlight.c: In function 'scale':
include/linux/compiler_types.h:542:45: error:
call to '__compiletime_assert_415' declared with attribute error:
clamp() low limit source_min greater than high limit source_max
This happens if the compiler decides to rearrange the code as follows.
if (source_min > source_max) {
WARN(..);
/* Do the clamp() knowing that source_min > source_max */
source_val = clamp(source_val, source_min, source_max);
} else {
/* Do the clamp knowing that source_min <= source_max */
source_val = clamp(source_val, source_min, source_max);
}
Fix the problem by evaluating the return values from WARN_ON and returning
immediately after a warning. While at it, fix divide by zero error seen
if source_min == source_max.
Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: David Laight <david.laight.linux@gmail.com>
Cc: David Laight <david.laight.linux@gmail.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250121145203.2851237-1-linux@roeck-us.net
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 6f71507415841d1a6d38118e5fa0eaf0caab9c17)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Currently, intel_dp_dsc_max_src_input_bpc can return 0 for platforms not
supporting DSC, which could theoretically cause issues in clamp()
due to a low limit being greater than the high limit.
Instead, return the minimum bpc supported by the source to prevent
such issues.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYtNfM399_=_ff81zeRJv=0+z7oFJfPGmJgTp6yrJmU+1w@mail.gmail.com/
Fixes: 160672b86b0d ("drm/i915/dp: Use clamp for pipe_bpp limits with DSC")
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250131041342.3086716-1-ankit.k.nautiyal@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit a67221b5eb8d59fb7e1f0df3ef9945b6a0f32cca)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|