summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-02-04KVM: selftests: Fix spelling mistake "initally" -> "initially"Colin Ian King
There is a spelling mistake in a literal string and in the function test_get_inital_dirty. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-ID: <20250204105647.367743-1-colin.i.king@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-04Merge tag 'kvmarm-fixes-6.14-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.14, take #1 - Correctly clean the BSS to the PoC before allowing EL2 to access it on nVHE/hVHE/protected configurations - Propagate ownership of debug registers in protected mode after the rework that landed in 6.14-rc1 - Stop pretending that we can run the protected mode without a GICv3 being present on the host - Fix a use-after-free situation that can occur if a vcpu fails to initialise the NV shadow S2 MMU contexts - Always evaluate the need to arm a background timer for fully emulated guest timers - Fix the emulation of EL1 timers in the absence of FEAT_ECV - Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0
2025-02-04Merge tag 'kvm-s390-next-6.14-2' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - some selftest fixes - move some kvm-related functions from mm into kvm - remove all usage of page->index and page->lru from kvm - fixes and cleanups for vsie
2025-02-04kvm: x86: SRSO_USER_KERNEL_NO is not synthesizedPaolo Bonzini
SYNTHESIZED_F() generally is used together with setup_force_cpu_cap(), i.e. when it makes sense to present the feature even if cpuid does not have it *and* the VM is not able to see the difference. For example, it can be used when mitigations on the host automatically protect the guest as well. The "SYNTHESIZED_F(SRSO_USER_KERNEL_NO)" line came in as a conflict resolution between the CPUID overhaul from the KVM tree and support for the feature in the x86 tree. Using it right now does not hurt, or make a difference for that matter, because there is no setup_force_cpu_cap(X86_FEATURE_SRSO_USER_KERNEL_NO). However, it is a little less future proof in case such a setup_force_cpu_cap() appears later, for a case where the kernel somehow is not vulnerable but the guest would have to apply the mitigation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-04KVM: arm64: timer: Don't adjust the EL2 virtual timer offsetMarc Zyngier
The way we deal with the EL2 virtual timer is a bit odd. We try to cope with E2H being flipped, and adjust which offset applies to that timer depending on the current E2H value. But that's a complexity we shouldn't have to worry about. What we have to deal with is either E2H being RES1, in which case there is no offset, or E2H being RES0, and the virtual timer simply does not exist. Drop the adjusting of the timer offset, which makes things a bit simpler. At the same time, make sure that accessing the HV timer when E2H is RES0 results in an UNDEF in the guest. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250204110050.150560-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-04KVM: arm64: timer: Correctly handle EL1 timer emulation when !FEAT_ECVMarc Zyngier
Both Wei-Lin Chang and Volodymyr Babchuk report that the way we handle the emulation of EL1 timers with NV is completely wrong, specially in the case of HCR_EL2.E2H==0. There are three problems in about as many lines of code: - With E2H==0, the EL1 timers are overwritten with the EL1 state, while they should actually contain the EL2 state (as per the timer map) - With E2H==1, we run the full EL1 timer emulation even when ECV is present, hiding a bug in timer_emulate() (see previous patch) - The comments are actively misleading, and say all the wrong things. This is only attributable to the code having been initially written for FEAT_NV, hacked up to handle FEAT_NV2 *in parallel*, and vaguely hacked again to be FEAT_NV2 only. Oh, and yours truly being a gold plated idiot. The fix is obvious: just delete most of the E2H==0 code, have a unified handling of the timers (because they really are E2H agnostic), and make sure we don't execute any of that when FEAT_ECV is present. Fixes: 4bad3068cfa9f ("KVM: arm64: nv: Sync nested timer state with FEAT_NV2") Reported-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw> Reported-by: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> Link: https://lore.kernel.org/r/fqiqfjzwpgbzdtouu2pwqlu7llhnf5lmy4hzv5vo6ph4v3vyls@jdcfy3fjjc5k Link: https://lore.kernel.org/r/87frl51tse.fsf@epam.com Tested-by: Dmytro Terletskyi <dmytro_terletskyi@epam.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250204110050.150560-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-04KVM: arm64: timer: Always evaluate the need for a soft timerMarc Zyngier
When updating the interrupt state for an emulated timer, we return early and skip the setup of a soft timer that runs in parallel with the guest. While this is OK if we have set the interrupt pending, it is pretty wrong if the guest moved CVAL into the future. In that case, no timer is armed and the guest can wait for a very long time (it will take a full put/load cycle for the situation to resolve). This is specially visible with EDK2 running at EL2, but still using the EL1 virtual timer, which in that case is fully emulated. Any key-press takes ages to be captured, as there is no UART interrupt and EDK2 relies on polling from a timer... The fix is simply to drop the early return. If the timer interrupt is pending, we will still return early, and otherwise arm the soft timer. Fixes: 4d74ecfa6458b ("KVM: arm64: Don't arm a hrtimer for an already pending timer") Cc: stable@vger.kernel.org Tested-by: Dmytro Terletskyi <dmytro_terletskyi@epam.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250204110050.150560-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-04KVM: arm64: Fix nested S2 MMU structures reallocationMarc Zyngier
For each vcpu that userspace creates, we allocate a number of s2_mmu structures that will eventually contain our shadow S2 page tables. Since this is a dynamically allocated array, we reallocate the array and initialise the newly allocated elements. Once everything is correctly initialised, we adjust pointer and size in the kvm structure, and move on. But should that initialisation fail *and* the reallocation triggered a copy to another location, we end-up returning early, with the kvm structure still containing the (now stale) old pointer. Weeee! Cure it by assigning the pointer early, and use this to perform the initialisation. If everything succeeds, we adjust the size. Otherwise, we just leave the size as it was, no harm done, and the new memory is as good as the ol' one (we hope...). Fixes: 4f128f8e1aaac ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures") Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/r/20250204145554.774427-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-04rxrpc: Fix the rxrpc_connection attend queue handlingDavid Howells
The rxrpc_connection attend queue is never used because conn::attend_link is never initialised and so is always NULL'd out and thus always appears to be busy. This requires the following fix: (1) Fix this the attend queue problem by initialising conn::attend_link. And, consequently, two further fixes for things masked by the above bug: (2) Fix rxrpc_input_conn_event() to handle being invoked with a NULL sk_buff pointer - something that can now happen with the above change. (3) Fix the RXRPC_SKB_MARK_SERVICE_CONN_SECURED message to carry a pointer to the connection and a ref on it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jakub Kicinski <kuba@kernel.org> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Paolo Abeni <pabeni@redhat.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Fixes: f2cce89a074e ("rxrpc: Implement a mechanism to send an event notification to a connection") Link: https://patch.msgid.link/20250203110307.7265-3-dhowells@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-04KVM: arm64: Fail protected mode init if no vgic hardware is presentOliver Upton
Protected mode assumes that at minimum vgic-v3 is present, however KVM fails to actually enforce this at the time of initialization. As such, when running protected mode in a half-baked state on GICv2 hardware we see the hyp go belly up at vcpu_load() when it tries to restore the vgic-v3 cpuif: $ ./arch_timer_edge_cases [ 130.599140] kvm [4518]: nVHE hyp panic at: [<ffff800081102b58>] __kvm_nvhe___vgic_v3_restore_vmcr_aprs+0x8/0x84! [ 130.603685] kvm [4518]: Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE [ 130.611962] kvm [4518]: Hyp Offset: 0xfffeca95ed000000 [ 130.617053] Kernel panic - not syncing: HYP panic: [ 130.617053] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000 [ 130.617053] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000 [ 130.617053] VCPU:0000000000000000 [ 130.638013] CPU: 0 UID: 0 PID: 4518 Comm: arch_timer_edge Tainted: G C 6.13.0-rc3-00009-gf7d03fcbf1f4 #1 [ 130.648790] Tainted: [C]=CRAP [ 130.651721] Hardware name: Libre Computer AML-S905X-CC (DT) [ 130.657242] Call trace: [ 130.659656] show_stack+0x18/0x24 (C) [ 130.663279] dump_stack_lvl+0x38/0x90 [ 130.666900] dump_stack+0x18/0x24 [ 130.670178] panic+0x388/0x3e8 [ 130.673196] nvhe_hyp_panic_handler+0x104/0x208 [ 130.677681] kvm_arch_vcpu_load+0x290/0x548 [ 130.681821] vcpu_load+0x50/0x80 [ 130.685013] kvm_arch_vcpu_ioctl_run+0x30/0x868 [ 130.689498] kvm_vcpu_ioctl+0x2e0/0x974 [ 130.693293] __arm64_sys_ioctl+0xb4/0xec [ 130.697174] invoke_syscall+0x48/0x110 [ 130.700883] el0_svc_common.constprop.0+0x40/0xe0 [ 130.705540] do_el0_svc+0x1c/0x28 [ 130.708818] el0_svc+0x30/0xd0 [ 130.711837] el0t_64_sync_handler+0x10c/0x138 [ 130.716149] el0t_64_sync+0x198/0x19c [ 130.719774] SMP: stopping secondary CPUs [ 130.723660] Kernel Offset: disabled [ 130.727103] CPU features: 0x000,00000800,02800000,0200421b [ 130.732537] Memory Limit: none [ 130.735561] ---[ end Kernel panic - not syncing: HYP panic: [ 130.735561] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000 [ 130.735561] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000 [ 130.735561] VCPU:0000000000000000 ]--- Fix it by failing KVM initialization if the system doesn't implement vgic-v3, as protected mode will never do anything useful on such hardware. Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/kvmarm/5ca7588c-7bf2-4352-8661-e4a56a9cd9aa@sirena.org.uk/ Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250203231543.233511-1-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-04platform/x86/intel/ifs: Update documentation with image download pathJithu Joseph
The documentation previously listed the path to download In Field Scan (IFS) test images as "TBD". Update the documentation to include the correct image download location. Also move the download link to the appropriate section within the documentation. Reported-by: Anisse Astier <anisse@astier.eu> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Link: https://lore.kernel.org/r/20250131205315.1585663-1-jithu.joseph@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-02-03net: harmonize tstats and dstatsPaolo Abeni
After the blamed commits below, some UDP tunnel use dstats for accounting. On the xmit path, all the UDP-base tunnels ends up using iptunnel_xmit_stats() for stats accounting, and the latter assumes the relevant (tunnel) network device uses tstats. The end result is some 'funny' stat report for the mentioned UDP tunnel, e.g. when no packet is actually dropped and a bunch of packets are transmitted: gnv2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue \ state UNKNOWN mode DEFAULT group default qlen 1000 link/ether ee:7d:09:87:90:ea brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 14916 23 0 15 0 0 TX: bytes packets errors dropped carrier collsns 0 1566 0 0 0 0 Address the issue ensuring the same binary layout for the overlapping fields of dstats and tstats. While this solution is a bit hackish, is smaller and with no performance pitfall compared to other alternatives i.e. supporting both dstat and tstat in iptunnel_xmit_stats() or reverting the blamed commit. With time we should possibly move all the IP-based tunnel (and virtual devices) to dstats. Fixes: c77200c07491 ("bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Fixes: 6fa6de302246 ("geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Fixes: be226352e8dc ("vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/2e1c444cf0f63ae472baff29862c4c869be17031.1738432804.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03Merge branch 'ethtool-rss-minor-fixes-for-recent-rss-changes'Jakub Kicinski
Jakub Kicinski says: ==================== ethtool: rss: minor fixes for recent RSS changes Make sure RSS_GET messages are consistent in do and dump. Fix up a recently added safety check for RSS + queue offset. Adjust related tests so that they pass on devices which don't support RSS + queue offset. ==================== Link: https://patch.msgid.link/20250201013040.725123-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03selftests: drv-net: rss_ctx: don't fail reconfigure test if queue offset not ↵Jakub Kicinski
supported Vast majority of drivers does not support queue offset. Simply return if the rss context + queue ntuple fails. Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03selftests: drv-net: rss_ctx: add missing cleanup in queue reconfigureJakub Kicinski
Commit under Fixes adds ntuple rules but never deletes them. Fixes: 29a4bc1fe961 ("selftest: extend test_rss_context_queue_reconfigure for action addition") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03ethtool: ntuple: fix rss + ring_cookie checkJakub Kicinski
The info.flow_type is for RXFH commands, ntuple flow_type is inside the flow spec. The check currently does nothing, as info.flow_type is 0 (or even uninitialized by user space) for ETHTOOL_SRXCLSRLINS. Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in") Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03ethtool: rss: fix hiding unsupported fields in dumpsJakub Kicinski
Commit ec6e57beaf8b ("ethtool: rss: don't report key if device doesn't support it") intended to stop reporting key fields for additional rss contexts if device has a global hashing key. Later we added dump support and the filtering wasn't properly added there. So we end up reporting the key fields in dumps but not in dos: # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --do rss-get \ --json '{"header": {"dev-index":2}, "context": 1 }' { "header": { ... }, "context": 1, "indir": [0, 1, 2, 3, ...]] } # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --dump rss-get [ ... snip context 0 ... { "header": { ... }, "context": 1, "indir": [0, 1, 2, 3, ...], -> "input_xfrm": 255, -> "hfunc": 1, -> "hkey": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" } ] Hide these fields correctly. The drivers/net/hw/rss_ctx.py selftest catches this when run on a device with single key, already: # Check| At /root/./ksft-net-drv/drivers/net/hw/rss_ctx.py, line 381, in test_rss_context_dump: # Check| ksft_ne(set(data.get('hkey', [1])), {0}, "key is all zero") # Check failed {0} == {0} key is all zero not ok 8 rss_ctx.test_rss_context_dump Fixes: f6122900f4e2 ("ethtool: rss: support dumping RSS contexts") Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-04kthread: Fix return value on kzalloc() failure in kthread_affine_preferred()Yu-Chun Lin
kthread_affine_preferred() incorrectly returns 0 instead of -ENOMEM when kzalloc() fails. Return 'ret' to ensure the correct error code is propagated. Fixes: 4d13f4304fa4 ("kthread: Implement preferred affinity") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501301528.t0cZVbnq-lkp@intel.com/ Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-02-03scsi: qla1280: Fix kernel oops when debug level > 2Magnus Lindholm
A null dereference or oops exception will eventually occur when qla1280.c driver is compiled with DEBUG_QLA1280 enabled and ql_debug_level > 2. I think its clear from the code that the intention here is sg_dma_len(s) not length of sg_next(s) when printing the debug info. Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20250125095033.26188-1-linmag7@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: core: Fix error return with query responseSeunghui Lee
There is currently no mechanism to return error from query responses. Return the error and print the corresponding error message with it. Signed-off-by: Seunghui Lee <sh043.lee@samsung.com> Link: https://lore.kernel.org/r/20250118023808.24726-1-sh043.lee@samsung.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: storvsc: Set correct data length for sending SCSI command without payloadLong Li
In StorVSC, payload->range.len is used to indicate if this SCSI command carries payload. This data is allocated as part of the private driver data by the upper layer and may get passed to lower driver uninitialized. For example, the SCSI error handling mid layer may send TEST_UNIT_READY or REQUEST_SENSE while reusing the buffer from a failed command. The private data section may have stale data from the previous command. If the SCSI command doesn't carry payload, the driver may use this value as is for communicating with host, resulting in possible corruption. Fix this by always initializing this value. Fixes: be0cf6ca301c ("scsi: storvsc: Set the tablesize based on the information given by the host") Cc: stable@kernel.org Tested-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: core: Fix use-after free in init error and remove pathsAndré Draszik
devm_blk_crypto_profile_init() registers a cleanup handler to run when the associated (platform-) device is being released. For UFS, the crypto private data and pointers are stored as part of the ufs_hba's data structure 'struct ufs_hba::crypto_profile'. This structure is allocated as part of the underlying ufshcd and therefore Scsi_host allocation. During driver release or during error handling in ufshcd_pltfrm_init(), this structure is released as part of ufshcd_dealloc_host() before the (platform-) device associated with the crypto call above is released. Once this device is released, the crypto cleanup code will run, using the just-released 'struct ufs_hba::crypto_profile'. This causes a use-after-free situation: Call trace: kfree+0x60/0x2d8 (P) kvfree+0x44/0x60 blk_crypto_profile_destroy_callback+0x28/0x70 devm_action_release+0x1c/0x30 release_nodes+0x6c/0x108 devres_release_all+0x98/0x100 device_unbind_cleanup+0x20/0x70 really_probe+0x218/0x2d0 In other words, the initialisation code flow is: platform-device probe ufshcd_pltfrm_init() ufshcd_alloc_host() scsi_host_alloc() allocation of struct ufs_hba creation of scsi-host devices devm_blk_crypto_profile_init() devm registration of cleanup handler using platform-device and during error handling of ufshcd_pltfrm_init() or during driver removal: ufshcd_dealloc_host() scsi_host_put() put_device(scsi-host) release of struct ufs_hba put_device(platform-device) crypto cleanup handler To fix this use-after free, change ufshcd_alloc_host() to register a devres action to automatically cleanup the underlying SCSI device on ufshcd destruction, without requiring explicit calls to ufshcd_dealloc_host(). This way: * the crypto profile and all other ufs_hba-owned resources are destroyed before SCSI (as they've been registered after) * a memleak is plugged in tc-dwc-g210-pci.c remove() as a side-effect * EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as it's not needed anymore * no future drivers using ufshcd_alloc_host() could ever forget adding the cleanup Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile") Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()") Cc: stable@vger.kernel.org Signed-off-by: André Draszik <andre.draszik@linaro.org> Link: https://lore.kernel.org/r/20250124-ufshcd-fix-v4-1-c5d0144aae59@linaro.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Acked-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: core: Do not retry I/Os during depopulationIgor Pylypiv
Fail I/Os instead of retry to prevent user space processes from being blocked on the I/O completion for several minutes. Retrying I/Os during "depopulation in progress" or "depopulation restore in progress" results in a continuous retry loop until the depopulation completes or until the I/O retry loop is aborted due to a timeout by the scsi_cmd_runtime_exceeced(). Depopulation is slow and can take 24+ hours to complete on 20+ TB HDDs. Most I/Os in the depopulation retry loop end up taking several minutes before returning the failure to user space. Cc: stable@vger.kernel.org # 4.18.x: 2bbeb8d scsi: core: Handle depopulation and restoration in progress Cc: stable@vger.kernel.org # 4.18.x Fixes: e37c7d9a0341 ("scsi: core: sanitize++ in progress") Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20250131184408.859579-1-ipylypiv@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: core: Use GFP_NOIO to avoid circular locking dependencyRik van Riel
Filesystems can write to disk from page reclaim with __GFP_FS set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in page reclaim with GFP_KERNEL, where it could try to take filesystem locks again, leading to a deadlock. WARNING: possible circular locking dependency detected 6.13.0 #1 Not tainted ------------------------------------------------------ kswapd0/70 is trying to acquire lock: ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0 but task is already holding lock: ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760 The full lockdep splat can be found in Marc's report: https://lkml.org/lkml/2025/1/24/1101 Avoid the potential deadlock by doing the allocation with GFP_NOIO, which prevents both filesystem and block layer recursion. Reported-by: Marc Aurèle La France <tsi@tuyoix.net> Signed-off-by: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: Fix toggling of clk_gating.state when clock gating is not allowedAvri Altman
This commit addresses an issue where clk_gating.state is being toggled in ufshcd_setup_clocks() even if clock gating is not allowed. The fix is to add a check for hba->clk_gating.is_initialized before toggling clk_gating.state in ufshcd_setup_clocks(). Since clk_gating.lock is now initialized unconditionally, it can no longer lead to the spinlock being used before it is properly initialized, but instead it is mostly for documentation purposes. Fixes: 1ab27c9cf8b6 ("ufs: Add support for clock gating") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20250128071207.75494-3-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: core: Ensure clk_gating.lock is used only after initializationAvri Altman
Address a lockdep warning triggered by the use of the clk_gating.lock before it is properly initialized. The warning is as follows: [ 4.388838] INFO: trying to register non-static key. [ 4.395673] The code is fine but needs lockdep annotation, or maybe [ 4.402118] you didn't initialize this object before use? [ 4.407673] turning off the locking correctness validator. [ 4.413334] CPU: 5 UID: 0 PID: 58 Comm: kworker/u32:1 Not tainted 6.12-rc1 #185 [ 4.413343] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) [ 4.413362] Call trace: [ 4.413364] show_stack+0x18/0x24 (C) [ 4.413374] dump_stack_lvl+0x90/0xd0 [ 4.413384] dump_stack+0x18/0x24 [ 4.413392] register_lock_class+0x498/0x4a8 [ 4.413400] __lock_acquire+0xb4/0x1b90 [ 4.413406] lock_acquire+0x114/0x310 [ 4.413413] _raw_spin_lock_irqsave+0x60/0x88 [ 4.413423] ufshcd_setup_clocks+0x2c0/0x490 [ 4.413433] ufshcd_init+0x198/0x10ec [ 4.413437] ufshcd_pltfrm_init+0x600/0x7c0 [ 4.413444] ufs_qcom_probe+0x20/0x58 [ 4.413449] platform_probe+0x68/0xd8 [ 4.413459] really_probe+0xbc/0x268 [ 4.413466] __driver_probe_device+0x78/0x12c [ 4.413473] driver_probe_device+0x40/0x11c [ 4.413481] __device_attach_driver+0xb8/0xf8 [ 4.413489] bus_for_each_drv+0x84/0xe4 [ 4.413495] __device_attach+0xfc/0x18c [ 4.413502] device_initial_probe+0x14/0x20 [ 4.413510] bus_probe_device+0xb0/0xb4 [ 4.413517] deferred_probe_work_func+0x8c/0xc8 [ 4.413524] process_scheduled_works+0x250/0x658 [ 4.413534] worker_thread+0x15c/0x2c8 [ 4.413542] kthread+0x134/0x200 [ 4.413550] ret_from_fork+0x10/0x20 To fix this issue, ensure that the spinlock is only used after it has been properly initialized before using it in ufshcd_setup_clocks(). Do that unconditionally as initializing a spinlock is a fast operation. Fixes: 209f4e43b806 ("scsi: ufs: core: Introduce a new clock_gating lock") Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20250128071207.75494-2-avri.altman@wdc.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull d_revalidate fix from Al Viro: "Fix a braino in d_revalidate series: check ->d_op for NULL" * tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix braino in "9p: fix ->rename_sem exclusion"
2025-02-03Merge branch '6.14/scsi-queue' into 6.14/scsi-fixesMartin K. Petersen
Pull outstanding fixes bound for this release into 6.14/scsi-fixes. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge branch 'maintainers-recognize-kuniyuki-iwashima-as-a-maintainer'Jakub Kicinski
Jakub Kicinski says: ==================== MAINTAINERS: recognize Kuniyuki Iwashima as a maintainer Kuniyuki Iwashima has been a prolific contributor and trusted reviewer for some core portions of the networking stack for a couple of years now. Formalize some obvious areas of his expertise and list him as a maintainer. ==================== Link: https://patch.msgid.link/20250202014728.1005003-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: add entry for UNIX socketsJakub Kicinski
Add a MAINTAINERS entry for UNIX socket, Kuniyuki has been the de-facto maintainer of this code for a while. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250202014728.1005003-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: add a general entry for BSD socketsJakub Kicinski
Create a MAINTAINERS entry for BSD sockets. List the top 3 reviewers as maintainers. The entry is meant to cover core socket code (of which there isn't much) but also reviews of any new socket families. Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250202014728.1005003-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: add Kuniyuki Iwashima to TCP reviewersJakub Kicinski
List Kuniyuki as an official TCP reviewer. Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250202014728.1005003-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: list openvswitch docs under its entryJakub Kicinski
Submissions to the docs seem to not get properly CCed. Acked-by: Ilya Maximets <i.maximets@ovn.org> Link: https://patch.msgid.link/20250202005024.964262-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski says: This patchset fixes a pretty nasty issue that was reported by RedHat folks which occurred after ~30 minutes (this value varied, just trying here to state that it was not observed immediately but rather after a considerable longer amount of time) when ice driver was tortured with jumbo frames via mix of iperf traffic executed simultaneously with wrk/nginx on client/server sides (HTTP and TCP workloads basically). The reported splats were spanning across all the bad things that can happen to the state of page - refcount underflow, use-after-free, etc. One of these looked as follows: [ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0 [ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0 [ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) [ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000 [ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000 [ 2084.057468] page dumped because: nonzero _refcount [ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd [ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1 [ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023 [ 2084.154604] Call Trace: [ 2084.157058] <IRQ> [ 2084.159080] dump_stack_lvl+0x34/0x48 [ 2084.162752] bad_page.cold+0x63/0x94 [ 2084.166333] check_new_pages+0xb3/0xe0 [ 2084.170083] rmqueue_bulk+0x2d2/0x9e0 [ 2084.173749] ? ktime_get+0x35/0xa0 [ 2084.177159] rmqueue_pcplist+0x13b/0x210 [ 2084.181081] rmqueue+0x7d3/0xd40 [ 2084.184316] ? xas_load+0x9/0xa0 [ 2084.187547] ? xas_find+0x183/0x1d0 [ 2084.191041] ? xa_find_after+0xd0/0x130 [ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0 [ 2084.199759] get_page_from_freelist+0x11f/0x530 [ 2084.204291] __alloc_pages+0xf2/0x250 [ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice] [ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice] [ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice] [ 2084.221330] __napi_poll+0x27/0x170 [ 2084.224824] net_rx_action+0x233/0x2f0 [ 2084.228575] __do_softirq+0xc7/0x2ac [ 2084.232155] __irq_exit_rcu+0xa1/0xc0 [ 2084.235821] common_interrupt+0x80/0xa0 [ 2084.239662] </IRQ> [ 2084.241768] <TASK> The fix is mostly about reverting what was done in commit 1dc1a7e7f410 ("ice: Centrallize Rx buffer recycling") followed by proper timing on page_count() storage and then removing the ice_rx_buf::act related logic (which was mostly introduced for purposes from cited commit). Special thanks to Xu Du for providing reproducer and Jacob Keller for initial extensive analysis. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: stop storing XDP verdict within ice_rx_buf ice: gather page_count()'s of each frag right before XDP prog call ice: put Rx buffers after being done with current frame ==================== Link: https://patch.msgid.link/20250131185415.3741532-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03fix braino in "9p: fix ->rename_sem exclusion"Al Viro
->d_op can bloody well be NULL Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: 30d61efe118c "9p: fix ->rename_sem exclusion" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-02-03drm/xe: Fix and re-enable xe_print_blob_ascii85()Lucas De Marchi
Commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") partially reverted some changes to workaround breakage caused to mesa tools. However, in doing so it also broke fetching the GuC log via debugfs since xe_print_blob_ascii85() simply bails out. The fix is to avoid the extra newlines: the devcoredump interface is line-oriented and adding random newlines in the middle breaks it. If a tool is able to parse it by looking at the data and checking for chars that are out of the ascii85 space, it can still do so. A format change that breaks the line-oriented output on devcoredump however needs better coordination with existing tools. v2: Add suffix description comment v3: Reword explanation of xe_print_blob_ascii85() calling drm_puts() in a loop Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: stable@vger.kernel.org Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper function") Link: https://patchwork.freedesktop.org/patch/msgid/20250123202307.95103-2-jose.souza@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 2c95bbf5002776117a69caed3b31c10bf7341bec) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-03drm/xe/devcoredump: Move exec queue snapshot to Contexts sectionLucas De Marchi
Having the exec queue snapshot inside a "GuC CT" section was always wrong. Commit c28fd6c358db ("drm/xe/devcoredump: Improve section headings and add tile info") tried to fix that bug, but with that also broke the mesa tool that parses the devcoredump, hence it was reverted in commit a53da2fb25a3 ("drm/xe: Revert some changes that break a mesa debug tool"). With the mesa tool also fixed, this can propagate as a fix on both kernel and userspace side to avoid unnecessary headache for a debug feature. Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: stable@vger.kernel.org Fixes: a53da2fb25a3 ("drm/xe: Revert some changes that break a mesa debug tool") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250123051112.1938193-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit a37934ea75d331fafa7fe80b6180642ba5193422) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-03drm/xe/oa: Set stream->pollin in xe_oa_buffer_check_unlockedAshutosh Dixit
We rely on stream->pollin to decide whether or not to block during poll/read calls. However, currently there are blocking read code paths which don't even set stream->pollin. The best place to consistently set stream->pollin for all code paths is therefore to set it in xe_oa_buffer_check_unlocked. Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd") Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250115222029.3002103-1-ashutosh.dixit@intel.com (cherry picked from commit d3fedff828bb7e4a422c42caeafd5d974e24ee43) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-03drm/xe/pf: Fix migration initializationMichal Wajdeczko
The migration support only needs to be initialized once, but it was incorrectly called from the xe_gt_sriov_pf_init_hw(), which is part of the reset flow and may be called multiple times. Fixes: d86e3737c7ab ("drm/xe/pf: Add functions to save and restore VF GuC state") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250120232443.544-1-michal.wajdeczko@intel.com (cherry picked from commit 9ebb5846e1a3b1705f8a7cbc528888a1aa0b163e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-03drm/xe/oa: Preserve oa_ctrl unused bitsAshutosh Dixit
UMD's have interest in setting unused bits of the oa_ctrl register "out of band" for certain experiments. To facilitate this, don't clobber previous oa_ctrl unused bits, i.e. rmw the values rather than simply write them. Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd") Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250117032155.3048063-1-ashutosh.dixit@intel.com (cherry picked from commit cfa9d40db8c30d894171010fe765d96e9bc6a47e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-03drm/amd/display: Fix seamless boot sequenceLo-an Chen
[WHY] When the system powers up eDP with external monitors in seamless boot sequence, stutter get enabled before TTU and HUBP registers being programmed, which resulting in underflow. [HOW] Enable TTU in hubp_init. Change the sequence that do not perpare_bandwidth and optimize_bandwidth while having seamless boot streams. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Lo-an Chen <lo-an.chen@amd.com> Signed-off-by: Paul Hsieh <paul.hsieh@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-03drm/amd/display: Fix out-of-bound accessesAlex Hung
[WHAT & HOW] hpo_stream_to_link_encoder_mapping has size MAX_HPO_DP2_ENCODERS(=4), but location can have size up to 6. As a result, it is necessary to check location against MAX_HPO_DP2_ENCODERS. Similiarly, disp_cfg_stream_location can be used as an array index which should be 0..5, so the ASSERT's conditions should be less without equal. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3904 Reviewed-by: Austin Zheng <Austin.Zheng@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-03drm/amdgpu: add a BO metadata flag to disable write compression for VulkanMarek Olšák
Vulkan can't support DCC and Z/S compression on GFX12 without WRITE_COMPRESS_DISABLE in this commit or a completely different DCC interface. AMDGPU_TILING_GFX12_SCANOUT is added because it's already used by userspace. Cc: stable@vger.kernel.org # 6.12.x Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-03Merge tag 'timers-urgent-2025-02-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: - Properly cast the input to secs_to_jiffies() to unsigned long as otherwise the result uses the data type of the input variable, which causes result range checks to fail if the input data type is signed and smaller than unsigned long. - Handle late armed hrtimers gracefully on CPU hotplug There are legitimate cases where a hrtimer is (re)armed on an outgoing CPU after the timers have been migrated away. This triggers warnings and caused people to implement horrible workarounds in RCU. But those workarounds are incomplete and do not cover e.g. the scheduler hrtimers. Stop this by force moving timer which are enqueued on the current CPU after timer migration to be queued on a remote online CPU. This allows to undo the workarounds in a seperate step. - Demote a warning level printk() to info level in the clocksource watchdog code as there is no point to emit a warning level message for a purely informational message. - Mark a helper function __always_inline and move it into the existing #ifdef block to avoid 'unused function' warnings from CLANG * tag 'timers-urgent-2025-02-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: jiffies: Cast to unsigned long in secs_to_jiffies() conversion clocksource: Use pr_info() for "Checking clocksource synchronization" message hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING hrtimers: Mark is_migration_base() with __always_inline
2025-02-03Merge tag 'irq-urgent-2025-02-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - Ensure ordering of memory and device I/O for IPIs on RISCV The RISCV interrupt controllers use writel_relaxed() for generating an IPI. That's a device I/O write which is not guaranteed to be ordered against preceding memory writes. As a consequence a IPI receiving CPU might not be able to observe the actual IPI data which is required to handle it. Switch to writel() which contains the necessary memory barriers to enforce ordering. - Fix up the fallout of the MSI conversion in the MVEVBU ICU driver. The conversion failed to handle the change of the data storage and kept the original code which uses the domain::host_data pointer unchanged. After the conversion domain::host_data points to the new msi_domain_info structure and not longer to the MVEBU specific MSI data, which is now stored in a member of msi_domain_info. This leads to malfunction of the transalate() callback. - Only handle the PMC in FIQ mode when it is configured that way. The original check was incorrect as it did not explicitely check for the proper conditions, which led to malfunctions of the PMU interrupt. - Improve Kconfig dependencies for the LAN966x Outband Interrupt controller to avoid pointless pronmpts. * tag 'irq-urgent-2025-02-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/apple-aic: Only handle PMC interrupt as FIQ when configured so irqchip/irq-mvebu-icu: Fix access to msi_data from irq_domain::host_data irqchip/riscv: Ensure ordering of memory writes and IPI writes irqchip/lan966x-oic: Make CONFIG_LAN966X_OIC depend on CONFIG_MCHP_LAN966X_PCI dt-bindings: interrupt-controller: microchip,lan966x-oic: Clarify endpoint use
2025-02-03Merge tag 'xfs-fixes-6.14-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs bug fixes from Carlos Maiolino: "A few fixes for XFS, but the most notable one is: - xfs: remove xfs_buf_cache.bc_lock which has been hit by different persons including syzbot" * tag 'xfs-fixes-6.14-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove xfs_buf_cache.bc_lock xfs: Add error handling for xfs_reflink_cancel_cow_range xfs: Propagate errors from xfs_reflink_cancel_cow_range in xfs_dax_write_iomap_end xfs: don't call remap_verify_area with sb write protection held xfs: remove an out of data comment in _xfs_buf_alloc xfs: fix the entry condition of exact EOF block allocation optimization
2025-02-03Merge tag 'nvme-6.14-2025-01-31' of git://git.infradead.org/nvme into block-6.14Jens Axboe
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.14 - Connection fixes for fibre channel transport (Daniel) - Endian fixes (Keith, Christoph) - Cleanup fix for host memory buffer (Francis) - Platform specific power quirks (Georg) - Target memory leak (Sagi) - Use appropriate controller state accessor (Daniel)" * tag 'nvme-6.14-2025-01-31' of git://git.infradead.org/nvme: nvme-fc: use ctrl state getter nvme: make nvme_tls_attrs_group static nvmet: add a missing endianess conversion in nvmet_execute_admin_connect nvmet: the result field in nvmet_alloc_ctrl_args is little endian nvmet: fix a memory leak in controller identify nvme-fc: do not ignore connectivity loss during connecting nvme: handle connectivity loss in nvme_set_queue_count nvme-fc: go straight to connecting state when initializing nvme-pci: Add TUXEDO IBP Gen9 to Samsung sleep quirk nvme-pci: Add TUXEDO InfinityFlex to Samsung sleep quirk nvme-pci: remove redundant dma frees in hmb nvmet: fix rw control endian access
2025-02-03clocksource: Use migrate_disable() to avoid calling get_random_u32() in ↵Waiman Long
atomic context The following bug report happened with a PREEMPT_RT kernel: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 get_random_u32+0x4f/0x110 clocksource_verify_choose_cpus+0xab/0x1a0 clocksource_verify_percpu.part.0+0x6b/0x330 clocksource_watchdog_kthread+0x193/0x1a0 It is due to the fact that clocksource_verify_choose_cpus() is invoked with preemption disabled. This function invokes get_random_u32() to obtain random numbers for choosing CPUs. The batched_entropy_32 local lock and/or the base_crng.lock spinlock in driver/char/random.c will be acquired during the call. In PREEMPT_RT kernel, they are both sleeping locks and so cannot be acquired in atomic context. Fix this problem by using migrate_disable() to allow smp_processor_id() to be reliably used without introducing atomic context. preempt_disable() is then called after clocksource_verify_choose_cpus() but before the clocksource measurement is being run to avoid introducing unexpected latency. Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable") Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/all/20250131173323.891943-2-longman@redhat.com
2025-02-03drm/i915/backlight: Return immediately when scale() finds invalid parametersGuenter Roeck
The scale() functions detects invalid parameters, but continues its calculations anyway. This causes bad results if negative values are used for unsigned operations. Worst case, a division by 0 error will be seen if source_min == source_max. On top of that, after v6.13, the sequence of WARN_ON() followed by clamp() may result in a build error with gcc 13.x. drivers/gpu/drm/i915/display/intel_backlight.c: In function 'scale': include/linux/compiler_types.h:542:45: error: call to '__compiletime_assert_415' declared with attribute error: clamp() low limit source_min greater than high limit source_max This happens if the compiler decides to rearrange the code as follows. if (source_min > source_max) { WARN(..); /* Do the clamp() knowing that source_min > source_max */ source_val = clamp(source_val, source_min, source_max); } else { /* Do the clamp knowing that source_min <= source_max */ source_val = clamp(source_val, source_min, source_max); } Fix the problem by evaluating the return values from WARN_ON and returning immediately after a warning. While at it, fix divide by zero error seen if source_min == source_max. Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: David Laight <david.laight.linux@gmail.com> Cc: David Laight <david.laight.linux@gmail.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250121145203.2851237-1-linux@roeck-us.net Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 6f71507415841d1a6d38118e5fa0eaf0caab9c17) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-03drm/i915/dp: Return min bpc supported by source instead of 0Ankit Nautiyal
Currently, intel_dp_dsc_max_src_input_bpc can return 0 for platforms not supporting DSC, which could theoretically cause issues in clamp() due to a low limit being greater than the high limit. Instead, return the minimum bpc supported by the source to prevent such issues. Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYtNfM399_=_ff81zeRJv=0+z7oFJfPGmJgTp6yrJmU+1w@mail.gmail.com/ Fixes: 160672b86b0d ("drm/i915/dp: Use clamp for pipe_bpp limits with DSC") Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250131041342.3086716-1-ankit.k.nautiyal@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit a67221b5eb8d59fb7e1f0df3ef9945b6a0f32cca) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>