path: root/include
AgeCommit message (Collapse)Author
2025-10-17HID: intel-ish-hid: Add ishtp_get_connection_state() interfaceZhang Lixu
Add the ishtp_get_connection_state() function for struct ishtp_cl, allowing ishtp client drivers to retrieve the current connection state. Signed-off-by: Zhang Lixu <lixu.zhang@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-10-17HID: intel-ish-hid: Use dedicated unbound workqueues to prevent resume blockingZhang Lixu
During suspend/resume tests with S2IDLE, some ISH functional failures were observed because of a delay in executing the ISH resume handler. The resume handler uses schedule_work() to do the actual work. schedule_work() uses system_wq, which is a per-CPU workqueue. Although the queuing is not bound to a CPU, it prefers the local CPU of the caller unless prohibited. Users of this workqueue are not supposed to queue long-running work. In practice, however, there are scenarios where long-running work items are queued on other unbound workqueues, occupying the CPU. As a result, the ISH resume handler may not get a chance to execute in a timely manner. In one scenario, one of the ish_resume_handler() executions was delayed by nearly 1 second because another work item on an unbound workqueue occupied the same CPU. This delay causes ISH functionality failures. A similar issue was previously observed where the ISH HID driver timed out while getting the HID descriptor during S4 resume in the recovery kernel, likely caused by the same workqueue contention problem. Create dedicated unbound workqueues for all ISH operations to allow work items to execute on any available CPU, eliminating CPU-specific bottlenecks and improving resume reliability under varying system loads. ISH has three different components: a bus driver that implements the ISH protocols, a PCI interface layer, and the HID interface. Use one dedicated workqueue for all of them. Signed-off-by: Zhang Lixu <lixu.zhang@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-10-17cgroup/misc: fix misc_res_type kernel-doc warningRandy Dunlap
Format the kernel-doc for MISC_CG_RES_TDX correctly to avoid a kernel-doc warning: Warning: include/linux/misc_cgroup.h:26 Enum value 'MISC_CG_RES_TDX' not described in enum 'misc_res_type' Fixes: 7c035bea9407 ("KVM: TDX: Register TDX host key IDs to cgroup misc controller") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-10-17Merge tag 'mmc-v6.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull mmc cleanup from Ulf Hansson: "Move rpmb_frame struct and constants to rpmb common header This helps us to avoid sharing an immutable branch between our git trees. I was planning to send it before rc1, but I didn't make it" * tag 'mmc-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: rpmb: move rpmb_frame struct and constants to common header
2025-10-17Merge tag 'sound-6.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. All changes are rather boring device-specific fixes and quirks: - A few fixes for missing NULL checks - ASoC NAU8821 fixes for jack and irq handling - Various fixes for ASoC TAS2781, IDT821034, sc8280xp, max9809x, wcd938x, and SoundWire - Usual HD-audio and USB-audio quirks" * tag 'sound-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits) ALSA: hda/realtek: Fix mute led for HP Omen 17-cb0xxx ALSA: usb-audio: fix vendor quirk for Logitech H390 ALSA: usb-audio: add volume quirks for MS LifeChat LX-3000 ASoC: amd/sdw_utils: avoid NULL deref when devm_kasprintf() fails ASoC: max98090/91: fixed max98091 ALSA widget powering up/down ASoC: dt-bindings: Add compatible string fsl,imx-audio-tlv320 ASoC: codecs: wcd938x-sdw: remove redundant runtime pm calls ASoC: sdw_utils: add rt1321 part id to codec_info_list ALSA: usb-audio: Fix NULL pointer deference in try_to_register_card ALSA: firewire: amdtp-stream: fix enum kernel-doc warnings ALSA: usb-audio: add mixer_playback_min_mute quirk for Logitech H390 ASoC: nau8821: Avoid unnecessary blocking in IRQ handler ASoC: nau8821: Add DMI quirk to bypass jack debounce circuit ASoC: nau8821: Consistently clear interrupts before unmasking ASoC: nau8821: Generalize helper to clear IRQ status ASoC: nau8821: Cancel jdet_work before handling jack ejection ASoC: codecs: Fix gain setting ranges for Renesas IDT821034 codec ASoC: tas2781: Update ti,tas2781.yaml for adding tas58xx ASoC: tas2781: Support more newly-released amplifiers tas58xx in the driver ASoC: qcom: sc8280xp: Add support for QCS615 ...
2025-10-17Merge tag 'drm-fixes-2025-10-17' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "As per usual xe/amdgpu are the leaders, with some i915 and then a bunch of scattered fixes. There are a bunch of stability fixes for some older amdgpu cards. draw: - Avoid color truncation gpuvm: - Avoid kernel-doc warning sched: - Avoid double free i915: - Skip GuC communication warning if reset is in progress - Couple frontbuffer related fixes - Deactivate PSR only on LNL and when selective fetch enabled xe: - Increase global invalidation timeout to handle some workloads - Fix NPD while evicting BOs in an array of VM binds - Fix resizable BAR to account for possibly needing to move BARs other than the LMEMBAR - Fix error handling in xe_migrate_init() - Fix atomic fault handling with mixed mappings or if the page is already in VRAM - Enable media samplers power gating for platforms before Xe2 - Fix de-registering exec queue from GuC when unbinding - Ensure data migration to system if indicated by madvise with SVM - Fix kerneldoc for kunit change - Always account for cacheline alignment on migration - Drop bogus assertion on eviction amdgpu: - Backlight fix - SI fixes - CIK fix - Make CE support debug only - IP discovery fix - Ring reset fixes - GPUVM fault memory barrier fix - Drop unused structures in amdgpu_drm.h - JPEG debugfs fix - VRAM handling fixes for GPUs without VRAM - GC 12 MES fixes amdkfd: - MES fix ast: - Fix display output after reboot bridge: - lt9211: Fix version check panthor: - Fix MCU suspend qaic: - Init bootlog in correct order - Treat remaining == 0 as error in find_and_map_user_pages() - Lock access to DBC request queue rockchip: - vop2: Fix destination size in atomic check" * tag 'drm-fixes-2025-10-17' of https://gitlab.freedesktop.org/drm/kernel: (44 commits) drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies drm/xe/evict: drop bogus assert drm/xe/migrate: don't misalign current bytes drm/xe/kunit: Fix kerneldoc for parameterized tests drm/xe/svm: Ensure data will be migrated to 
system if indicated by madvise. drm/gpuvm: Fix kernel-doc warning for drm_gpuvm_map_req.map drm/i915/psr: Deactivate PSR only on LNL and when selective fetch enabled drm/ast: Blank with VGACR17 sync enable, always clear VGACRB6 sync off accel/qaic: Synchronize access to DBC request queue head & tail pointer accel/qaic: Treat remaining == 0 as error in find_and_map_user_pages() accel/qaic: Fix bootlog initialization ordering drm/rockchip: vop2: use correct destination rectangle height check drm/draw: fix color truncation in drm_draw_fill24 drm/xe/guc: Check GuC running state before deregistering exec queue drm/xe: Enable media sampler power gating drm/xe: Handle mixed mappings and existing VRAM on atomic faults drm/xe/migrate: Fix an error path drm/xe: Move rebar to be done earlier drm/xe: Don't allow evicting of BOs in same VM in array of VM binds drm/xe: Increase global invalidation timeout to 1000us ...
2025-10-17Merge tag 'zynqmp-soc-for-6.18' of https://github.com/Xilinx/linux-xlnx into ↵Arnd Bergmann
soc/drivers arm64: Xilinx SOC changes for 6.18 firmware: - Add debugfs interface - Wire versal-net compatible string - Change SOC family detection * tag 'zynqmp-soc-for-6.18' of https://github.com/Xilinx/linux-xlnx: drivers: firmware: xilinx: Switch to new family code in zynqmp_pm_get_family_info() drivers: firmware: xilinx: Add unique family code for all platforms firmware: xilinx: Add Versal NET platform compatible string firmware: xilinx: Add debugfs support for PM_GET_NODE_STATUS
2025-10-17media: v4l2-mem2mem: Don't copy frame flags in v4l2_m2m_buf_copy_metadata()Laurent Pinchart
The v4l2_m2m_buf_copy_metadata() function takes a boolean copy_frame_flags argument. When true, it causes the function to copy the V4L2_BUF_FLAG_KEYFRAME, V4L2_BUF_FLAG_BFRAME and V4L2_BUF_FLAG_PFRAME flags from the output buffer to the capture buffer. There are no use cases in any upstream driver for copying the flags. KEY/P/B frames are properties of the bitstream buffer in some formats. Once decoded, this is no longer a property of the video frame and should be discarded. It was considered useful to know if an uncompressed frame was decoded from a KEY/P/B compressed frame, and to preserve that information if that same uncompressed frame was passed through another M2M device (e.g. a scaler). However, the V4L2 documentation makes it clear that the flags are meant for compressed frames only. Drop the copy_frame_flags argument from v4l2_m2m_buf_copy_metadata(). The change to drivers was performed with the following Coccinelle semantic patch:
  @@
  expression src;
  expression dst;
  expression flag;
  @@
  - v4l2_m2m_buf_copy_metadata(src, dst, flag);
  + v4l2_m2m_buf_copy_metadata(src, dst);
include/media/v4l2-mem2mem.h and drivers/media/v4l2-core/v4l2-mem2mem.c have been updated manually. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
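The effect of dropping the frame flags can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel function: `buf_model` and `copy_metadata_model()` are hypothetical names, and the flag values are written as I understand them from the V4L2 UAPI header, so they should be checked against `videodev2.h` before being relied on. The model copies the timestamp and only the timecode and timestamp-source bits, and leaves every other capture-buffer flag alone.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values assumed to match the V4L2 UAPI header (verify before use). */
#define FLAG_KEYFRAME        0x00000008u
#define FLAG_PFRAME          0x00000010u
#define FLAG_BFRAME          0x00000020u
#define FLAG_TIMECODE        0x00000100u
#define FLAG_TSTAMP_SRC_MASK 0x00070000u

/* Hypothetical, heavily reduced stand-in for a V4L2 buffer. */
struct buf_model {
	uint64_t timestamp;
	uint32_t flags;
};

/*
 * Copy metadata from the output (source) buffer to the capture
 * (destination) buffer. KEY/P/B frame flags are never propagated:
 * only the bits in 'mask' are taken from the source.
 */
static void copy_metadata_model(const struct buf_model *out,
				struct buf_model *cap)
{
	const uint32_t mask = FLAG_TIMECODE | FLAG_TSTAMP_SRC_MASK;

	assert(out && cap);
	cap->timestamp = out->timestamp;
	cap->flags &= ~mask;              /* clear stale copies of these bits */
	cap->flags |= out->flags & mask;  /* take them from the source */
}
```

A caller that previously passed `true` as the third argument would, in this model, have to set any frame-type flag on the capture buffer itself if it still wanted one.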
2025-10-17media: v4l2-mem2mem: Document that v4l2_m2m_get_vq() never returns NULLLaurent Pinchart
The v4l2_m2m_get_vq() function never returns a NULL pointer, as the internal get_queue_ctx() helper always returns a non-NULL pointer. Many drivers check the return value against NULL, due to a combination of old code and cargo-cult programming. Even v4l2-mem2mem.c contains unneeded NULL checks. Clarify the API by documenting explicitly that a NULL check is not needed, and simplify the code by removing the unneeded NULL checks from v4l2-mem2mem.c. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Stefan Klug <stefan.klug@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
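Why such a helper cannot return NULL is easiest to see in a reduced model. The sketch below is an assumption about the general shape, not the kernel source: `m2m_ctx_model`, `queue_ctx_model`, `buf_type_model` and `get_queue_ctx_model()` are hypothetical names. The point it illustrates is that the helper returns the address of a member embedded in the context, so any valid context yields a valid pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the buffer type and the per-queue context. */
enum buf_type_model { BUF_TYPE_OUTPUT, BUF_TYPE_CAPTURE };

struct queue_ctx_model {
	int num_rdy;
};

struct m2m_ctx_model {
	struct queue_ctx_model out_q_ctx;  /* embedded, not a pointer */
	struct queue_ctx_model cap_q_ctx;  /* embedded, not a pointer */
};

/*
 * Select one of the two embedded queue contexts. The result is the
 * address of a struct member, so it is non-NULL for every valid ctx
 * and callers gain nothing from checking it against NULL.
 */
static struct queue_ctx_model *
get_queue_ctx_model(struct m2m_ctx_model *ctx, enum buf_type_model type)
{
	assert(ctx != NULL);
	if (type == BUF_TYPE_OUTPUT)
		return &ctx->out_q_ctx;
	return &ctx->cap_q_ctx;
}
```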
2025-10-17drm/xe/uapi: Hide the madvise autoreset behind a VM_BIND flagThomas Hellström
The madvise implementation currently resets the SVM madvise if the underlying CPU map is unmapped. This is in an attempt to mimic the CPU madvise behaviour. However, it's not clear that this is a desired behaviour since if the end app user relies on it for malloc()ed objects or stack objects, it may not work as intended. Instead of having the autoreset functionality being a direct application-facing implicit UAPI, make the UMD explicitly choose this behaviour if it wants to expose it by introducing DRM_XE_VM_BIND_FLAG_MADVISE_AUTORESET, and add a semantics description. v2: - Kerneldoc fixes. Fix a commit log message. Fixes: a2eb8aec3ebe ("drm/xe: Reset VMA attributes to default in SVM garbage collector") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: "Falkowski, John" <john.falkowski@intel.com> Cc: "Mrozek, Michal" <michal.mrozek@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20251015170726.178685-2-thomas.hellstrom@linux.intel.com
2025-10-17drm/gpusvm, drm/xe: Allow mixed mappings for userptrMatthew Brost
Compute kernels often issue memory copies immediately after completion. If the memory being copied is an SVM pointer that was faulted into the device and then bound via userptr, it is undesirable to move that memory. Worse, if userptr is mixed between system and device memory, the bind operation may be rejected. Xe already has the necessary plumbing to support userptr with mixed mappings. This update modifies GPUSVM's get_pages to correctly locate pages in such mixed mapping scenarios. v2: - Rebase (Thomas Hellström) v3: - Remove Fixes tag. v4: - Break out from series since the other patch was merged. - Update patch subject, ensure dri-devel and Maarten are CC'd. Cc: Maarten Lankhorst <maarten.lankhorst@intel.com> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20251015120320.176338-1-thomas.hellstrom@linux.intel.com Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-10-17crypto: drbg - Replace AES cipher calls with library callsHarsh Jain
Replace the AES cipher calls used in DRBG with AES library calls. Signed-off-by: Harsh Jain <h.jain@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-17crypto: drbg - Export CTR DRBG DF functionsHarsh Jain
Export the drbg_ctr_df() derivation function to the new df_sp80090 module. Signed-off-by: Harsh Jain <h.jain@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-17can: treewide: remove can_change_mtu()Vincent Mailhol
can_change_mtu() was made obsolete by commit 23049938605b ("can: populate the minimum and maximum MTU values"). Now that net_device->min_mtu and net_device->max_mtu are populated, all the checks are already done by dev_validate_mtu() in net/core/dev.c. Remove the net_device_ops->ndo_change_mtu() callback of all the physical interfaces, then remove can_change_mtu(). Keep only vcan_change_mtu() and vxcan_change_mtu(), because the virtual interfaces use their own different MTU logic. The only functional change this patch introduces is that the user can now change the MTU even if the interface is up. This does not matter for Classical CAN and CAN FD because their MTU range is composed of only one value, respectively CAN_MTU and CANFD_MTU. For the upcoming CAN XL, the MTU will be configurable within the CANXL_MIN_MTU to CANXL_MAX_MTU range at any time, even if the interface is up. This is consistent with the other net protocols and does not contradict ISO 11898-1:2024, as having a modifiable MTU is a kernel extension. Signed-off-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20251003-remove-can_change_mtu-v1-1-337f8bc21181@kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
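The argument that a generic range check is sufficient can be sketched in user space. Everything below is a simplified model rather than the code in net/core/dev.c: `netdev_model`, `validate_mtu_model()` and `change_mtu_model()` are hypothetical names, and the four MTU constants are the values I believe the CAN UAPI headers define, labelled here as assumptions. With `min_mtu == max_mtu`, as for Classical CAN, the only accepted value is the current one, so allowing the change while the interface is up has no visible effect.

```c
#include <assert.h>
#include <errno.h>

/* Assumed values of CAN_MTU, CANFD_MTU, CANXL_MIN_MTU, CANXL_MAX_MTU. */
#define CAN_MTU_MODEL        16
#define CANFD_MTU_MODEL      72
#define CANXL_MIN_MTU_MODEL  76
#define CANXL_MAX_MTU_MODEL  2060

/* Hypothetical, reduced stand-in for struct net_device. */
struct netdev_model {
	int mtu;
	int min_mtu;
	int max_mtu;
};

/* Generic range check: no CAN-specific callback is needed. */
static int validate_mtu_model(const struct netdev_model *dev, int new_mtu)
{
	assert(dev->min_mtu <= dev->max_mtu);
	if (new_mtu < dev->min_mtu || new_mtu > dev->max_mtu)
		return -EINVAL;
	return 0;
}

/* Apply the new MTU only if it passes the range check. */
static int change_mtu_model(struct netdev_model *dev, int new_mtu)
{
	int err = validate_mtu_model(dev, new_mtu);

	if (err)
		return err;
	dev->mtu = new_mtu;
	return 0;
}
```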
2025-10-16net: dev_queue_xmit() llist adoptionEric Dumazet
Remove busylock spinlock and use a lockless list (llist) to reduce spinlock contention to the minimum. Idea is that only one cpu might spin on the qdisc spinlock, while others simply add their skb in the llist. After this patch, we get a 300 % improvement on heavy TX workloads.
- Sending twice the number of packets per second.
- While consuming 50 % less cycles.
Note that this also allows in the future to submit batches to various qdisc->enqueue() methods.
Tested:
- Dual Intel(R) Xeon(R) 6985P-C (480 hyper threads).
- 100Gbit NIC, 30 TX queues with FQ packet scheduler.
- echo 64 >/sys/kernel/slab/skbuff_small_head/cpu_partial (avoid contention in mm)
- 240 concurrent "netperf -t UDP_STREAM -- -m 120 -n"

Before: 16 Mpps (41 Mpps if each thread is pinned to a different cpu)

vmstat 2 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
  r  b swpd       free  buff   cache si so  bi bo     in     cs us sy id wa st
243  0    0 2368988672 51036 1100852  0  0 146  1    242     60  0  9 91  0  0
244  0    0 2368988672 51036 1100852  0  0 536 10 487745  14718  0 52 48  0  0
244  0    0 2368988672 51036 1100852  0  0 512  0 503067  46033  0 52 48  0  0
244  0    0 2368988672 51036 1100852  0  0 512  0 494807  12107  0 52 48  0  0
244  0    0 2368988672 51036 1100852  0  0 702 26 492845  10110  0 52 48  0  0

Lock contention (1 second sample taken on 8 cores)
perf lock record -C0-7 sleep 1; perf lock contention
contended  total wait   max wait  avg wait  type      caller
   442111      6.79 s  162.47 ms  15.35 us  spinlock  dev_hard_start_xmit+0xcd
     5961     9.57 ms    8.12 us   1.60 us  spinlock  __dev_queue_xmit+0x3a0
      244   560.63 us    7.63 us   2.30 us  spinlock  do_softirq+0x5b
       13    25.09 us    3.21 us   1.93 us  spinlock  net_tx_action+0xf8

If netperf threads are pinned, spinlock stress is very high.
perf lock record -C0-7 sleep 1; perf lock contention
contended  total wait   max wait  avg wait  type      caller
   964508      7.10 s  147.25 ms   7.36 us  spinlock  dev_hard_start_xmit+0xcd
      201   268.05 us    4.65 us   1.33 us  spinlock  __dev_queue_xmit+0x3a0
       12    26.05 us    3.84 us   2.17 us  spinlock  do_softirq+0x5b

@__dev_queue_xmit_ns:
[256, 512)        21 | |
[512, 1K)        631 | |
[1K, 2K)       27328 |@ |
[2K, 4K)      265392 |@@@@@@@@@@@@@@@@ |
[4K, 8K)      417543 |@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8K, 16K)     826292 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16K, 32K)    733822 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[32K, 64K)     19055 |@ |
[64K, 128K)    17240 |@ |
[128K, 256K)   25633 |@ |
[256K, 512K)       4 | |

After: 29 Mpps (57 Mpps if each thread is pinned to a different cpu)

vmstat 2 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
  r  b swpd       free  buff   cache si so  bi bo     in     cs us sy id wa st
 78  0    0 2369573632 32896 1350988  0  0  22  0    331    254  0  8 92  0  0
 75  0    0 2369573632 32896 1350988  0  0  22 50 425713 280199  0 23 76  0  0
104  0    0 2369573632 32896 1350988  0  0 290  0 430238 298247  0 23 76  0  0
 86  0    0 2369573632 32896 1350988  0  0 132  0 428019 291865  0 24 76  0  0
 90  0    0 2369573632 32896 1350988  0  0 502  0 422498 278672  0 23 76  0  0

perf lock record -C0-7 sleep 1; perf lock contention
contended  total wait   max wait  avg wait  type      caller
     2524   116.15 ms  486.61 us  46.02 us  spinlock  __dev_queue_xmit+0x55b
     5821   107.18 ms  371.67 us  18.41 us  spinlock  dev_hard_start_xmit+0xcd
     2377     9.73 ms   35.86 us   4.09 us  spinlock  ___slab_alloc+0x4e0
      923     5.74 ms   20.91 us   6.22 us  spinlock  ___slab_alloc+0x5c9
      121     3.42 ms  193.05 us  28.24 us  spinlock  net_tx_action+0xf8
        6   564.33 us  167.60 us  94.05 us  spinlock  do_softirq+0x5b

If netperf threads are pinned (~54 Mpps)
perf lock record -C0-7 sleep 1; perf lock contention
    32907   316.98 ms  195.98 us   9.63 us  spinlock  dev_hard_start_xmit+0xcd
     4507    61.83 ms  212.73 us  13.72 us  spinlock  __dev_queue_xmit+0x554
     2781    23.53 ms   40.03 us   8.46 us  spinlock  ___slab_alloc+0x5c9
     3554    18.94 ms   34.69 us   5.33 us  spinlock  ___slab_alloc+0x4e0
      233     9.09 ms  215.70 us  38.99 us  spinlock  do_softirq+0x5b
      153   930.66 us   48.67 us   6.08 us  spinlock  net_tx_action+0xfd
       84   331.10 us   14.22 us   3.94 us  spinlock  ___slab_alloc+0x5c9
      140   323.71 us    9.94 us   2.31 us  spinlock  ___slab_alloc+0x4e0

@__dev_queue_xmit_ns:
[128, 256)   1539830 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[256, 512)   2299558 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)     483936 |@@@@@@@@@@ |
[1K, 2K)      265345 |@@@@@@ |
[2K, 4K)      145463 |@@@ |
[4K, 8K)       54571 |@ |
[8K, 16K)      10270 | |
[16K, 32K)      9385 | |
[32K, 64K)      7749 | |
[64K, 128K)    26799 | |
[128K, 256K)    2665 | |
[256K, 512K)     665 | |

Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251014171907.3554413-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
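The lockless-list idea described in this commit can be sketched in user space with C11 atomics. This is a model of the general technique, not the kernel's llist implementation or the patched transmit path: `lnode`, `lhead`, `lpush()`, `ltake_all()` and `lreverse()` are hypothetical names. Producers push with a compare-and-swap loop and never take a lock; the push reports whether the list was empty, which is one plausible way for a single CPU to learn that it should go on to take the qdisc spinlock and drain the whole batch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node and list head; 'value' stands in for a queued skb. */
struct lnode {
	struct lnode *next;
	int value;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/* Lock-free push. Returns true if the list was empty before the push. */
static bool lpush(struct lhead *h, struct lnode *n)
{
	struct lnode *old;

	assert(n != NULL);
	old = atomic_load_explicit(&h->first, memory_order_relaxed);
	do {
		n->next = old;  /* 'old' is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak_explicit(&h->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
	return old == NULL;
}

/* Detach the entire list with a single atomic exchange. */
static struct lnode *ltake_all(struct lhead *h)
{
	return atomic_exchange_explicit(&h->first, (struct lnode *)NULL,
					memory_order_acquire);
}

/* The detached chain is newest-first; reverse it to submission order. */
static struct lnode *lreverse(struct lnode *n)
{
	struct lnode *prev = NULL;

	while (n) {
		struct lnode *next = n->next;

		n->next = prev;
		prev = n;
		n = next;
	}
	return prev;
}
```

In this model only the caller that saw `lpush()` return true would contend for the downstream lock, then call `ltake_all()` and `lreverse()` to process everything queued so far in one pass.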
2025-10-16net: sched: claim one cache line in QdiscEric Dumazet
Replace state2 field with a boolean. Move it to a hole between qstats and state so that we shrink Qdisc by a full cache line. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251014171907.3554413-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-16Revert "net/sched: Fix mirred deadlock on device recursion"Eric Dumazet
This reverts commits 0f022d32c3eca477fbf79a205243a6123ed0fe11 and 44180feaccf266d9b0b28cc4ceaac019817deb5c. A prior patch in this series implemented loop detection in act_mirred, so we can remove q->owner to save some cycles in the fast path. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251014171907.3554413-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-16net/sched: act_mirred: add loop detectionEric Dumazet
Commit 0f022d32c3ec ("net/sched: Fix mirred deadlock on device recursion") added code in the fast path, even when act_mirred is not used. Prepare its revert by implementing loop detection in act_mirred. Add an array of device pointers in struct netdev_xmit. tcf_mirred_is_act_redirect() can detect if the array already contains the target device. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251014171907.3554413-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
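Loop detection with an array of device pointers can be modelled as follows. This is a hedged sketch of the technique only: `xmit_state_model`, `dev_model`, `redirect_enter()`, `redirect_exit()` and the depth limit `NEST_LIMIT_MODEL` are all hypothetical and are not taken from the patch. Each redirect records its target device; a redirect is refused if the target is already recorded in the current chain or if the chain is too deep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEST_LIMIT_MODEL 4  /* assumed depth limit for this sketch */

/* Hypothetical stand-in for a network device. */
struct dev_model {
	const char *name;
};

/* Hypothetical per-context transmit state holding the redirect chain. */
struct xmit_state_model {
	const struct dev_model *devs[NEST_LIMIT_MODEL];
	unsigned int depth;
};

/* Linear scan: the chain is short, so this stays cheap. */
static bool seen_before(const struct xmit_state_model *st,
			const struct dev_model *dev)
{
	for (unsigned int i = 0; i < st->depth; i++)
		if (st->devs[i] == dev)
			return true;
	return false;
}

/* Returns 0 if the redirect may proceed, -1 on a loop or excess depth. */
static int redirect_enter(struct xmit_state_model *st,
			  const struct dev_model *dev)
{
	if (st->depth >= NEST_LIMIT_MODEL || seen_before(st, dev))
		return -1;
	st->devs[st->depth++] = dev;
	return 0;
}

/* Pop the most recent redirect once its transmit path has returned. */
static void redirect_exit(struct xmit_state_model *st)
{
	assert(st->depth > 0);
	st->depth--;
}
```

Because the bookkeeping lives entirely in the redirect path, packets that never pass through a redirect pay nothing for it, which is the motivation the commit message gives for moving the check out of the fast path.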
2025-10-17Merge tag 'drm-misc-fixes-2025-10-16' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: ast: - Fix display output after reboot bridge: - lt9211: Fix version check core: - draw: Avoid color truncation - gpuvm: Avoid kernel-doc warning - sched: Avoid double free panthor: - Fix MCU suspend qaic: - Init bootlog in correct order - Treat remaining == 0 as error in find_and_map_user_pages() - Lock access to DBC request queue rockchip: - vop2: Fix destination size in atomic check Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20251016141607.GA73919@linux.fritz.box
2025-10-16PCI/MSI: Delete pci_msi_create_irq_domain()Nam Cao
pci_msi_create_irq_domain() is now unused. Delete it. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2025-10-16bpf: Introduce SK_BPF_BYPASS_PROT_MEM.Kuniyuki Iwashima
If a socket has sk->sk_bypass_prot_mem flagged, the socket opts out of the global protocol memory accounting. This is easily controlled by net.core.bypass_prot_mem sysctl, but it lacks flexibility. Let's support flagging (and clearing) sk->sk_bypass_prot_mem via bpf_setsockopt() at the BPF_CGROUP_INET_SOCK_CREATE hook.
  int val = 1;
  bpf_setsockopt(ctx, SOL_SOCKET, SK_BPF_BYPASS_PROT_MEM, &val, sizeof(val));
As with net.core.bypass_prot_mem, this is inherited by child sockets, and BPF always takes precedence over sysctl at socket(2) and accept(2). SK_BPF_BYPASS_PROT_MEM is only supported at BPF_CGROUP_INET_SOCK_CREATE and not supported on other hooks for the following reasons:
1. UDP charges memory under sk->sk_receive_queue.lock instead of lock_sock()
2. Modifying the flag after skb is charged to sk requires such adjustment during bpf_setsockopt() and complicates the logic unnecessarily
We can support other hooks later if a real use case justifies that. Most changes are inline and hard to trace, but a microbenchmark on __sk_mem_raise_allocated() during neper/tcp_stream showed that more samples completed faster with sk->sk_bypass_prot_mem == 1. This will be more visible under tcp_mem pressure (but it's not a fair comparison).
  # bpftrace -e 'kprobe:__sk_mem_raise_allocated { @start[tid] = nsecs; }
    kretprobe:__sk_mem_raise_allocated /@start[tid]/ { @end[tid] = nsecs - @start[tid]; @times = hist(@end[tid]); delete(@start[tid]); }'
  # tcp_stream -6 -F 1000 -N -T 256
Without bpf prog:
[128, 256)      3846 | |
[256, 512)   1505326 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)    1371006 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[1K, 2K)      198207 |@@@@@@ |
[2K, 4K)       31199 |@ |
With bpf prog in the next patch: (must be attached before tcp_stream)
  # bpftool prog load sk_bypass_prot_mem.bpf.o /sys/fs/bpf/test type cgroup/sock_create
  # bpftool cgroup attach /sys/fs/cgroup/test cgroup_inet_sock_create pinned /sys/fs/bpf/test
[128, 256)      6413 | |
[256, 512)   1868425 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)    1101697 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[1K, 2K)      117031 |@@@@ |
[2K, 4K)       11773 | |
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://patch.msgid.link/20251014235604.3057003-6-kuniyu@google.com
2025-10-16net: Introduce net.core.bypass_prot_mem sysctl.Kuniyuki Iwashima
If a socket has sk->sk_bypass_prot_mem flagged, the socket opts out of the global protocol memory accounting. Let's control the flag by a new sysctl knob. The flag is written once during socket(2) and is inherited to child sockets. Tested with a script that creates local socket pairs and send()s a bunch of data without recv()ing.
Setup:
  # mkdir /sys/fs/cgroup/test
  # echo $$ >> /sys/fs/cgroup/test/cgroup.procs
  # sysctl -q net.ipv4.tcp_mem="1000 1000 1000"
  # ulimit -n 524288
Without net.core.bypass_prot_mem, charged to tcp_mem & memcg
  # python3 pressure.py &
  # cat /sys/fs/cgroup/test/memory.stat | grep sock
  sock 22642688 <-------------------------------------- charged to memcg
  # cat /proc/net/sockstat| grep TCP
  TCP: inuse 2006 orphan 0 tw 0 alloc 2008 mem 5376 <-- charged to tcp_mem
  # ss -tn | head -n 5
  State  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
  ESTAB  2000    0       127.0.0.1:34479     127.0.0.1:53188
  ESTAB  2000    0       127.0.0.1:34479     127.0.0.1:49972
  ESTAB  2000    0       127.0.0.1:34479     127.0.0.1:53868
  ESTAB  2000    0       127.0.0.1:34479     127.0.0.1:53554
  # nstat | grep Pressure || echo no pressure
  TcpExtTCPMemoryPressures 1 0.0
With net.core.bypass_prot_mem=1, charged to memcg only:
  # sysctl -q net.core.bypass_prot_mem=1
  # python3 pressure.py &
  # cat /sys/fs/cgroup/test/memory.stat | grep sock
  sock 2757468160 <------------------------------------ charged to memcg
  # cat /proc/net/sockstat | grep TCP
  TCP: inuse 2006 orphan 0 tw 0 alloc 2008 mem 0 <- NOT charged to tcp_mem
  # ss -tn | head -n 5
  State  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
  ESTAB  111000  0       127.0.0.1:36019     127.0.0.1:49026
  ESTAB  110000  0       127.0.0.1:36019     127.0.0.1:45630
  ESTAB  110000  0       127.0.0.1:36019     127.0.0.1:44870
  ESTAB  111000  0       127.0.0.1:36019     127.0.0.1:45274
  # nstat | grep Pressure || echo no pressure
  no pressure
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://patch.msgid.link/20251014235604.3057003-4-kuniyu@google.com
2025-10-16net: Allow opt-out from global protocol memory accounting.Kuniyuki Iwashima
Some protocols (e.g., TCP, UDP) implement memory accounting for socket buffers and charge memory to per-protocol global counters pointed to by sk->sk_proto->memory_allocated. Sometimes, system processes do not want that limitation. For a similar purpose, there is SO_RESERVE_MEM for sockets under memcg. Also, by opting out of the per-protocol accounting, sockets under memcg can avoid paying costs for two orthogonal memory accounting mechanisms. A microbenchmark result is in the subsequent bpf patch. Let's allow opt-out from the per-protocol memory accounting if sk->sk_bypass_prot_mem is true. sk->sk_bypass_prot_mem and sk->sk_prot are placed in the same cache line, and sk_has_account() always fetches sk->sk_prot before accessing sk->sk_bypass_prot_mem, so there is no extra cache miss for this patch. The following patches will set sk->sk_bypass_prot_mem to true, and then the per-protocol memory accounting will be skipped. Note that this does NOT disable memcg, but rather the per-protocol one. Another option that avoids using the hole in struct sock_common is to create sk_prot variants like tcp_prot_bypass, but this would complicate SOCKMAP logic, tcp_bpf_prots etc. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://patch.msgid.link/20251014235604.3057003-3-kuniyu@google.com
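The opt-out semantics can be illustrated with a reduced model. This is a sketch under stated assumptions and does not reproduce where the kernel actually places the check: `proto_model`, `sock_model` and `raise_allocated_model()` are hypothetical names, and the separate `memcg_charged` field exists only to show that the memcg charge is independent of the per-protocol one. A socket with the bypass flag set still charges memcg but leaves the shared per-protocol counter untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical protocol: points at a counter shared by all its sockets. */
struct proto_model {
	long *memory_allocated;  /* NULL if the protocol does no accounting */
};

/* Hypothetical socket with the opt-out flag. */
struct sock_model {
	const struct proto_model *prot;
	bool bypass_prot_mem;
	long memcg_charged;      /* stands in for the memcg charge */
};

/*
 * Charge 'pages' for this socket. The memcg side is always charged;
 * the global per-protocol counter is skipped when the socket opted out.
 */
static void raise_allocated_model(struct sock_model *sk, long pages)
{
	assert(pages >= 0);
	sk->memcg_charged += pages;

	if (sk->prot->memory_allocated == NULL)
		return;  /* protocol has no global accounting at all */
	if (!sk->bypass_prot_mem)
		*sk->prot->memory_allocated += pages;
}
```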
2025-10-16sched_ext: Merge branch 'sched/core' of ↵Tejun Heo
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-6.19 Pull in tip/sched/core to receive: 50653216e4ff ("sched: Add support to pick functions to take rf") 4c95380701f5 ("sched/ext: Fold balance_scx() into pick_task_scx()") which will enable clean integration of DL server support among other things. This conflicts with the following from sched_ext/for-6.18-fixes: a8ad873113d3 ("sched_ext: defer queue_balance_callback() until after ops.dispatch") which adds maybe_queue_balance_callback() to balance_scx() which is removed by 50653216e4ff. Resolve by moving the invocation to pick_task_scx() in the equivalent location. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-10-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.18-rc2). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-16Merge tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from Paolo Abeni: "Including fixes from CAN Current release - regressions: - udp: do not use skb_release_head_state() before skb_attempt_defer_free() - gro_cells: use nested-BH locking for gro_cell - dpll: zl3073x: increase maximum size of flash utility Previous releases - regressions: - core: fix lockdep splat on device unregister - tcp: fix tcp_tso_should_defer() vs large RTT - tls: - don't rely on tx_work during send() - wait for pending async decryptions if tls_strp_msg_hold fails - can: j1939: add missing calls in NETDEV_UNREGISTER notification handler - eth: lan78xx: fix lost EEPROM write timeout in lan78xx_write_raw_eeprom Previous releases - always broken: - ip6_tunnel: prevent perpetual tunnel growth - dpll: zl3073x: handle missing or corrupted flash configuration - can: m_can: fix pm_runtime and CAN state handling - eth: - ixgbe: fix too early devlink_free() in ixgbe_remove() - ixgbevf: fix mailbox API compatibility - gve: Check valid ts bit on RX descriptor before hw timestamping - idpf: cleanup remaining SKBs in PTP flows - r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H" * tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) udp: do not use skb_release_head_state() before skb_attempt_defer_free() net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset netdevsim: set the carrier when the device goes up selftests: tls: add test for short splice due to full skmsg selftests: net: tls: add tests for cmsg vs MSG_MORE tls: don't rely on tx_work during send() tls: wait for pending async decryptions if tls_strp_msg_hold fails tls: always set record_type in tls_process_cmsg tls: wait for async encrypt in case of error during latter iterations of sendmsg tls: trim encrypted message to match the plaintext on short splice tg3: prevent use of uninitialized remote_adv and local_adv variables MAINTAINERS: new entry for IPv6 IOAM gve: Check valid ts bit on RX descriptor before hw timestamping net: core: fix lockdep splat on device unregister MAINTAINERS: add myself as maintainer for b53 selftests: net: check jq command is supported net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit() tcp: fix tcp_tso_should_defer() vs large RTT r8152: add error handling in rtl8152_driver_init usbnet: Fix using smp_processor_id() in preemptible code warnings ...
2025-10-16accel/amdxdna: Support getting last hardware errorLizhi Hou
Add new parameter DRM_AMDXDNA_HW_LAST_ASYNC_ERR to the get array IOCTL. When hardware reports an error, the driver saves the error information and timestamp. This new get array parameter retrieves the last error. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251014234119.628453-1-lizhi.hou@amd.com
2025-10-16irqchip: Pass platform device to platform driversJohan Hovold
The IRQCHIP_PLATFORM_DRIVER macros can be used to convert OF irqchip drivers to platform drivers but currently reuse the OF init callback prototype that only takes OF nodes as arguments. This forces drivers to do reverse lookups of their struct devices during probe if they need them for things like dev_printk() and device managed resources. Half of the drivers doing reverse lookups also currently fail to release the additional reference taken during the lookup, while other drivers have had the reference leak plugged in various ways (e.g. using non-intuitive cleanup constructs which still confuse static checkers). Switch to using a probe callback that takes a platform device as its first argument to simplify drivers and plug the remaining (mostly benign) reference leaks. Fixes: 32c6c054661a ("irqchip: Add Broadcom BCM2712 MSI-X interrupt controller") Fixes: 70afdab904d2 ("irqchip: Add IMX MU MSI controller driver") Fixes: a6199bb514d8 ("irqchip: Add Qualcomm MPM controller driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Changhuang Liang <changhuang.liang@starfivetech.com>
2025-10-16gpiolib: of: Get rid of <linux/gpio/legacy-of-mm-gpiochip.h>Christophe Leroy
Last user of linux/gpio/legacy-of-mm-gpiochip.h is gone. Remove linux/gpio/legacy-of-mm-gpiochip.h and CONFIG_OF_GPIO_MM_GPIOCHIP Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-10-16gpio: regmap: add the .fixed_direction_output configuration parameterIoana Ciornei
There are GPIO controllers such as the one present in the LX2160ARDB QIXIS FPGA which have fixed-direction input and output GPIO lines mixed together in a single register. This cannot be modeled using the gpio-regmap as-is since there is no way to present the true direction of a GPIO line. In order to make this use case possible, add a new configuration parameter - fixed_direction_output - into the gpio_regmap_config structure. This will enable user drivers to provide a bitmap that represents the fixed direction of the GPIO lines. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Michael Walle <mwalle@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
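To illustrate the idea in the commit above, here is a minimal user-space sketch of consulting a fixed-direction bitmap when reporting a line's direction. Only the field name fixed_direction_output comes from the commit message. The struct, enum and function names (demo_gpio_config, demo_get_direction, and so on) are hypothetical, and the meaning assumed here (set bit = fixed output, clear bit = fixed input) is an assumption rather than the documented gpio-regmap behaviour.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

enum demo_direction { DEMO_DIR_OUT = 0, DEMO_DIR_IN = 1 };

/* Hypothetical stand-in for a driver-supplied configuration. */
struct demo_gpio_config {
	unsigned int ngpio;
	/* One bit per line; may be NULL when no fixed directions exist. */
	const unsigned long *fixed_direction_output;
};

static bool demo_test_bit(const unsigned long *map, unsigned int nr)
{
	return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/*
 * Report the direction of one line. With a bitmap present, the bitmap is
 * authoritative (assumption: set bit means output). Without one, a real
 * driver would read a direction register; this sketch simply returns input.
 */
static enum demo_direction
demo_get_direction(const struct demo_gpio_config *cfg, unsigned int offset)
{
	assert(offset < cfg->ngpio);

	if (cfg->fixed_direction_output)
		return demo_test_bit(cfg->fixed_direction_output, offset) ?
			DEMO_DIR_OUT : DEMO_DIR_IN;

	return DEMO_DIR_IN;
}
```

With a bitmap of 0x5, lines 0 and 2 would report as outputs and every other line as an input, even though all of them share one register.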
2025-10-16dt-bindings: power: Add power domain IDs for Tegra264Thierry Reding
Add the set of power domain IDs available on the Tegra264 SoC so that they can be used in device tree files. Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-10-16sched: Add support to pick functions to take rfJoel Fernandes
Some pick functions like the internal pick_next_task_fair() already take rf but some others don't. We need this for scx's server pick function. Prepare for this by having pick functions accept it. [peterz: - added RETRY_TASK handling - removed pick_next_task_fair indirection] Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org>
2025-10-16sched: Rename do_set_cpus_allowed()Peter Zijlstra
Hopefully saner naming. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
2025-10-16sched: Employ sched_change guardsPeter Zijlstra
As proposed a long while ago -- and half done by scx -- wrap the scheduler's 'change' pattern in a guard helper. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
2025-10-15net: remove obsolete WARN_ON(refcount_read(&sk->sk_refcnt) == 1)Eric Dumazet
sk->sk_refcnt has been converted to refcount_t in 2017. __sock_put(sk) being refcount_dec(&sk->sk_refcnt), it will complain loudly if the current refcnt is 1 (or less) in a non racy way. We can remove four WARN_ON() in favor of the generic refcount_dec() check. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Xuanqiang Luo<luoxuanqiang@kylinos.cn> Link: https://patch.msgid.link/20251014140605.2982703-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15drm/bridge: dw-hdmi-qp: Fixup timer base setupCristian Ciocaltea
Currently the TIMER_BASE_CONFIG0 register gets initialized to a fixed value as initially found in vendor driver code supporting the RK3588 SoC. As a matter of fact the value matches the rate of the HDMI TX reference clock, which is roughly 428.57 MHz. However, on RK3576 SoC that rate is slightly lower, i.e. 396.00 MHz, and the incorrect register configuration breaks CEC functionality. Set the timer base according to the actual reference clock rate that shall be provided by the platform driver. Otherwise fallback to the vendor default. While at it, also drop the unnecessary empty lines in dw_hdmi_qp_init_hw(). Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250903-rk3588-hdmi-cec-v4-2-fa25163c4b08@collabora.com
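The fallback logic described above can be sketched as follows. This is an illustration only: it assumes the register simply holds the reference clock rate in Hz, the function name demo_timer_base is hypothetical, and the default constant is an approximation of the "roughly 428.57 MHz" figure quoted in the commit message, not the exact value the driver programs.

```c
#include <assert.h>
#include <stdint.h>

/* Approximation of the RK3588 vendor default (roughly 428.57 MHz). */
#define DEMO_DEFAULT_REF_CLK_HZ 428570000ULL

/*
 * Pick the timer base value: use the platform-provided reference clock
 * rate when there is one, otherwise fall back to the vendor default.
 * A rate of 0 is taken to mean "not provided" (assumption).
 */
static uint64_t demo_timer_base(uint64_t ref_clk_rate_hz)
{
	return ref_clk_rate_hz ? ref_clk_rate_hz : DEMO_DEFAULT_REF_CLK_HZ;
}
```

On an RK3576-like platform the caller would pass 396000000, so the timer base follows the real clock instead of the RK3588 figure.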
2025-10-15drm/bridge: dw-hdmi-qp: Add CEC supportCristian Ciocaltea
Add support for the CEC interface of the Synopsys DesignWare HDMI QP TX controller. This is based on the downstream implementation, but rewritten on top of the CEC helpers added recently to the DRM HDMI connector framework. Also note struct dw_hdmi_qp_plat_data has been extended to include the CEC IRQ number to be provided by the platform driver. Co-developed-by: Algea Cao <algea.cao@rock-chips.com> Signed-off-by: Algea Cao <algea.cao@rock-chips.com> Co-developed-by: Derek Foreman <derek.foreman@collabora.com> Signed-off-by: Derek Foreman <derek.foreman@collabora.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250903-rk3588-hdmi-cec-v4-1-fa25163c4b08@collabora.com
2025-10-15hung_task: fix warnings caused by unaligned lock pointersLance Yang
The blocker tracking mechanism assumes that lock pointers are at least 4-byte aligned to use their lower bits for type encoding. However, as reported by Eero Tamminen, some architectures like m68k only guarantee 2-byte alignment of 32-bit values. This breaks the assumption and causes two related WARN_ON_ONCE checks to trigger. To fix this, the runtime checks are adjusted to silently ignore any lock that is not 4-byte aligned, effectively disabling the feature in such cases and avoiding the related warnings. Thanks to Geert Uytterhoeven for bisecting! Link: https://lkml.kernel.org/r/20250909145243.17119-1-lance.yang@linux.dev Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") Signed-off-by: Lance Yang <lance.yang@linux.dev> Reported-by: Eero Tamminen <oak@helsinkinet.fi> Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Anna Schumaker <anna.schumaker@oracle.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Finn Thain <fthain@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: John Stultz <jstultz@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Lance Yang <lance.yang@linux.dev> Cc: Mingzhe Yang <mingzhe.yang@ly.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tomasz Figa <tfiga@chromium.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Yongliang Gao <leonylgao@tencent.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
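The encoding trick and the alignment check described above can be sketched in user space as follows. All names (demo_encode, demo_decode_ptr, demo_decode_type) are hypothetical, and returning 0 for a misaligned pointer is this sketch's way of modelling "silently ignore the lock"; the kernel's actual helpers may differ.

```c
#include <assert.h>
#include <stdint.h>

/* The two low bits of a 4-byte-aligned pointer are always zero. */
#define DEMO_TYPE_MASK ((uintptr_t)0x3)

/*
 * Pack a lock pointer and a 2-bit type tag into one word.
 * Returns 0 (meaning "not tracked") when the pointer is not 4-byte
 * aligned, instead of warning, since the low bits are then not free.
 */
static uintptr_t demo_encode(const void *lock, unsigned int type)
{
	uintptr_t addr = (uintptr_t)lock;

	if (addr & DEMO_TYPE_MASK)
		return 0;

	return addr | ((uintptr_t)type & DEMO_TYPE_MASK);
}

/* Recover the original pointer by clearing the tag bits. */
static void *demo_decode_ptr(uintptr_t value)
{
	return (void *)(value & ~DEMO_TYPE_MASK);
}

/* Recover the 2-bit type tag. */
static unsigned int demo_decode_type(uintptr_t value)
{
	return (unsigned int)(value & DEMO_TYPE_MASK);
}
```

On an architecture that only guarantees 2-byte alignment, some lock addresses have bit 1 set, and those take the early-return path rather than corrupting the tag.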
2025-10-15sched_ext: Add lockless peek operation for DSQsRyan Newton
The builtin DSQ queue data structures are meant to be used by a wide range of different sched_ext schedulers with different demands on these data structures. They might be per-cpu with low-contention, or high-contention shared queues. Unfortunately, DSQs have a coarse-grained lock around the whole data structure. Without going all the way to a lock-free, more scalable implementation, a small step we can take to reduce lock contention is to allow a lockless, small-fixed-cost peek at the head of the queue. This change allows certain custom SCX schedulers to cheaply peek at queues, e.g. during load balancing, before locking them. But it represents a few extra memory operations to update the pointer each time the DSQ is modified, including a memory barrier on ARM so the write appears correctly ordered. This commit adds a first_task pointer field which is updated atomically when the DSQ is modified, and allows any thread to peek at the head of the queue without holding the lock. Signed-off-by: Ryan Newton <newton@meta.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
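A rough user-space model of the first_task idea, using C11 atomics, might look like the sketch below. Only the first_task field name comes from the commit message; the demo_ types and functions are hypothetical. The lock itself is not modelled, so enqueue and dequeue are assumed to run with it held. The sketch also ignores object lifetime: a real implementation has to guarantee that a peeked task is still valid before dereferencing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_task {
	int pid;
	struct demo_task *next;
};

struct demo_dsq {
	struct demo_task *head;  /* protected by the DSQ lock (not modelled) */
	struct demo_task *tail;  /* protected by the DSQ lock (not modelled) */
	_Atomic(struct demo_task *) first_task;  /* lockless mirror of head */
};

/* Publish the current head; release ordering pairs with the peek's acquire. */
static void demo_dsq_publish_head(struct demo_dsq *dsq)
{
	atomic_store_explicit(&dsq->first_task, dsq->head, memory_order_release);
}

/* Caller is assumed to hold the DSQ lock. */
static void demo_dsq_enqueue(struct demo_dsq *dsq, struct demo_task *t)
{
	t->next = NULL;
	if (dsq->tail)
		dsq->tail->next = t;
	else
		dsq->head = t;
	dsq->tail = t;
	demo_dsq_publish_head(dsq);
}

/* Caller is assumed to hold the DSQ lock. */
static struct demo_task *demo_dsq_dequeue(struct demo_dsq *dsq)
{
	struct demo_task *t = dsq->head;

	if (!t)
		return NULL;
	dsq->head = t->next;
	if (!dsq->head)
		dsq->tail = NULL;
	demo_dsq_publish_head(dsq);
	return t;
}

/* Lockless peek: a single atomic load, possibly stale by the time it is used. */
static struct demo_task *demo_dsq_peek(struct demo_dsq *dsq)
{
	return atomic_load_explicit(&dsq->first_task, memory_order_acquire);
}
```

The extra store on every modification is the cost the commit message mentions. The benefit is that a load balancer can skip empty queues without taking their locks.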
2025-10-15drm/gpuvm: Fix kernel-doc warning for drm_gpuvm_map_req.mapAnkan Biswas
The kernel-doc for struct drm_gpuvm_map_req.map was added as '@op_map' instead of '@map', leading to this warning during htmldocs build: WARNING: include/drm/drm_gpuvm.h:1083 struct member 'map' not described in 'drm_gpuvm_map_req' Fixes: 000a45dce7ad ("drm/gpuvm: Pass map arguments through a struct") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/all/20250821133539.03aa298e@canb.auug.org.au/ Signed-off-by: Ankan Biswas <spyjetfayed@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-10-15net: bcmgenet: remove unused platform codeHeiner Kallweit
This effectively reverts b0ba512e25d7 ("net: bcmgenet: enable driver to work without a device tree"). There has never been an in-tree user of struct bcmgenet_platform_data, all devices use OF or ACPI. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/108b4e64-55d4-4b4e-9a11-3c810c319d66@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15net: allow busy connected flows to switch tx queuesEric Dumazet
This is a followup of commit 726e9e8b94b9 ("tcp: refine skb->ooo_okay setting") and of prior commit in this series ("net: control skb->ooo_okay from skb_set_owner_w()"). skb->ooo_okay might never be set for bulk flows that always have at least one skb in a qdisc queue or NIC queue, especially if TX completion is delayed because of a stressed cpu. The so-called "strange attractors" have caused many performance issues (see for instance 9b462d02d6dd ("tcp: TCP Small Queues and strange attractors")), we need to do better. We have tried very hard to avoid reorders because TCP was not dealing with them nicely a decade ago. Use the new net.core.txq_reselection_ms sysctl to let flows follow XPS and select a more efficient queue. After this patch, we no longer have to make sure threads are pinned to cpus, they now can be migrated without adding too much spinlock/qdisc/TX completion pressure anymore. TX completion part was problematic, because it added false sharing on various socket fields, but also added false sharing and spinlock contention in mm layers. Calling skb_orphan() from ndo_start_xmit() is not an option unfortunately. Note for later: 1) move sk->sk_tx_queue_mapping closer to sk_tx_queue_mapping_jiffies for better cache locality. 2) Study if 9b462d02d6dd ("tcp: TCP Small Queues and strange attractors") could be revised. Tested: Used a host with 32 TX queues, shared by groups of 8 cores. XPS setup : echo ff >/sys/class/net/eth1/queue/tx-0/xps_cpus echo ff00 >/sys/class/net/eth1/queue/tx-1/xps_cpus echo ff0000 >/sys/class/net/eth1/queue/tx-2/xps_cpus echo ff000000 >/sys/class/net/eth1/queue/tx-3/xps_cpus echo ff,00000000 >/sys/class/net/eth1/queue/tx-4/xps_cpus echo ff00,00000000 >/sys/class/net/eth1/queue/tx-5/xps_cpus echo ff0000,00000000 >/sys/class/net/eth1/queue/tx-6/xps_cpus echo ff000000,00000000 >/sys/class/net/eth1/queue/tx-7/xps_cpus ... 
Launched a tcp_stream with 15 threads and 1000 flows, initially affined to core 0-15 taskset -c 0-15 tcp_stream -T15 -F1000 -l1000 -c -H target_host Checked that only queues 0 and 1 are used as instructed by XPS : tc -s qdisc show dev eth1|grep backlog|grep -v "backlog 0b 0p" backlog 123489410b 1890p backlog 69809026b 1064p backlog 52401054b 805p Then force each thread to run on cpu 1,9,17,25,33,41,49,57,65,73,81,89,97,105,113,121 C=1;PID=`pidof tcp_stream`;for P in `ls /proc/$PID/task`; do taskset -pc $C $P; C=$(($C + 8));done Set txq_reselection_ms to 1000 echo 1000 > /proc/sys/net/core/txq_reselection_ms Check that the flows have migrated nicely: tc -s qdisc show dev eth1|grep backlog|grep -v "backlog 0b 0p" backlog 130508314b 1916p backlog 8584380b 126p backlog 8584380b 126p backlog 8379990b 123p backlog 8584380b 126p backlog 8487484b 125p backlog 8584380b 126p backlog 8448120b 124p backlog 8584380b 126p backlog 8720640b 128p backlog 8856900b 130p backlog 8584380b 126p backlog 8652510b 127p backlog 8448120b 124p backlog 8516250b 125p backlog 7834950b 115p Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251013152234.842065-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15net: add /proc/sys/net/core/txq_reselection_ms controlEric Dumazet
Add a new sysctl to control how often a queue reselection can happen even if a flow has a persistent queue of skbs in a Qdisc or NIC queue. A value of zero means the feature is disabled. Default is 1000 (1 second). This sysctl is used in the following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251013152234.842065-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
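The semantics described above (zero disables the feature, otherwise reselection is allowed once the interval has elapsed) can be sketched as a simple time check. This is a hedged illustration: the function name demo_may_reselect_txq is hypothetical, the sketch works in milliseconds whereas the kernel tracks this in jiffies, and the exact comparison used in the kernel is not stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a flow may pick a new TX queue even though it still has
 * packets queued. last_select_ms is when the current queue was chosen.
 * An interval of 0 disables reselection entirely.
 */
static bool demo_may_reselect_txq(uint64_t now_ms, uint64_t last_select_ms,
				  uint32_t txq_reselection_ms)
{
	if (txq_reselection_ms == 0)
		return false;

	return now_ms - last_select_ms >= txq_reselection_ms;
}
```

With the default of 1000, a busy flow that chose its queue at t=4000 ms becomes eligible for reselection at t=5000 ms.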
2025-10-15net: add SK_WMEM_ALLOC_BIAS constantEric Dumazet
sk->sk_wmem_alloc is initialized to 1, and sk_wmem_alloc_get() takes care of this initial value. Add SK_WMEM_ALLOC_BIAS define to not spread this magic value. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251013152234.842065-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
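The role of the bias can be shown with a small model. The names SK_WMEM_ALLOC_BIAS, sk_wmem_alloc and sk_wmem_alloc_get() come from the commit message. The struct demo_sock type and the demo_ helpers are hypothetical, and a plain int stands in for the kernel's reference-counted field.

```c
#include <assert.h>

#define SK_WMEM_ALLOC_BIAS 1

/* Simplified stand-in for the socket; the real field is not a plain int. */
struct demo_sock {
	int sk_wmem_alloc;
};

/* The counter starts at the bias, not at zero. */
static void demo_sock_init(struct demo_sock *sk)
{
	sk->sk_wmem_alloc = SK_WMEM_ALLOC_BIAS;
}

/* Account for bytes queued for transmission. */
static void demo_sock_charge(struct demo_sock *sk, int bytes)
{
	sk->sk_wmem_alloc += bytes;
}

/* Report the bytes actually in flight by removing the initial bias. */
static int demo_sk_wmem_alloc_get(const struct demo_sock *sk)
{
	return sk->sk_wmem_alloc - SK_WMEM_ALLOC_BIAS;
}
```

Naming the constant keeps the "minus one" in the getter tied to the "one" in the initializer, which is the point of the commit.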
2025-10-15tcp: better handle TCP_TX_DELAY on established flowsEric Dumazet
Some applications use the TCP_TX_DELAY socket option after the TCP flow is established. Some metrics need to be updated, otherwise TCP might take time to adapt to the new (emulated) RTT. This patch adjusts tp->srtt_us, tp->rtt_min, icsk_rto and sk->sk_pacing_rate. This is best effort, and for instance icsk_rto is reset without taking backoff into account. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251013145926.833198-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15drm/xe/uapi: Add documentation for DRM_XE_GEM_CREATE_FLAG_SCANOUTSanjay Yadav
Add documentation for drm_xe_gem_create structure flag DRM_XE_GEM_CREATE_FLAG_SCANOUT. Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20251014142823.3701228-2-sanjay.kumar.yadav@intel.com
2025-10-15ASoC: use sof_sdw as default Intel SOF SDW machineMark Brown
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: Currently, we create an ACPI mach table for every new audio configuration. And all Intel SOF SoundWire configurations point to the same sof_sdw machine driver. Also, we don't need a specific topology for a configuration, we can use the function topology instead. That gives us a chance to generate an ACPI mach table based on the SoundWire codec information reported by the ACPI table and use the sof_sdw machine driver as the default machine driver. This will reduce the effort to support a new Intel SOF SoundWire audio configuration.
2025-10-15bpf: Consistently use bpf_rcu_lock_held() everywhereAndrii Nakryiko
We have many places which open-code what is now the bpf_rcu_lock_held() macro, so replace all those places with a clean and short macro invocation. For that, move the bpf_rcu_lock_held() macro into include/linux/bpf.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20251014201403.4104511-1-andrii@kernel.org
2025-10-15bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.Alexei Starovoitov
The following kmemleak splat: [ 8.105530] kmemleak: Trying to color unknown object at 0xff11000100e918c0 as Black [ 8.106521] Call Trace: [ 8.106521] <TASK> [ 8.106521] dump_stack_lvl+0x4b/0x70 [ 8.106521] kvfree_call_rcu+0xcb/0x3b0 [ 8.106521] ? hrtimer_cancel+0x21/0x40 [ 8.106521] bpf_obj_free_fields+0x193/0x200 [ 8.106521] htab_map_update_elem+0x29c/0x410 [ 8.106521] bpf_prog_cfc8cd0f42c04044_overwrite_cb+0x47/0x4b [ 8.106521] bpf_prog_8c30cd7c4db2e963_overwrite_timer+0x65/0x86 [ 8.106521] bpf_prog_test_run_syscall+0xe1/0x2a0 happens due to the combination of features and fixes, but mainly due to commit 6d78b4473cdb ("bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()") It's using __GFP_HIGH, which instructs slub/kmemleak internals to skip kmemleak_alloc_recursive() on allocation, so subsequent kfree_rcu()-> kvfree_call_rcu()->kmemleak_ignore() complains with the above splat. To fix this imbalance, replace bpf_map_kmalloc_node() with kmalloc_nolock() and kfree_rcu() with call_rcu() + kfree_nolock() to make sure that the objects allocated with kmalloc_nolock() are freed with kfree_nolock() rather than the implicit kfree() that kfree_rcu() uses internally. Note, the kmalloc_nolock() happens under bpf_spin_lock_irqsave(), so it will always fail in PREEMPT_RT. This is not an issue at the moment, since bpf_timers are disabled in PREEMPT_RT. In the future bpf_spin_lock will be replaced with state machine similar to bpf_task_work. Fixes: 6d78b4473cdb ("bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: linux-mm@kvack.org Link: https://lore.kernel.org/bpf/20251015000700.28988-1-alexei.starovoitov@gmail.com
2025-10-15regulator: core: forward undervoltage events downstream by defaultOleksij Rempel
Forward critical supply events downstream so consumers can react in time. An under-voltage event on an upstream rail may otherwise never reach end devices (e.g. eMMC). Register a notifier on a regulator's supply when the supply is resolved, and forward only REGULATOR_EVENT_UNDER_VOLTAGE to the consumer's notifier chain. Event handling is deferred to process context via a workqueue; the consumer rdev is lifetime-pinned and the rdev lock is held while calling the notifier chain. The notifier is unregistered on regulator teardown. No DT/UAPI changes. Behavior applies to all regulators with a supply. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20251001105650.2391477-1-o.rempel@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>