summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-02KVM: selftests: Add test for race in kvm_recalculate_apic_map()Michal Luczaj
Keep switching between LAPIC_MODE_X2APIC and LAPIC_MODE_DISABLED during APIC map construction to hunt for TOCTOU bugs in KVM. KVM's optimized map recalc makes multiple passes over the list of vCPUs, and the calculations ignore vCPU's whose APIC is hardware-disabled, i.e. there's a window where toggling LAPIC_MODE_DISABLED is quite interesting. Signed-off-by: Michal Luczaj <mhal@rbox.co> Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230602233250.1014316-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-02KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-boundsSean Christopherson
Bail from kvm_recalculate_phys_map() and disable the optimized map if the target vCPU's x2APIC ID is out-of-bounds, i.e. if the vCPU was added and/or enabled its local APIC after the map was allocated. This fixes an out-of-bounds access bug in the !x2apic_format path where KVM would write beyond the end of phys_map. Check the x2APIC ID regardless of whether or not x2APIC is enabled, as KVM's hardcodes x2APIC ID to be the vCPU ID, i.e. it can't change, and the map allocation in kvm_recalculate_apic_map() doesn't check for x2APIC being enabled, i.e. the check won't get false postivies. Note, this also affects the x2apic_format path, which previously just ignored the "x2apic_id > new->max_apic_id" case. That too is arguably a bug fix, as ignoring the vCPU meant that KVM would not send interrupts to the vCPU until the next map recalculation. In practice, that "bug" is likely benign as a newly present vCPU/APIC would immediately trigger a recalc. But, there's no functional downside to disabling the map, and a future patch will gracefully handle the -E2BIG case by retrying instead of simply disabling the optimized map. Opportunistically add a sanity check on the xAPIC ID size, along with a comment explaining why the xAPIC ID is guaranteed to be "good". Reported-by: Michal Luczaj <mhal@rbox.co> Fixes: 5b84b0291702 ("KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230602233250.1014316-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-02KVM: x86: Account fastpath-only VM-Exits in vCPU statsSean Christopherson
Increment vcpu->stat.exits when handling a fastpath VM-Exit without going through any part of the "slow" path. Not bumping the exits stat can result in wildly misleading exit counts, e.g. if the primary reason the guest is exiting is to program the TSC deadline timer. Fixes: 404d5d7bff0d ("KVM: X86: Introduce more exit_fastpath_completion enum values") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230602011920.787844-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-02KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASKMaciej S. Szmigiero
While testing Hyper-V enabled Windows Server 2019 guests on Zen4 hardware I noticed that with vCPU count large enough (> 16) they sometimes froze at boot. With vCPU count of 64 they never booted successfully - suggesting some kind of a race condition. Since adding "vnmi=0" module parameter made these guests boot successfully it was clear that the problem is most likely (v)NMI-related. Running kvm-unit-tests quickly showed failing NMI-related tests cases, like "multiple nmi" and "pending nmi" from apic-split, x2apic and xapic tests and the NMI parts of eventinj test. The issue was that once one NMI was being serviced no other NMI was allowed to be set pending (NMI limit = 0), which was traced to svm_is_vnmi_pending() wrongly testing for the "NMI blocked" flag rather than for the "NMI pending" flag. Fix this by testing for the right flag in svm_is_vnmi_pending(). Once this is done, the NMI-related kvm-unit-tests pass successfully and the Windows guest no longer freezes at boot. Fixes: fa4c027a7956 ("KVM: x86: Add support for SVM's Virtual NMI") Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/be4ca192eb0c1e69a210db3009ca984e6a54ae69.1684495380.git.maciej.szmigiero@oracle.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-02KVM: x86/mmu: Grab memslot for correct address space in NX recovery workerSean Christopherson
Factor in the address space (non-SMM vs. SMM) of the target shadow page when recovering potential NX huge pages, otherwise KVM will retrieve the wrong memslot when zapping shadow pages that were created for SMM. The bug most visibly manifests as a WARN on the memslot being non-NULL, but the worst case scenario is that KVM could unaccount the shadow page without ensuring KVM won't install a huge page, i.e. if the non-SMM slot is being dirty logged, but the SMM slot is not. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 3911 at arch/x86/kvm/mmu/mmu.c:7015 kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm] CPU: 1 PID: 3911 Comm: kvm-nx-lpage-re RIP: 0010:kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm] RSP: 0018:ffff99b284f0be68 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff99b284edd000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff9271397024e0 R08: 0000000000000000 R09: ffff927139702450 R10: 0000000000000000 R11: 0000000000000001 R12: ffff99b284f0be98 R13: 0000000000000000 R14: ffff9270991fcd80 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff927f9f640000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0aacad3ae0 CR3: 000000088fc2c005 CR4: 00000000003726e0 Call Trace: <TASK> __pfx_kvm_nx_huge_page_recovery_worker+0x10/0x10 [kvm] kvm_vm_worker_thread+0x106/0x1c0 [kvm] kthread+0xd9/0x100 ret_from_fork+0x2c/0x50 </TASK> ---[ end trace 0000000000000000 ]--- This bug was exposed by commit edbdb43fc96b ("KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated"), which allowed KVM to retain SMM TDP MMU roots effectively indefinitely. Before commit edbdb43fc96b, KVM would zap all SMM TDP MMU roots and thus all SMM TDP MMU shadow pages once all vCPUs exited SMM, which made the window where this bug (recovering an SMM NX huge page) could be encountered quite tiny. To hit the bug, the NX recovery thread would have to run while at least one vCPU was in SMM. Most VMs typically only use SMM during boot, and so the problematic shadow pages were gone by the time the NX recovery thread ran. Now that KVM preserves TDP MMU roots until they are explicitly invalidated (e.g. by a memslot deletion), the window to trigger the bug is effectively never closed because most VMMs don't delete memslots after boot (except for a handful of special scenarios). Fixes: eb298605705a ("KVM: x86/mmu: Do not recover dirty-tracked NX Huge Pages") Reported-by: Fabio Coatti <fabio.coatti@gmail.com> Closes: https://lore.kernel.org/all/CADpTngX9LESCdHVu_2mQkNGena_Ng2CphWNwsRGSMxzDsTjU2A@mail.gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230602010137.784664-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-02tpm, tpm_tis: correct tpm_tis_flags enumeration valuesLino Sanfilippo
With commit 858e8b792d06 ("tpm, tpm_tis: Avoid cache incoherency in test for interrupts") bit accessor functions are used to access flags in tpm_tis_data->flags. However these functions expect bit numbers, while the flags are defined as bit masks in enum tpm_tis_flag. Fix this inconsistency by using numbers instead of masks also for the flags in the enum. Reported-by: Pavel Machek <pavel@denx.de> Fixes: 858e8b792d06 ("tpm, tpm_tis: Avoid cache incoherency in test for interrupts") Signed-off-by: Lino Sanfilippo <l.sanfilippo@kunbus.com> Cc: stable@vger.kernel.org Reviewed-by: Pavel Machek <pavel@denx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-02Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fix from Ted Ts'o: "Fix an ext4 regression which landed during the 6.4 merge window" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: Revert "ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits"
2023-06-02Merge tag 'for-6.4-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One regression fix. The rewrite of scrub code in 6.4 broke device replace in zoned mode, some of the writes could happen out of order so this had to be adjusted for all cases" * tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zoned: fix dev-replace after the scrub rework
2023-06-02Revert "ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ↵Ojaswin Mujoo
ext4_mb_check_limits" This reverts commit 32c0869370194ae5ac9f9f501953ef693040f6a1. The reverted commit was intended to remove a dead check however it was observed that this check was actually being used to exit early instead of looping sbi->s_mb_max_to_scan times when we are able to find a free extent bigger than the goal extent. Due to this, a my performance tests (fsmark, parallel file writes in a highly fragmented FS) were seeing a 2x-3x regression. Example, the default value of the following variables is: sbi->s_mb_max_to_scan = 200 sbi->s_mb_min_to_scan = 10 In ext4_mb_check_limits() if we find an extent smaller than goal, then we return early and try again. This loop will go on until we have processed sbi->s_mb_max_to_scan(=200) number of free extents at which point we exit and just use whatever we have even if it is smaller than goal extent. Now, the regression comes when we find an extent bigger than goal. Earlier, in this case we would loop only sbi->s_mb_min_to_scan(=10) times and then just use the bigger extent. However with commit 32c08693 that check was removed and hence we would loop sbi->s_mb_max_to_scan(=200) times even though we have a big enough free extent to satisfy the request. The only time we would exit early would be when the free extent is *exactly* the size of our goal, which is pretty uncommon occurrence and so we would almost always end up looping 200 times. Hence, revert the commit by adding the check back to fix the regression. Also add a comment to outline this policy. Fixes: 32c086937019 ("ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits") Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/ddcae9658e46880dfec2fb0aa61d01fb3353d202.1685449706.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-06-02media: uvcvideo: Don't expose unsupported formats to userspaceLaurent Pinchart
When the uvcvideo driver encounters a format descriptor with an unknown format GUID, it creates a corresponding struct uvc_format instance with the fcc field set to 0. Since commit 50459f103edf ("media: uvcvideo: Remove format descriptions"), the driver relies on the V4L2 core to provide the format description string, which the V4L2 core can't do without a valid 4CC. This triggers a WARN_ON. As a format with a zero 4CC can't be selected, it is unusable for applications. Ignore the format completely without creating a uvc_format instance, which fixes the warning. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217252 Link: https://bugzilla.redhat.com/show_bug.cgi?id=2180107 Fixes: 50459f103edf ("media: uvcvideo: Remove format descriptions") Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-02Merge tag 'riscv-for-linus-6.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A build warning fix for BUILTIN_DTB=y - Hibernation support is hidden behind NONPORTABLE, as it depends on some undocumented early boot behavior and breaks on most platforms - A fix for relocatable kernels on systems with early boot errata - A fix to properly handle perf callchains for kernel tracepoints - A pair of fixes for NAPOT to avoid inconsistencies between PTEs and handle hardware that sets arbitrary A/D bits * tag 'riscv-for-linus-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Implement missing huge_ptep_get riscv: Fix huge_ptep_set_wrprotect when PTE is a NAPOT riscv: perf: Fix callchain parse error with kernel tracepoint events riscv: Fix relocatable kernels with early alternatives using -fno-pie RISC-V: mark hibernation as nonportable riscv: Fix unused variable warning when BUILTIN_DTB is set
2023-06-02media: v4l2-subdev: Fix missing kerneldoc for client_capsTomi Valkeinen
Add missing kernel doc for the new 'client_caps' field in struct v4l2_subdev_fh. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Fixes: f57fa2959244 ("media: v4l2-subdev: Add new ioctl for client capabilities") Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-02media: staging: media: imx: initialize hs_settle to avoid warningHans Verkuil
Initialize hs_settle to 0 to avoid this compiler warning: imx8mq-mipi-csi2.c: In function 'imx8mq_mipi_csi_start_stream.part.0': imx8mq-mipi-csi2.c:91:55: warning: 'hs_settle' may be used uninitialized [-Wmaybe-uninitialized] 91 | #define GPR_CSI2_1_S_PRG_RXHS_SETTLE(x) (((x) & 0x3f) << 2) | ^~ imx8mq-mipi-csi2.c:357:13: note: 'hs_settle' was declared here 357 | u32 hs_settle; | ^~~~~~~~~ It's a false positive, but it is too complicated for the compiler to detect that. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Martin Kepplinger <martink@posteo.de> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-02media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad()Vaishnav Achath
While updating v4l2_create_fwnode_links_to_pad() to accept non-subdev sinks, the check is_media_entity_v4l2_subdev() was not removed which prevented the function from being used with non-subdev sinks, Drop the unnecessary check. Fixes: bd5a03bc5be8 ("media: Accept non-subdev sinks in v4l2_create_fwnode_links_to_pad()") Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-02Merge tag 'nfsd-6.4-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Two minor bug fixes * tag 'nfsd-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: fix double fget() bug in __write_ports_addfd() nfsd: make a copy of struct iattr before calling notify_change
2023-06-02Merge tag 'block-6.4-2023-06-02' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "Just an NVMe pull request with (mostly) KATO fixes, a regression fix for zoned device revalidation, and a fix for an md raid5 regression" * tag 'block-6.4-2023-06-02' of git://git.kernel.dk/linux: nvme: fix the name of Zone Append for verbose logging nvme: improve handling of long keep alives nvme: check IO start time when deciding to defer KA nvme: double KA polling frequency to avoid KATO with TBKAS on nvme: fix miss command type check block: fix revalidate performance regression md/raid5: fix miscalculation of 'end_sector' in raid5_read_one_chunk()
2023-06-02Merge tag 'io_uring-6.4-2023-06-02' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fix from Jens Axboe: "Just a single revert in here, removing the warning on the epoll ctl opcode. We originally deprecated this a few releases ago, but I've since had two people report that it's being used. Which isn't the biggest deal, obviously this is why we out in the deprecation notice in the first place, but it also means that we should just kill this warning again and abandon the deprecation plans. Since it's only a few handfuls of code to support epoll ctl, not worth going any further with this imho" * tag 'io_uring-6.4-2023-06-02' of git://git.kernel.dk/linux: io_uring: undeprecate epoll_ctl support
2023-06-02Merge tag 'mmc-v6.4-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix pwrseq for WILC1000/WILC3000 SDIO card MMC host: - vub300: Fix invalid response handling" * tag 'mmc-v6.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: pwrseq: sd8787: Fix WILC CHIP_EN and RESETN toggling order mmc: vub300: fix invalid response handling
2023-06-02Merge tag 'iommu-fixes-v6.4-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "AMD IOMMU fixes: - Fix domain type and size checks - IOTLB flush fix for invalidating ranges - Guest IRQ handling fixes and GALOG overflow fix Rockchip IOMMU: - Error handling fix Mediatek IOMMU: - IOTLB flushing fix Renesas IOMMU: - Fix Kconfig dependencies to avoid build errors on RiscV" * tag 'iommu-fixes-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/mediatek: Flush IOTLB completely only if domain has been attached iommu/amd/pgtbl_v2: Fix domain max address iommu/amd: Fix domain flush size when syncing iotlb iommu/amd: Add missing domain type checks iommu/amd: Fix up merge conflict resolution iommu/amd: Handle GALog overflows iommu/amd: Don't block updates to GATag if guest mode is on iommu/rockchip: Fix unwind goto issue iommu: Make IPMMU_VMSA dependencies more strict
2023-06-02Merge tag 'drm-fixes-2023-06-02' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Quiet enough week, though the misc fixes tree didn't get to me when I was sending this, so maybe it'll be a bit bigger next week, just one i915 fix and some scattered amdgpu fixes: amdgpu: - Fix mclk and fclk output ordering on some APUs - Fix display regression with 5K VRR - VCN, JPEG spurious interrupt warning fixes - Fix SI DPM on some ARM64 platforms - Fix missing TMZ enablement on GC 11.0.1 i915: - Fix for OA reporting to allow detecting non-power-of-two reports" * tag 'drm-fixes-2023-06-02' of git://anongit.freedesktop.org/drm/drm: drm/i915/perf: Clear out entire reports after reading if not power of 2 size drm/amdgpu: enable tmz by default for GC 11.0.1 drm/amd/pm: resolve reboot exception for si oland drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v4_0 drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v2_6 drm/amdgpu: separate ras irq from jpeg instance irq for UVD_POISON drm/amdgpu: add RAS POISON interrupt funcs for vcn_v4_0 drm/amdgpu: add RAS POISON interrupt funcs for vcn_v2_6 drm/amdgpu: separate ras irq from vcn instance irq for UVD_POISON Revert "drm/amd/display: Do not set drr on pipe commit" Revert "drm/amd/display: Block optimize on consecutive FAMS enables" drm/amd/pm: reverse mclk and fclk clocks levels for renoir drm/amd/pm: reverse mclk and fclk clocks levels for vangogh drm/amd/pm: reverse mclk and fclk clocks levels for yellow carp drm/amd/pm: reverse mclk clocks levels for SMU v13.0.5 drm/amd/pm: reverse mclk and fclk clocks levels for SMU v13.0.4
2023-06-02Merge tag 'selinux-pr-20230601' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "A small SELinux Makefile fix to resolve a problem seen when building the kernel with older versions of make. The fix is pretty trivial and effectively reverts a patch that was merged during the last merge window" * tag 'selinux-pr-20230601' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: don't use make's grouped targets feature yet
2023-06-01riscv: Implement missing huge_ptep_getAlexandre Ghiti
huge_ptep_get must be reimplemented in order to go through all the PTEs of a NAPOT region: this is needed because the HW can update the A/D bits of any of the PTE that constitutes the NAPOT region. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230428120120.21620-2-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-01riscv: Fix huge_ptep_set_wrprotect when PTE is a NAPOTAlexandre Ghiti
We need to avoid inconsistencies across the PTEs that form a NAPOT region, so when we write protect such a region, we should clear and flush all the PTEs to make sure that any of those PTEs is not cached which would result in such inconsistencies (arm64 does the same). Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230428120120.21620-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-01Merge tag 'modules-6.4-rc5-second-pull' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull modules fix from Luis Chamberlain: "A zstd fix by lucas as he tested zstd decompression support" * tag 'modules-6.4-rc5-second-pull' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: module/decompress: Fix error checking on zstd decompression
2023-06-01Merge tag 'efi-fixes-for-v6.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "A few minor fixes for EFI, one of which fixes the reported boot regression when booting x86 kernels using the BIOS based loader built into the hypervisor framework on macOS. - fix harmless warning in zboot code on 'make clean' - add some missing prototypes - fix boot regressions triggered by PE/COFF header image minor version bump" * tag 'efi-fixes-for-v6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Bump stub image version for macOS HVF compatibility efi: fix missing prototype warnings efi/libstub: zboot: Avoid eager evaluation of objcopy flags
2023-06-02Merge tag 'drm-intel-fixes-2023-06-01' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix for OA reporting to allow detecting non-power-of-two reports Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZHimf55x/DyXYar1@jlahtine-mobl.ger.corp.intel.com
2023-06-02Merge tag 'amd-drm-fixes-6.4-2023-05-31' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.4-2023-05-31: amdgpu: - Fix mclk and fclk output ordering on some APUs - Fix display regression with 5K VRR - VCN, JPEG spurious interrupt warning fixes - Fix SI DPM on some ARM64 platforms - Fix missing TMZ enablement on GC 11.0.1 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230601033846.7628-1-alexander.deucher@amd.com
2023-06-01Merge tag 'fbdev-for-6.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes from Helge Deller: "Most notable is a fix for a null-ptr-deref in fbcon's soft_cursor function which was found by syzbot. - Fix null-ptr-deref in soft_cursor - various remove callback conversions - error path fixes in imsttfb" * tag 'fbdev-for-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: bw2: Convert to platform remove callback returning void fbdev: broadsheetfb: Convert to platform remove callback returning void fbdev: au1200fb: Convert to platform remove callback returning void fbdev: au1100fb: Convert to platform remove callback returning void fbdev: arcfb: Convert to platform remove callback returning void fbdev: au1100fb: Drop if with an always false condition fbcon: Fix null-ptr-deref in soft_cursor fbdev: imsttfb: Fix error path of imsttfb_probe() fbdev: imsttfb: Release framebuffer and dealloc cmap on error path fbdev: matroxfb ssd1307fb: Switch i2c drivers back to use .probe()
2023-06-01module/decompress: Fix error checking on zstd decompressionLucas De Marchi
While implementing support for in-kernel decompression in kmod, finit_module() was returning a very suspicious value: finit_module(3, "", MODULE_INIT_COMPRESSED_FILE) = 18446744072717407296 It turns out the check for module_get_next_page() failing is wrong, and hence the decompression was not really taking place. Invert the condition to fix it. Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression") Cc: stable@kernel.org Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-01Merge tag 'mtd/fixes-for-6.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull mtd fixes from Miquel Raynal: "MTD core: - MAINTAINERS: Add Michal as reviewer instead of Naga - mtdchar: Mark bits of ioctl handler noinline NAND controller drivers: - marvell: - Don't set the NAND frequency select - Ensure timing values are written - ingenic: Fix empty stub helper definitions SPI-NOR core: - Fix divide by zero for spi-nor-generic flashes SPI-NOR manufacturer driver: - spansion: make sure local struct does not contain garbage" * tag 'mtd/fixes-for-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: marvell: don't set the NAND frequency select mtd: rawnand: marvell: ensure timing values are written mtdchar: mark bits of ioctl handler noinline MAINTAINERS: Add myself as reviewer instead of Naga mtd: spi-nor: Fix divide by zero for spi-nor-generic flashes mtd: rawnand: ingenic: fix empty stub helper definitions mtd: spi-nor: spansion: make sure local struct does not contain garbage
2023-06-01Merge tag 'net-6.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Happy Wear a Dress Day. Fairly standard-sized batch of fixes, accounting for the lack of sub-tree submissions this week. The mlx5 IRQ fixes are notable, people were complaining about that. No fires burning. Current release - regressions: - eth: mlx5e: - multiple fixes for dynamic IRQ allocation - prevent encap offload when neigh update is running - eth: mana: fix perf regression: remove rx_cqes, tx_cqes counters Current release - new code bugs: - eth: mlx5e: DR, add missing mutex init/destroy in pattern manager Previous releases - always broken: - tcp: deny tcp_disconnect() when threads are waiting - sched: prevent ingress Qdiscs from getting installed in random locations in the hierarchy and moving around - sched: flower: fix possible OOB write in fl_set_geneve_opt() - netlink: fix NETLINK_LIST_MEMBERSHIPS length report - udp6: fix race condition in udp6_sendmsg & connect - tcp: fix mishandling when the sack compression is deferred - rtnetlink: validate link attributes set at creation time - mptcp: fix connect timeout handling - eth: stmmac: fix call trace when stmmac_xdp_xmit() is invoked - eth: amd-xgbe: fix the false linkup in xgbe_phy_status - eth: mlx5e: - fix corner cases in internal buffer configuration - drain health before unregistering devlink - usb: qmi_wwan: set DTR quirk for BroadMobi BM818 Misc: - tcp: return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set" * tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits) mptcp: fix active subflow finalization mptcp: add annotations around sk->sk_shutdown accesses mptcp: fix data race around msk->first access mptcp: consolidate passive msk socket initialization mptcp: add annotations around msk->subflow accesses mptcp: fix connect timeout handling rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsg rtnetlink: move IFLA_GSO_ tb check to validate_linkmsg rtnetlink: call validate_linkmsg in rtnl_create_link ice: recycle/free all of the fragments from multi-buffer frame net: phy: mxl-gpy: extend interrupt fix to all impacted variants net: renesas: rswitch: Fix return value in error path of xmit net: dsa: mv88e6xxx: Increase wait after reset deactivation net: ipa: Use correct value for IPA_STATUS_SIZE tcp: fix mishandling when the sack compression is deferred. net/sched: flower: fix possible OOB write in fl_set_geneve_opt() sfc: fix error unwinds in TC offload net/mlx5: Read embedded cpu after init bit cleared net/mlx5e: Fix error handling in mlx5e_refresh_tirs net/mlx5: Ensure af_desc.mask is properly initialized ...
2023-06-01fork, vhost: Use CLONE_THREAD to fix freezer/ps regressionMike Christie
When switching from kthreads to vhost_tasks two bugs were added: 1. The vhost worker tasks's now show up as processes so scripts doing ps or ps a would not incorrectly detect the vhost task as another process. 2. kthreads disabled freeze by setting PF_NOFREEZE, but vhost tasks's didn't disable or add support for them. To fix both bugs, this switches the vhost task to be thread in the process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that SIGKILL/STOP support is required because CLONE_THREAD requires CLONE_SIGHAND which requires those 2 signals to be supported. This is a modified version of the patch written by Mike Christie <michael.christie@oracle.com> which was a modified version of patch originally written by Linus. Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER. Including ignoring signals, setting up the register state, and having get_signal return instead of calling do_group_exit. Tidied up the vhost_task abstraction so that the definition of vhost_task only needs to be visible inside of vhost_task.c. Making it easier to review the code and tell what needs to be done where. As part of this the main loop has been moved from vhost_worker into vhost_task_fn. vhost_worker now returns true if work was done. The main loop has been updated to call get_signal which handles SIGSTOP, freezing, and collects the message that tells the thread to exit as part of process exit. This collection clears __fatal_signal_pending. This collection is not guaranteed to clear signal_pending() so clear that explicitly so the schedule() sleeps. For now the vhost thread continues to exist and run work until the last file descriptor is closed and the release function is called as part of freeing struct file. To avoid hangs in the coredump rendezvous and when killing threads in a multi-threaded exec. The coredump code and de_thread have been modified to ignore vhost threads. Remvoing the special case for exec appears to require teaching vhost_dev_flush how to directly complete transactions in case the vhost thread is no longer running. Removing the special case for coredump rendezvous requires either the above fix needed for exec or moving the coredump rendezvous into get_signal. Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads") Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Co-developed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-01dt-bindings: serial: 8250_omap: add rs485-rts-active-highFrancesco Dolcini
Add rs485-rts-active-high property, this was removed by mistake. In general we just use rs485-rts-active-low property, however the OMAP UART for legacy reason uses the -high one. Fixes: 767d3467eb60 ("dt-bindings: serial: 8250_omap: drop rs485 properties") Closes: https://lore.kernel.org/all/ZGefR4mTHHo1iQ7H@francesco-nb.int.toradex.com/ Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230531111038.6302-1-francesco@dolcini.it Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-01selinux: don't use make's grouped targets feature yetPaul Moore
The Linux Kernel currently only requires make v3.82 while the grouped target functionality requires make v4.3. Removed the grouped target introduced in 4ce1f694eb5d ("selinux: ensure av_permissions.h is built when needed") as well as the multiple header file targets in the make rule. This effectively reverts the problem commit. We will revisit this change when make >= 4.3 is required by the rest of the kernel. Cc: stable@vger.kernel.org Fixes: 4ce1f694eb5d ("selinux: ensure av_permissions.h is built when needed") Reported-by: Erwan Velu <e.velu@criteo.com> Reported-by: Luiz Capitulino <luizcap@amazon.com> Tested-by: Luiz Capitulino <luizcap@amazon.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-01Merge tag 'mlx5-fixes-2023-05-31' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-05-31 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Read embedded cpu after init bit cleared net/mlx5e: Fix error handling in mlx5e_refresh_tirs net/mlx5: Ensure af_desc.mask is properly initialized net/mlx5: Fix setting of irq->map.index for static IRQ case net/mlx5: Remove rmap also in case dynamic MSIX not supported ==================== Link: https://lore.kernel.org/r/20230601031051.131529-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01Merge tag 'nvme-6.4-2023-06-01' of git://git.infradead.org/nvme into block-6.4Jens Axboe
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.4 - Fixes for spurious Keep Alive timeouts (Uday) - Fix for command type check on passthrough actions (Min) - Fix for nvme command name for error logging (Christoph)" * tag 'nvme-6.4-2023-06-01' of git://git.infradead.org/nvme: nvme: fix the name of Zone Append for verbose logging nvme: improve handling of long keep alives nvme: check IO start time when deciding to defer KA nvme: double KA polling frequency to avoid KATO with TBKAS on nvme: fix miss command type check
2023-06-01riscv: perf: Fix callchain parse error with kernel tracepoint eventsIsm Hong
For RISC-V, when tracing with tracepoint events, the IP and status are set to 0, preventing the perf code parsing the callchain and resolving the symbols correctly. ./ply 'tracepoint:kmem/kmem_cache_alloc { @[stack]=count(); }' @: { <STACKID4294967282> }: 1 The fix is to implement perf_arch_fetch_caller_regs for riscv, which fills several necessary registers used for callchain unwinding, including epc, sp, s0 and status. It's similar to commit b3eac0265bf6 ("arm: perf: Fix callchain parse error with kernel tracepoint events") and commit 5b09a094f2fb ("arm64: perf: Fix callchain parse error with kernel tracepoint events"). With this patch, callchain can be parsed correctly as: ./ply 'tracepoint:kmem/kmem_cache_alloc { @[stack]=count(); }' @: { __traceiter_kmem_cache_alloc+68 __traceiter_kmem_cache_alloc+68 kmem_cache_alloc+354 __sigqueue_alloc+94 __send_signal_locked+646 send_signal_locked+154 do_send_sig_info+84 __kill_pgrp_info+130 kill_pgrp+60 isig+150 n_tty_receive_signal_char+36 n_tty_receive_buf_standard+2214 n_tty_receive_buf_common+280 n_tty_receive_buf2+26 tty_ldisc_receive_buf+34 tty_port_default_receive_buf+62 flush_to_ldisc+158 process_one_work+458 worker_thread+138 kthread+178 riscv_cpufeature_patch_func+832 }: 1 Signed-off-by: Ism Hong <ism.hong@gmail.com> Link: https://lore.kernel.org/r/20230601095355.1168910-1-ism.hong@gmail.com Fixes: 178e9fc47aae ("perf: riscv: preliminary RISC-V support") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-01Merge branch ↵Jakub Kicinski
'mptcp-fixes-for-connect-timeout-access-annotations-and-subflow-init' Mat Martineau says: ==================== mptcp: Fixes for connect timeout, access annotations, and subflow init Patch 1 allows the SO_SNDTIMEO sockopt to correctly change the connect timeout on MPTCP sockets. Patches 2-5 add READ_ONCE()/WRITE_ONCE() annotations to fix KCSAN issues. Patch 6 correctly initializes some subflow fields on outgoing connections. ==================== Link: https://lore.kernel.org/r/20230531-send-net-20230531-v1-0-47750c420571@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01mptcp: fix active subflow finalizationPaolo Abeni
Active subflow are inserted into the connection list at creation time. When the MPJ handshake completes successfully, a new subflow creation netlink event is generated correctly, but the current code wrongly avoid initializing a couple of subflow data. The above will cause misbehavior on a few exceptional events: unneeded mptcp-level retransmission on msk-level sequence wrap-around and infinite mapping fallback even when a MPJ socket is present. Address the issue factoring out the needed initialization in a new helper and invoking the latter from __mptcp_finish_join() time for passive subflow and from mptcp_finish_join() for active ones. Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01mptcp: add annotations around sk->sk_shutdown accessesPaolo Abeni
Christoph reported the mptcp variant of a recently addressed plain TCP issue. Similar to commit e14cadfd80d7 ("tcp: add annotations around sk->sk_shutdown accesses") add READ/WRITE ONCE annotations to silence KCSAN reports around lockless sk_shutdown access. Fixes: 71ba088ce0aa ("mptcp: cleanup accept and poll") Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/401 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01mptcp: fix data race around msk->first accessPaolo Abeni
The first subflow socket is accessed outside the msk socket lock by mptcp_subflow_fail(), we need to annotate each write access with WRITE_ONCE, but a few spots still lacks it. Fixes: 76a13b315709 ("mptcp: invoke MP_FAIL response when needed") Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01mptcp: consolidate passive msk socket initializationPaolo Abeni
When the msk socket is cloned at MPC handshake time, a few fields are initialized in a racy way outside mptcp_sk_clone() and the msk socket lock. The above is due historical reasons: before commit a88d0092b24b ("mptcp: simplify subflow_syn_recv_sock()") as the first subflow socket carrying all the needed date was not available yet at msk creation time We can now refactor the code moving the missing initialization bit under the socket lock, removing the init race and avoiding some code duplication. This will also simplify the next patch, as all msk->first write access are now under the msk socket lock. Fixes: 0397c6d85f9c ("mptcp: keep unaccepted MPC subflow into join list") Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01mptcp: add annotations around msk->subflow accessesPaolo Abeni
The MPTCP can access the first subflow socket in a few spots outside the socket lock scope. That is actually safe, as MPTCP will delete the socket itself only after the msk sock close(). Still the such accesses causes a few KCSAN splats, as reported by Christoph. Silence the harmless warning adding a few annotation around the relevant accesses. Fixes: 71ba088ce0aa ("mptcp: cleanup accept and poll") Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/402 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01mptcp: fix connect timeout handlingPaolo Abeni
Ondrej reported a functional issue WRT timeout handling on connect with a nice reproducer. The problem is that the current mptcp connect waits for both the MPTCP socket level timeout, and the first subflow socket timeout. The latter is not influenced/touched by the exposed setsockopt(). Overall the above makes the SO_SNDTIMEO a no-op on connect. Since mptcp_connect is invoked via inet_stream_connect and the latter properly handle the MPTCP level timeout, we can address the issue making the nested subflow level connect always unblocking. This also allow simplifying a bit the code, dropping an ugly hack to handle the fastopen and custom proto_ops connect. The issues predates the blamed commit below, but the current resolution requires the infrastructure introduced there. Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()") Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01Merge branch 'rtnetlink-a-couple-of-fixes-in-linkmsg-validation'Jakub Kicinski
Xin Long says: ==================== rtnetlink: a couple of fixes in linkmsg validation validate_linkmsg() was introduced to do linkmsg validation for existing links. However, the new created links also need this linkmsg validation. Add validate_linkmsg() check for link creating in Patch 1, and add more tb checks into validate_linkmsg() in Patch 2 and 3. ==================== Link: https://lore.kernel.org/r/cover.1685548598.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsgXin Long
This fixes the issue that dev gro_max_size and gso_ipv4_max_size can be set to a huge value: # ip link add dummy1 type dummy # ip link set dummy1 gro_max_size 4294967295 # ip -d link show dummy1 dummy addrgenmode eui64 ... gro_max_size 4294967295 Fixes: 0fe79f28bfaf ("net: allow gro_max_size to exceed 65536") Fixes: 9eefedd58ae1 ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01rtnetlink: move IFLA_GSO_ tb check to validate_linkmsgXin Long
These IFLA_GSO_* tb check should also be done for the new created link, otherwise, they can be set to a huge value when creating links: # ip link add dummy1 gso_max_size 4294967295 type dummy # ip -d link show dummy1 dummy addrgenmode eui64 ... gso_max_size 4294967295 Fixes: 46e6b992c250 ("rtnetlink: allow GSO maximums to be set on device creation") Fixes: 9eefedd58ae1 ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01rtnetlink: call validate_linkmsg in rtnl_create_linkXin Long
validate_linkmsg() was introduced by commit 1840bb13c22f5b ("[RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK") to validate tb[IFLA_ADDRESS/BROADCAST] for existing links. The same check should also be done for newly created links. This patch adds validate_linkmsg() call in rtnl_create_link(), to avoid the invalid address set when creating some devices like: # ip link add dummy0 type dummy # ip link add link dummy0 name mac0 address 01:02 type macsec Fixes: 0e06877c6fdb ("[RTNETLINK]: rtnl_link: allow specifying initial device address") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01ice: recycle/free all of the fragments from multi-buffer frameMaciej Fijalkowski
The ice driver caches next_to_clean value at the beginning of ice_clean_rx_irq() in order to remember the first buffer that has to be freed/recycled after main Rx processing loop. The end boundary is indicated by first descriptor of frame that Rx processing loop has ended its duties. Note that if mentioned loop ended in the middle of gathering multi-buffer frame, next_to_clean would be pointing to the descriptor in the middle of the frame BUT freeing/recycling stage will stop at the first descriptor. This means that next iteration of ice_clean_rx_irq() will miss the (first_desc, next_to_clean - 1) entries. When running various 9K MTU workloads, such splats were observed: [ 540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 540.787787] #PF: supervisor read access in kernel mode [ 540.793002] #PF: error_code(0x0000) - not-present page [ 540.798218] PGD 0 P4D 0 [ 540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G W 6.3.0-rc7+ #96 [ 540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [ 540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice] [ 540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98 [ 540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282 [ 540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe [ 540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff [ 540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100 [ 540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004 [ 540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040 [ 540.890253] FS: 00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000 [ 540.898465] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0 [ 540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 540.926032] PKRU: 55555554 [ 540.928790] Call Trace: [ 540.931276] <TASK> [ 540.933418] ice_napi_poll+0x4ca/0x6d0 [ice] [ 540.937804] ? __pfx_ice_napi_poll+0x10/0x10 [ice] [ 540.942716] napi_busy_loop+0xd7/0x320 [ 540.946537] xsk_recvmsg+0x143/0x170 [ 540.950178] sock_recvmsg+0x99/0xa0 [ 540.953729] __sys_recvfrom+0xa8/0x120 [ 540.957543] ? do_futex+0xbd/0x1d0 [ 540.961008] ? __x64_sys_futex+0x73/0x1d0 [ 540.965083] __x64_sys_recvfrom+0x20/0x30 [ 540.969155] do_syscall_64+0x38/0x90 [ 540.972796] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 540.977934] RIP: 0033:0x7f6de5f27934 To fix this, set cached_ntc to first_desc so that at the end, when freeing/recycling buffers, descriptors from first to ntc are not missed. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230531154457.3216621-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01net: phy: mxl-gpy: extend interrupt fix to all impacted variantsXu Liang
The interrupt fix in commit 97a89ed101bb should be applied on all variants of GPY2xx PHY and GPY115C. Fixes: 97a89ed101bb ("net: phy: mxl-gpy: disable interrupts on GPY215 by default") Signed-off-by: Xu Liang <lxu@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230531074822.39136-1-lxu@maxlinear.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>