summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-08-11riscv: ensure cpu_ops_sbi is declaredConor Dooley
Sparse complains that cpu_ops_sbi is used undeclared: arch/riscv/kernel/cpu_ops_sbi.c:17:29: warning: symbol 'cpu_ops_sbi' was not declared. Should it be static? Fix the warning by adding cpu_ops_sbi to cpu_ops_sbi.h & including that where used. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220714080235.3853374-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11Merge tag 'net-6.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf, can and netfilter. A little larger than usual but it's all fixes, no late features. It's large partially because of timing, and partially because of follow ups to stuff that got merged a week or so before the merge window and wasn't as widely tested. Maybe the Bluetooth fixes are a little alarming so we'll address that, but the rest seems okay and not scary. Notably we're including a fix for the netfilter Kconfig [1], your WiFi warning [2] and a bluetooth fix which should unblock syzbot [3]. Current release - regressions: - Bluetooth: - don't try to cancel uninitialized works [3] - L2CAP: fix use-after-free caused by l2cap_chan_put - tls: rx: fix device offload after recent rework - devlink: fix UAF on failed reload and leftover locks in mlxsw Current release - new code bugs: - netfilter: - flowtable: fix incorrect Kconfig dependencies [1] - nf_tables: fix crash when nf_trace is enabled - bpf: - use proper target btf when exporting attach_btf_obj_id - arm64: fixes for bpf trampoline support - Bluetooth: - ISO: unlock on error path in iso_sock_setsockopt() - ISO: fix info leak in iso_sock_getsockopt() - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP - ISO: fix memory corruption on iso_pinfo.base - ISO: fix not using the correct QoS - hci_conn: fix updating ISO QoS PHY - phy: dp83867: fix get nvmem cell fail Previous releases - regressions: - wifi: cfg80211: fix validating BSS pointers in __cfg80211_connect_result [2] - atm: bring back zatm uAPI after ATM had been removed - properly fix old bug making bonding ARP monitor mode not being able to work with software devices with lockless Tx - tap: fix null-deref on skb->dev in dev_parse_header_protocol - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some devices and breaks others - netfilter: - nf_tables: many fixes rejecting cross-object linking which may lead to UAFs - nf_tables: fix null deref due to zeroed list head - nf_tables: validate variable length element extension - bgmac: fix a BUG triggered by wrong bytes_compl - bcmgenet: indicate MAC is in charge of PHY PM Previous releases - always broken: - bpf: - fix bad pointer deref in bpf_sys_bpf() injected via test infra - disallow non-builtin bpf programs calling the prog_run command - don't reinit map value in prealloc_lru_pop - fix UAFs during the read of map iterator fd - fix invalidity check for values in sk local storage map - reject sleepable program for non-resched map iterator - mptcp: - move subflow cleanup in mptcp_destroy_common() - do not queue data on closed subflows - virtio_net: fix memory leak inside XDP_TX with mergeable - vsock: fix memory leak when multiple threads try to connect() - rework sk_user_data sharing to prevent psock leaks - geneve: fix TOS inheriting for ipv4 - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel - phy: c45 baset1: do not skip aneg configuration if clock role is not specified - rose: avoid overflow when /proc displays timer information - x25: fix call timeouts in blocking connects - can: mcp251x: fix race condition on receive interrupt - can: j1939: - replace user-reachable WARN_ON_ONCE() with netdev_warn_once() - fix memory leak of skbs in j1939_session_destroy() Misc: - docs: bpf: clarify that many things are not uAPI - seg6: initialize induction variable to first valid array index (to silence clang vs objtool warning) - can: ems_usb: fix clang 14's -Wunaligned-access warning" * tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits) net: atm: bring back zatm uAPI dpaa2-eth: trace the allocated address instead of page struct net: add missing kdoc for struct genl_multicast_group::flags nfp: fix use-after-free in area_cache_get() MAINTAINERS: use my korg address for mt7601u mlxsw: minimal: Fix deadlock in ports creation bonding: fix reference count leak in balance-alb mode net: usb: qmi_wwan: Add support for Cinterion MV32 bpf: Shut up kern_sys_bpf warning. net/tls: Use RCU API to access tls_ctx->netdev tls: rx: device: don't try to copy too much on detach tls: rx: device: bound the frag walk net_sched: cls_route: remove from list when handle is 0 selftests: forwarding: Fix failing tests with old libnet net: refactor bpf_sk_reuseport_detach() net: fix refcount bug in sk_psock_get (2) selftests/bpf: Ensure sleepable program is rejected by hash map iter selftests/bpf: Add write tests for sk local storage map iterator selftests/bpf: Add tests for reading a dangling map iter fd bpf: Only allow sleepable program for resched-able iterator ...
2022-08-11Merge tag 'acpi-5.20-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "These fix up direct references to the fwnode field in struct device and extend ACPI device properties support. Specifics: - Replace direct references to the fwnode field in struct device with dev_fwnode() and device_match_fwnode() (Andy Shevchenko) - Make the ACPI code handling device properties support properties with buffer values (Sakari Ailus)" * tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: property: Fix error handling in acpi_init_properties() ACPI: VIOT: Do not dereference fwnode in struct device ACPI: property: Read buffer properties as integers ACPI: property: Add support for parsing buffer property UUID ACPI: property: Unify integer value reading functions ACPI: property: Switch node property referencing from ifs to a switch ACPI: property: Move property ref argument parsing into a new function ACPI: property: Use acpi_object_type consistently in property ref parsing ACPI: property: Tie data nodes to acpi handles ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
2022-08-11RISC-V: cpu_ops_spinwait.c should include head.hBen Dooks
Running sparse shows cpu_ops_spinwait.c is missing two definitions found in head.h, so include it to stop the following warnings: arch/riscv/kernel/cpu_ops_spinwait.c:15:6: warning: symbol '__cpu_spinwait_stack_pointer' was not declared. Should it be static? arch/riscv/kernel/cpu_ops_spinwait.c:16:6: warning: symbol '__cpu_spinwait_task_pointer' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@sifive.com> Link: https://lore.kernel.org/r/20220713215306.94675-1-ben.dooks@sifive.com Fixes: c78f94f35cf6 ("RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull more iomap updates from Darrick Wong: "In the past 10 days or so I've not heard any ZOMG STOP style complaints about removing ->writepage support from gfs2 or zonefs, so here's the pull request removing them (and the underlying fs iomap support) from the kernel: - Remove iomap_writepage and all callers, since the mm apparently never called the zonefs or gfs2 writepage functions" * tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: remove iomap_writepage zonefs: remove ->writepage gfs2: remove ->writepage gfs2: stop using generic_writepages in gfs2_ail1_start_one
2022-08-11RISC-V: Declare cpu_ops_spinwait in <asm/cpu_ops.h>Ben Dooks
The cpu_ops_spinwait is used in a couple of places in arch/riscv and is causing a sparse warning due to no declaration. Add this to <asm/cpu_ops.h> with the others to fix the following: arch/riscv/kernel/cpu_ops_spinwait.c:16:29: warning: symbol 'cpu_ops_spinwait' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@sifive.com> Link: https://lore.kernel.org/r/20220714071811.187491-1-ben.dooks@sifive.com [Palmer: Drop the extern from cpu_ops.c] Fixes: 2ffc48fc7071 ("RISC-V: Move spinwait booting method to its own config") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-12Merge tag 'drm-misc-next-fixes-2022-08-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next Short summary of fixes pull: * gem: Annotate WW context in error paths * shmem-helper: Add missing vunmap in error paths Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YvOLPpufsvOJHiNY@linux-uq9g
2022-08-11Merge tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "We have a good pile of various fixes and cleanups from Xiubo, Jeff, Luis and others, almost exclusively in the filesystem. Several patches touch files outside of our normal purview to set the stage for bringing in Jeff's long awaited ceph+fscrypt series in the near future. All of them have appropriate acks and sat in linux-next for a while" * tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits) libceph: clean up ceph_osdc_start_request prototype libceph: fix ceph_pagelist_reserve() comment typo ceph: remove useless check for the folio ceph: don't truncate file in atomic_open ceph: make f_bsize always equal to f_frsize ceph: flush the dirty caps immediatelly when quota is approaching libceph: print fsid and epoch with osd id libceph: check pointer before assigned to "c->rules[]" ceph: don't get the inline data for new creating files ceph: update the auth cap when the async create req is forwarded ceph: make change_auth_cap_ses a global symbol ceph: fix incorrect old_size length in ceph_mds_request_args ceph: switch back to testing for NULL folio->private in ceph_dirty_folio ceph: call netfs_subreq_terminated with was_async == false ceph: convert to generic_file_llseek ceph: fix the incorrect comment for the ceph_mds_caps struct ceph: don't leak snap_rwsem in handle_cap_grant ceph: prevent a client from exceeding the MDS maximum xattr size ceph: choose auth MDS for getxattr with the Xs caps ceph: add session already open notify support ...
2022-08-11Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull more kvm updates from Paolo Bonzini: - Xen timer fixes - Documentation formatting fixes - Make rseq selftest compatible with glibc-2.35 - Fix handling of illegal LEA reg, reg - Cleanup creation of debugfs entries - Fix steal time cache handling bug - Fixes for MMIO caching - Optimize computation of number of LBRs - Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits) KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh KVM: VMX: Use proper type-safe functions for vCPU => LBRs helpers KVM: x86: Refresh PMU after writes to MSR_IA32_PERF_CAPABILITIES KVM: selftests: Test all possible "invalid" PERF_CAPABILITIES.LBR_FMT vals KVM: selftests: Use getcpu() instead of sched_getcpu() in rseq_test KVM: selftests: Make rseq compatible with glibc-2.35 KVM: Actually create debugfs in kvm_create_vm() KVM: Pass the name of the VM fd to kvm_create_vm_debugfs() KVM: Get an fd before creating the VM KVM: Shove vcpu stats_id init into kvm_vcpu_init() KVM: Shove vm stats_id init into kvm_create_vm() KVM: x86/mmu: Add sanity check that MMIO SPTE mask doesn't overlap gen KVM: x86/mmu: rename trace function name for asynchronous page fault KVM: x86/xen: Stop Xen timer before changing IRQ KVM: x86/xen: Initialize Xen timer only once KVM: SVM: Disable SEV-ES support if MMIO caching is disable KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change KVM: x86: Tag kvm_mmu_x86_module_init() with __init ...
2022-08-11riscv: dts: starfive: correct number of external interruptsMark Kettenis
The PLIC integrated on the Vic_U7_Core integrated on the StarFive JH7100 SoC actually supports 133 external interrupts. 127 of these are exposed to the outside world; the remainder are used by other devices that are part of the core-complex such as the L2 cache controller. But all 133 interrupts are external interrupts as far as the PLIC is concerned. Fix the property so that the driver can manage these additional interrupts, which is important since the interrupts for the L2 cache controller are enabled by default. Fixes: ec85362fb121 ("RISC-V: Add initial StarFive JH7100 device tree") Signed-off-by: Mark Kettenis <kettenis@openbsd.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220707185529.19509-1-kettenis@openbsd.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11riscv: dts: sifive unmatched: Add PWM controlled LEDsEmil Renner Berthing
This adds the two PWM controlled LEDs to the HiFive Unmatched device tree. D12 is just a regular green diode, but D2 is an RGB diode with 3 PWM inputs controlling the three different colours. Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Pavel Machek <pavel@ucw.cz> Tested-by: Ron Economos <re@w6rz.net> Link: https://lore.kernel.org/r/20220705210143.315151-5-emil.renner.berthing@canonical.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11net: atm: bring back zatm uAPIJakub Kicinski
Jiri reports that linux-atm does not build without this header. Bring it back. It's completely dead code but we can't break the build for user space :( Reported-by: Jiri Slaby <jirislaby@kernel.org> Fixes: 052e1f01bfae ("net: atm: remove support for ZeitNet ZN122x ATM devices") Link: https://lore.kernel.org/all/8576aef3-37e4-8bae-bab5-08f82a78efd3@kernel.org/ Link: https://lore.kernel.org/r/20220810164547.484378-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11dpaa2-eth: trace the allocated address instead of page structChen Lin
We should trace the allocated address instead of page struct. Fixes: 27c874867c4e ("dpaa2-eth: Use a single page per Rx buffer") Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20220811151651.3327-1-chen45464546@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11Merge branch 'acpi-properties'Rafael J. Wysocki
Merge changes adding support for device properties with buffer values to the ACPI device properties handling code. * acpi-properties: ACPI: property: Fix error handling in acpi_init_properties() ACPI: property: Read buffer properties as integers ACPI: property: Add support for parsing buffer property UUID ACPI: property: Unify integer value reading functions ACPI: property: Switch node property referencing from ifs to a switch ACPI: property: Move property ref argument parsing into a new function ACPI: property: Use acpi_object_type consistently in property ref parsing ACPI: property: Tie data nodes to acpi handles ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
2022-08-11riscv/purgatory: Omit use of bin2cMasahiro Yamada
The .incbin assembler directive is much faster than bin2c + $(CC). Do similar refactoring as in commit 4c0f032d4963 ("s390/purgatory: Omit use of bin2c"). Please note the .quad directive matches to size_t in C (both 8 byte) because the purgatory is compiled only for the 64-bit kernel. (KEXEC_FILE depends on 64BIT). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20220625223438.835408-2-masahiroy@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11riscv/purgatory: hard-code obj-y in MakefileMasahiro Yamada
The purgatory/ directory is entirely guarded in arch/riscv/Kbuild. CONFIG_ARCH_HAS_KEXEC_PURGATORY is bool type. $(CONFIG_ARCH_HAS_KEXEC_PURGATORY) is always 'y' when Kbuild visits this Makefile for building. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20220625223438.835408-1-masahiroy@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11net: add missing kdoc for struct genl_multicast_group::flagsJakub Kicinski
Multicast group flags were added in commit 4d54cc32112d ("mptcp: avoid lock_fast usage in accept path"), but it missed adding the kdoc. Mention which flags go into that field, and do the same for op structs. Link: https://lore.kernel.org/r/20220809232012.403730-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11Merge tag 'input-for-v5.20-rc0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - changes to input core to properly queue synthetic events (such as autorepeat) and to release multitouch contacts when an input device is inhibited or suspended - reworked quirk handling in i8042 driver that consolidates multiple DMI tables into one and adds several quirks for TUXEDO line of laptops - update to mt6779 keypad to better reflect organization of the hardware - changes to mtk-pmic-keys driver preparing it to handle more variants - facelift of adp5588-keys driver - improvements to iqs7222 driver - adjustments to various DT binding documents for input devices - other assorted driver fixes. * tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (54 commits) Input: adc-joystick - fix ordering in adc_joystick_probe() dt-bindings: input: ariel-pwrbutton: use spi-peripheral-props.yaml Input: deactivate MT slots when inhibiting or suspending devices Input: properly queue synthetic events dt-bindings: input: iqs7222: Use central 'linux,code' definition Input: i8042 - add dritek quirk for Acer Aspire One AO532 dt-bindings: input: gpio-keys: accept also interrupt-extended dt-bindings: input: gpio-keys: reference input.yaml and document properties dt-bindings: input: gpio-keys: enforce node names to match all properties dt-bindings: input: Convert adc-keys to DT schema dt-bindings: input: Centralize 'linux,input-type' definition dt-bindings: input: Use common 'linux,keycodes' definition dt-bindings: input: Centralize 'linux,code' definition dt-bindings: input: Increase maximum keycode value to 0x2ff Input: mt6779-keypad - implement row/column selection Input: mt6779-keypad - match hardware matrix organization Input: i8042 - add additional TUXEDO devices to i8042 quirk tables Input: goodix - switch use of acpi_gpio_get_*_resource() APIs Input: i8042 - add TUXEDO devices to i8042 quirk tables Input: i8042 - add debug output for quirks ...
2022-08-11RISC-V: fixups to work with crash toolPalmer Dabbelt
A handful of fixes to our kexec/crash kernel support that allow crash tool to function. Link: https://lore.kernel.org/r/mhng-f5fdaa37-e99a-4214-a297-ec81f0fed0c1@palmer-mbp2014 * commit 'f9293ad46d8ba9909187a37b7215324420ad4596': RISC-V: Add modules to virtual kernel memory layout dump RISC-V: Fixup schedule out issue in machine_crash_shutdown() RISC-V: Fixup get incorrect user mode PC for kernel mode regs RISC-V: kexec: Fixup use of smp_processor_id() in preemptible context
2022-08-11nfp: fix use-after-free in area_cache_get()Jialiang Wang
area_cache_get() is used to distribute cache->area and set cache->id, and if cache->id is not 0 and cache->area->kref refcount is 0, it will release the cache->area by nfp_cpp_area_release(). area_cache_get() set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire(). But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is is already set but the refcount is not increased as expected. At this time, calling the nfp_cpp_area_release() will cause use-after-free. To avoid the use-after-free, set cache->id after area_init() and nfp_cpp_area_acquire() complete successfully. Note: This vulnerability is triggerable by providing emulated device equipped with specified configuration. BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760) Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1 Call Trace: <TASK> nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760) area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884) Allocated by task 1: nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303) nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802) nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230) nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215) nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744) Freed by task 1: kfree (mm/slub.c:4562) area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873) nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973) nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48) Signed-off-by: Jialiang Wang <wangjialiang0806@163.com> Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11MAINTAINERS: use my korg address for mt7601uJakub Kicinski
Change my address for mt7601u to the main one. Link: https://lore.kernel.org/r/20220809233843.408004-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11mlxsw: minimal: Fix deadlock in ports creationVadim Pasternak
Drop devl_lock() / devl_unlock() from ports creation and removal flows since the devlink instance lock is now taken by mlxsw_core. Fixes: 72a4c8c94efa ("mlxsw: convert driver to use unlocked devlink API during init/fini") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/f4afce5ab0318617f3866b85274be52542d59b32.1660211614.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11RISC-V: Add modules to virtual kernel memory layout dumpXianting Tian
Modules always live before the kernel, MODULES_END is fixed but MODULES_VADDR isn't fixed, it depends on the kernel size. Let's add it to virtual kernel memory layout dump. As MODULES is only defined for CONFIG_64BIT, so we dump it when CONFIG_64BIT=y. eg, MODULES_VADDR - MODULES_END 0xffffffff01133000 - 0xffffffff80000000 Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20220811074150.3020189-5-xianting.tian@linux.alibaba.com Cc: stable@vger.kernel.org Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11riscv: traps_misaligned: do not duplicate stringifyKrzysztof Kozlowski
Use existing stringify macro from the kernel headers. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20220623112905.253157-1-krzysztof.kozlowski@linaro.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11RISC-V: Fixup schedule out issue in machine_crash_shutdown()Xianting Tian
Current task of executing crash kexec will be schedule out when panic is triggered by RCU Stall, as it needs to wait rcu completion. It lead to inability to enter the crash system. The implementation of machine_crash_shutdown() is non-standard for RISC-V according to other Arch's implementation(eg, x86, arm64), we need to send IPI to stop secondary harts. [224521.877268] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [224521.883471] rcu: 0-...0: (3 GPs behind) idle=cfa/0/0x1 softirq=3968793/3968793 fqs=2495 [224521.891742] (detected by 2, t=5255 jiffies, g=60855593, q=328) [224521.897754] Task dump for CPU 0: [224521.901074] task:swapper/0 state:R running task stack: 0 pid: 0 ppid: 0 flags:0x00000008 [224521.911090] Call Trace: [224521.913638] [<ffffffe000c432de>] __schedule+0x208/0x5ea [224521.918957] Kernel panic - not syncing: RCU Stall [224521.923773] bad: scheduling from the idle thread! [224521.928571] CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Tainted: G O 5.10.113-yocto-standard #1 [224521.938658] Call Trace: [224521.941200] [<ffffffe00020395c>] walk_stackframe+0x0/0xaa [224521.946689] [<ffffffe000c34f8e>] show_stack+0x32/0x3e [224521.951830] [<ffffffe000c39020>] dump_stack_lvl+0x7e/0xa2 [224521.957317] [<ffffffe000c39058>] dump_stack+0x14/0x1c [224521.962459] [<ffffffe000243884>] dequeue_task_idle+0x2c/0x40 [224521.968207] [<ffffffe000c434f4>] __schedule+0x41e/0x5ea [224521.973520] [<ffffffe000c43826>] schedule+0x34/0xe4 [224521.978487] [<ffffffe000c46cae>] schedule_timeout+0xc6/0x170 [224521.984234] [<ffffffe000c4491e>] wait_for_completion+0x98/0xf2 [224521.990157] [<ffffffe00026d9e2>] __wait_rcu_gp+0x148/0x14a [224521.995733] [<ffffffe0002761c4>] synchronize_rcu+0x5c/0x66 [224522.001307] [<ffffffe00026f1a6>] rcu_sync_enter+0x54/0xe6 [224522.006795] [<ffffffe00025a436>] percpu_down_write+0x32/0x11c [224522.012629] [<ffffffe000c4266a>] _cpu_down+0x92/0x21a [224522.017771] [<ffffffe000219a0a>] smp_shutdown_nonboot_cpus+0x90/0x118 [224522.024299] [<ffffffe00020701e>] machine_crash_shutdown+0x30/0x4a [224522.030483] [<ffffffe00029a3f8>] __crash_kexec+0x62/0xa6 [224522.035884] [<ffffffe000c3515e>] panic+0xfa/0x2b6 [224522.040678] [<ffffffe0002772be>] rcu_sched_clock_irq+0xc26/0xcb8 [224522.046774] [<ffffffe00027fc7a>] update_process_times+0x62/0x8a [224522.052785] [<ffffffe00028d522>] tick_sched_timer+0x9e/0x102 [224522.058533] [<ffffffe000280c3a>] __hrtimer_run_queues+0x16a/0x318 [224522.064716] [<ffffffe0002812ec>] hrtimer_interrupt+0xd4/0x228 [224522.070551] [<ffffffe0009a69b6>] riscv_timer_interrupt+0x3c/0x48 [224522.076646] [<ffffffe000268f8c>] handle_percpu_devid_irq+0xb0/0x24c [224522.083004] [<ffffffe00026428e>] __handle_domain_irq+0xa8/0x122 [224522.089014] [<ffffffe00062f954>] riscv_intc_irq+0x38/0x60 [224522.094501] [<ffffffe000201bd4>] ret_from_exception+0x0/0xc [224522.100161] [<ffffffe000c42146>] rcu_eqs_enter.constprop.0+0x8c/0xb8 With the patch, it can enter crash system when RCU Stall occur. Fixes: e53d28180d4d ("RISC-V: Add kdump support") Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20220811074150.3020189-4-xianting.tian@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11RISC-V: Fixup get incorrect user mode PC for kernel mode regsXianting Tian
When use 'echo c > /proc/sysrq-trigger' to trigger kdump, riscv_crash_save_regs() will be called to save regs for vmcore, we found "epc" value 00ffffffa5537400 is not a valid kernel virtual address, but is a user virtual address. Other regs(eg, ra, sp, gp...) are correct kernel virtual address. Actually 0x00ffffffb0dd9400 is the user mode PC of 'PID: 113 Comm: sh', which is saved in the task's stack. [ 21.201701] CPU: 0 PID: 113 Comm: sh Kdump: loaded Not tainted 5.18.9 #45 [ 21.201979] Hardware name: riscv-virtio,qemu (DT) [ 21.202160] epc : 00ffffffa5537400 ra : ffffffff80088640 sp : ff20000010333b90 [ 21.202435] gp : ffffffff810dde38 tp : ff6000000226c200 t0 : ffffffff8032be7c [ 21.202707] t1 : 0720072007200720 t2 : 30203a7375746174 s0 : ff20000010333cf0 [ 21.202973] s1 : 0000000000000000 a0 : ff20000010333b98 a1 : 0000000000000001 [ 21.203243] a2 : 0000000000000010 a3 : 0000000000000000 a4 : 28c8f0aeffea4e00 [ 21.203519] a5 : 28c8f0aeffea4e00 a6 : 0000000000000009 a7 : ffffffff8035c9b8 [ 21.203794] s2 : ffffffff810df0a8 s3 : ffffffff810df718 s4 : ff20000010333b98 [ 21.204062] s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80c4a468 [ 21.204331] s8 : 00ffffffef451410 s9 : 0000000000000007 s10: 00aaaaaac0510700 [ 21.204606] s11: 0000000000000001 t3 : ff60000001218f00 t4 : ff60000001218f00 [ 21.204876] t5 : ff60000001218000 t6 : ff200000103338b8 [ 21.205079] status: 0000000200000020 badaddr: 0000000000000000 cause: 0000000000000008 With the incorrect PC, the backtrace showed by crash tool as below, the first stack frame is abnormal, crash> bt PID: 113 TASK: ff60000002269600 CPU: 0 COMMAND: "sh" #0 [ff2000001039bb90] __efistub_.Ldebug_info0 at 00ffffffa5537400 <-- Abnormal #1 [ff2000001039bcf0] panic at ffffffff806578ba #2 [ff2000001039bd50] sysrq_reset_seq_param_set at ffffffff8038c030 #3 [ff2000001039bda0] __handle_sysrq at ffffffff8038c5f8 #4 [ff2000001039be00] write_sysrq_trigger at ffffffff8038cad8 #5 [ff2000001039be20] proc_reg_write at ffffffff801b7edc #6 [ff2000001039be40] vfs_write at ffffffff80152ba6 #7 [ff2000001039be80] ksys_write at ffffffff80152ece #8 [ff2000001039bed0] sys_write at ffffffff80152f46 With the patch, we can get current kernel mode PC, the output as below, [ 17.607658] CPU: 0 PID: 113 Comm: sh Kdump: loaded Not tainted 5.18.9 #42 [ 17.607937] Hardware name: riscv-virtio,qemu (DT) [ 17.608150] epc : ffffffff800078f8 ra : ffffffff8008862c sp : ff20000010333b90 [ 17.608441] gp : ffffffff810dde38 tp : ff6000000226c200 t0 : ffffffff8032be68 [ 17.608741] t1 : 0720072007200720 t2 : 666666666666663c s0 : ff20000010333cf0 [ 17.609025] s1 : 0000000000000000 a0 : ff20000010333b98 a1 : 0000000000000001 [ 17.609320] a2 : 0000000000000010 a3 : 0000000000000000 a4 : 0000000000000000 [ 17.609601] a5 : ff60000001c78000 a6 : 000000000000003c a7 : ffffffff8035c9a4 [ 17.609894] s2 : ffffffff810df0a8 s3 : ffffffff810df718 s4 : ff20000010333b98 [ 17.610186] s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80c4a468 [ 17.610469] s8 : 00ffffffca281410 s9 : 0000000000000007 s10: 00aaaaaab5bb6700 [ 17.610755] s11: 0000000000000001 t3 : ff60000001218f00 t4 : ff60000001218f00 [ 17.611041] t5 : ff60000001218000 t6 : ff20000010333988 [ 17.611255] status: 0000000200000020 badaddr: 0000000000000000 cause: 0000000000000008 With the correct PC, the backtrace showed by crash tool as below, crash> bt PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh" #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8 <--- Normal #1 [ff20000010333cf0] panic at ffffffff806578c6 #2 [ff20000010333d50] sysrq_reset_seq_param_set at ffffffff8038c03c #3 [ff20000010333da0] __handle_sysrq at ffffffff8038c604 #4 [ff20000010333e00] write_sysrq_trigger at ffffffff8038cae4 #5 [ff20000010333e20] proc_reg_write at ffffffff801b7ee8 #6 [ff20000010333e40] vfs_write at ffffffff80152bb2 #7 [ff20000010333e80] ksys_write at ffffffff80152eda #8 [ff20000010333ed0] sys_write at ffffffff80152f52 Fixes: e53d28180d4d ("RISC-V: Add kdump support") Co-developed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20220811074150.3020189-3-xianting.tian@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11RISC-V: kexec: Fixup use of smp_processor_id() in preemptible contextXianting Tian
Use __smp_processor_id() to avoid check the preemption context when CONFIG_DEBUG_PREEMPT enabled, as we will enter crash kernel and no return. Without the patch, [ 103.781044] sysrq: Trigger a crash [ 103.784625] Kernel panic - not syncing: sysrq triggered crash [ 103.837634] CPU1: off [ 103.889668] CPU2: off [ 103.933479] CPU3: off [ 103.939424] Starting crashdump kernel... [ 103.943442] BUG: using smp_processor_id() in preemptible [00000000] code: sh/346 [ 103.950884] caller is debug_smp_processor_id+0x1c/0x26 [ 103.956051] CPU: 0 PID: 346 Comm: sh Kdump: loaded Not tainted 5.10.113-00002-gce03f03bf4ec-dirty #149 [ 103.965355] Call Trace: [ 103.967805] [<ffffffe00020372a>] walk_stackframe+0x0/0xa2 [ 103.973206] [<ffffffe000bcf1f4>] show_stack+0x32/0x3e [ 103.978258] [<ffffffe000bd382a>] dump_stack_lvl+0x72/0x8e [ 103.983655] [<ffffffe000bd385a>] dump_stack+0x14/0x1c [ 103.988705] [<ffffffe000bdc8fe>] check_preemption_disabled+0x9e/0xaa [ 103.995057] [<ffffffe000bdc926>] debug_smp_processor_id+0x1c/0x26 [ 104.001150] [<ffffffe000206c64>] machine_kexec+0x22/0xd0 [ 104.006463] [<ffffffe000291a7e>] __crash_kexec+0x6a/0xa4 [ 104.011774] [<ffffffe000bcf3fa>] panic+0xfc/0x2b0 [ 104.016480] [<ffffffe000656ca4>] sysrq_reset_seq_param_set+0x0/0x70 [ 104.022745] [<ffffffe000657310>] __handle_sysrq+0x8c/0x154 [ 104.028229] [<ffffffe0006577e8>] write_sysrq_trigger+0x5a/0x6a [ 104.034061] [<ffffffe0003d90e0>] proc_reg_write+0x58/0xd4 [ 104.039459] [<ffffffe00036cff4>] vfs_write+0x7e/0x254 [ 104.044509] [<ffffffe00036d2f6>] ksys_write+0x58/0xbe [ 104.049558] [<ffffffe00036d36a>] sys_write+0xe/0x16 [ 104.054434] [<ffffffe000201b9a>] ret_from_syscall+0x0/0x2 [ 104.067863] Will call new kernel at ecc00000 from hart id 0 [ 104.074939] FDT image at fc5ee000 [ 104.079523] Bye... With the patch we can got clear output, [ 67.740553] sysrq: Trigger a crash [ 67.744166] Kernel panic - not syncing: sysrq triggered crash [ 67.809123] CPU1: off [ 67.865210] CPU2: off [ 67.909075] CPU3: off [ 67.919123] Starting crashdump kernel... [ 67.924900] Will call new kernel at ecc00000 from hart id 0 [ 67.932045] FDT image at fc5ee000 [ 67.935560] Bye... Fixes: 0e105f1d0037 ("riscv: use hart id instead of cpu id on machine_kexec") Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20220811074150.3020189-2-xianting.tian@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11bonding: fix reference count leak in balance-alb modeJay Vosburgh
Commit d5410ac7b0ba ("net:bonding:support balance-alb interface with vlan to bridge") introduced a reference count leak by not releasing the reference acquired by ip_dev_find(). Remedy this by insuring the reference is released. Fixes: d5410ac7b0ba ("net:bonding:support balance-alb interface with vlan to bridge") Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/26758.1660194413@famine Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11Revert "Makefile.extrawarn: re-enable -Wformat for clang"Linus Torvalds
This reverts commit 258fafcd0683d9ccfa524129d489948ab3ddc24c. The clang -Wformat warning is terminally broken, and the clang people can't seem to get their act together. This test program causes a warning with clang: #include <stdio.h> int main(int argc, char **argv) { printf("%hhu\n", 'a'); } resulting in t.c:5:19: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat] printf("%hhu\n", 'a'); ~~~~ ^~~ %d and apparently clang people consider that a feature, because they don't want to face the reality of how either C character constants, C arithmetic, and C varargs functions work. The rest of the world just shakes their head at that kind of incompetence, and turns off -Wformat for clang again. And no, the "you should use a pointless cast to shut this up" is not a valid answer. That warning should not exist in the first place, or at least be optinal with some "-Wformat-me-harder" kind of option. [ Admittedly, there's also very little reason to *ever* use '%hh[ud]' in C, but what little reason there is is entirely about 'I want to see only the low 8 bits of the argument'. So I would suggest nobody ever use that format in the first place, but if they do, the clang behavious is simply always wrong. Because '%hhu' takes an 'int'. It's that simple. ] Reported-by: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-11dm bufio: fix some cases where the code sleeps with spinlock heldMikulas Patocka
Commit b32d45824aa7 ("dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flag") added a "NO_SLEEP" mode, it replaces a mutex with a spinlock, and it is only usable when the device is in read-only mode (because the write path may be sleeping while holding the dm_bufio_client lock). However, there are still two points where the code could sleep even in read-only mode. One is in __get_unclaimed_buffer -> __make_buffer_clean. The other is in __try_evict_buffer -> __make_buffer_clean. These functions will call __make_buffer_clean which sleeps if the buffer is being read. Fix these cases so that if c->no_sleep is set __make_buffer_clean will not be called and the buffer will be skipped instead. Fixes: b32d45824aa7 ("dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flag") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-08-11arch/riscv: add Zihintpause supportDao Lu
Implement support for the ZiHintPause extension. The PAUSE instruction is a HINT that indicates the current hart’s rate of instruction retirement should be temporarily reduced or paused. Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Dao Lu <daolu@rivosinc.com> [Palmer: Some minor merge conflicts.] Link: https://lore.kernel.org/all/20220620201530.3929352-1-daolu@rivosinc.com/ Link: https://lore.kernel.org/all/20220811053356.17375-1-palmer@rivosinc.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11net: usb: qmi_wwan: Add support for Cinterion MV32Slark Xiao
There are 2 models for MV32 serials. MV32-W-A is designed based on Qualcomm SDX62 chip, and MV32-W-B is designed based on Qualcomm SDX65 chip. So we use 2 different PID to separate it. Test evidence as below: T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1e2d ProdID=00f3 Rev=05.04 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00F3 USB Mobile Broadband S: SerialNumber=d7b4be8d C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#= 10 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1e2d ProdID=00f4 Rev=05.04 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00F4 USB Mobile Broadband S: SerialNumber=d095087d C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option Signed-off-by: Slark Xiao <slark_xiao@163.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20220810014521.9383-1-slark_xiao@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Pull in bpf to silence a false positive warning. * git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Shut up kern_sys_bpf warning. Link: https://lore.kernel.org/r/CAADnVQK589CZN1Q9w8huJqkEyEed+ZMTWqcpA1Rm2CjN3a4XoQ@mail.gmail.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11vdpa/mlx5: Fix possible uninitialized return valueEli Cohen
Initialize err local variable to return -EAGAIN if the asid cannot be found thus avoiding returning uninitialized value. Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220811134010.952291-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11Merge 'irq/loongarch', 'pci/ctrl/loongson' and 'pci/header-cleanup-immutable'Huacai Chen
LoongArch architecture changes for 5.20 depend on the irqchip and pci changes to work, so merge them to create a base.
2022-08-11vdpa_sim_blk: add support for discard and write-zeroesStefano Garzarella
Expose VIRTIO_BLK_F_DISCARD and VIRTIO_BLK_F_WRITE_ZEROES features to the drivers and handle VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES requests checking ranges and flags. The simulator behaves like a ramdisk, so for VIRTIO_BLK_F_DISCARD does nothing, while for VIRTIO_BLK_T_WRITE_ZEROES sets to 0 the specified region. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-5-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSHStefano Garzarella
The simulator behaves like a ramdisk, so we don't have to do anything when a VIRTIO_BLK_T_FLUSH request is received, but it could be useful to test driver behavior. Let's expose the VIRTIO_BLK_F_FLUSH feature to inform the driver that we support the flush command. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-4-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vdpa_sim_blk: make vdpasim_blk_check_range usable by other requestsStefano Garzarella
Next patches will add handling of other requests, where will be useful to reuse vdpasim_blk_check_range(). So let's make it more generic by adding the `max_sectors` parameter, since different requests allow different numbers of maximum sectors. Let's also print the messages directly in vdpasim_blk_check_range() to avoid duplicate prints. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vdpa_sim_blk: check if sector is 0 for commands other than read or writeStefano Garzarella
VIRTIO spec states: "The sector number indicates the offset (multiplied by 512) where the read or write is to occur. This field is unused and set to 0 for commands other than read or write." Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220811083632.77525-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vdpa_sim: Implement suspend vdpa opEugenio Pérez
Implement suspend operation for vdpa_sim devices, so vhost-vdpa will offer that backend feature and userspace can effectively suspend the device. This is a must before get virtqueue indexes (base) for live migration, since the device could modify them after userland gets them. There are individual ways to perform that action for some devices (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no way to perform it for any vhost device (and, in particular, vhost-vdpa). Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220810171512.2343333-5-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vhost-vdpa: uAPI to suspend the deviceEugenio Pérez
The ioctl adds support for suspending the device from userspace. This is a must before getting virtqueue indexes (base) for live migration, since the device could modify them after userland gets them. There are individual ways to perform that action for some devices (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no way to perform it for any vhost device (and, in particular, vhost-vdpa). After a successful return of the ioctl call the device must not process more virtqueue descriptors. The device can answer to read or writes of config fields as if it were not suspended. In particular, writing to "queue_enable" with a value of 1 will not make the device start processing buffers of the virtqueue. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220810171512.2343333-4-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vhost-vdpa: introduce SUSPEND backend feature bitEugenio Pérez
Userland knows if it can suspend the device or not by checking this feature bit. It's only offered if the vdpa driver backend implements the suspend() operation callback, and to offer it or userland to ack it if the backend does not offer that callback is an error. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220810171512.2343333-3-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vdpa: Add suspend operationEugenio Pérez
This operation is optional: It it's not implemented, backend feature bit will not be exposed. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220810171512.2343333-2-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio-blk: Avoid use-after-free on suspend/resumeShigeru Yoshida
hctx->user_data is set to vq in virtblk_init_hctx(). However, vq is freed on suspend and reallocated on resume. So, hctx->user_data is invalid after resume, and it will cause use-after-free accessing which will result in the kernel crash something like below: [ 22.428391] Call Trace: [ 22.428899] <TASK> [ 22.429339] virtqueue_add_split+0x3eb/0x620 [ 22.430035] ? __blk_mq_alloc_requests+0x17f/0x2d0 [ 22.430789] ? kvm_clock_get_cycles+0x14/0x30 [ 22.431496] virtqueue_add_sgs+0xad/0xd0 [ 22.432108] virtblk_add_req+0xe8/0x150 [ 22.432692] virtio_queue_rqs+0xeb/0x210 [ 22.433330] blk_mq_flush_plug_list+0x1b8/0x280 [ 22.434059] __blk_flush_plug+0xe1/0x140 [ 22.434853] blk_finish_plug+0x20/0x40 [ 22.435512] read_pages+0x20a/0x2e0 [ 22.436063] ? folio_add_lru+0x62/0xa0 [ 22.436652] page_cache_ra_unbounded+0x112/0x160 [ 22.437365] filemap_get_pages+0xe1/0x5b0 [ 22.437964] ? context_to_sid+0x70/0x100 [ 22.438580] ? sidtab_context_to_sid+0x32/0x400 [ 22.439979] filemap_read+0xcd/0x3d0 [ 22.440917] xfs_file_buffered_read+0x4a/0xc0 [ 22.441984] xfs_file_read_iter+0x65/0xd0 [ 22.442970] __kernel_read+0x160/0x2e0 [ 22.443921] bprm_execve+0x21b/0x640 [ 22.444809] do_execveat_common.isra.0+0x1a8/0x220 [ 22.446008] __x64_sys_execve+0x2d/0x40 [ 22.446920] do_syscall_64+0x37/0x90 [ 22.447773] entry_SYSCALL_64_after_hwframe+0x63/0xcd This patch fixes this issue by getting vq from vblk, and removes virtblk_init_hctx(). Fixes: 4e0400525691 ("virtio-blk: support polling I/O") Cc: "Suwan Kim" <suwan.kim027@gmail.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Message-Id: <20220810160948.959781-1-syoshida@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_vdpa: support the arg sizes of find_vqs()Bo Liu
Virtio vdpa support the new parameter sizes of find_vqs(). Signed-off-by: Bo Liu <liubo03@inspur.com> Message-Id: <20220810085151.7251-1-liubo03@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vhost-vdpa: Call ida_simple_remove() when failedBo Liu
In function vhost_vdpa_probe(), when code execution fails, we should call ida_simple_remove() to free ida. Signed-off-by: Bo Liu <liubo03@inspur.com> Message-Id: <20220805091254.20026-1-liubo03@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vDPA: fix 'cast to restricted le16' warnings in vdpa.cZhu Lingshan
This commit fixes spars warnings: cast to restricted __le16 in function vdpa_dev_net_config_fill() and vdpa_fill_stats_rec() Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Message-Id: <20220722115309.82746-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vDPA: !FEATURES_OK should not block querying device config spaceZhu Lingshan
Users may want to query the config space of a vDPA device, to choose a appropriate one for a certain guest. This means the users need to read the config space before FEATURES_OK, and the existence of config space contents does not depend on FEATURES_OK. The spec says: The device MUST allow reading of any device-specific configuration field before FEATURES_OK is set by the driver. This includes fields which are conditional on feature bits, as long as those feature bits are offered by the device. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220722115309.82746-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vDPA/ifcvf: support userspace to query features and MQ of a management deviceZhu Lingshan
Adapting to current netlink interfaces, this commit allows userspace to query feature bits and MQ capability of a management device. Currently both the vDPA device and the management device are the VF itself, thus this ifcvf should initialize the virtio capabilities in probe() before setting up the struct vdpa_mgmt_dev. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220722115309.82746-3-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11vDPA/ifcvf: get_config_size should return a value no greater than dev ↵Zhu Lingshan
implementation Drivers must not access a BAR outside the capability length, and for a virtio device, ifcvf driver should not report any non-standard capability contents to the upper layers. Function ifcvf_get_config_size() is introduced here to return a safe value of the device config capability size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220722115309.82746-2-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>