summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-25ext4: init error handle resource before init group descriptorsYe Bin
Now, 's_err_report' timer is init after ext4_group_desc_init() when fill super. Theoretically, ext4_group_desc_init() may access to error handle as follows: __ext4_fill_super ext4_group_desc_init ext4_check_descriptors ext4_get_group_desc ext4_error ext4_handle_error ext4_commit_super ext4_update_super if (!es->s_error_count) mod_timer(&sbi->s_err_report, jiffies + 24*60*60*HZ); --> Accessing Uninitialized Variables timer_setup(&sbi->s_err_report, print_daily_error_info, 0); Maybe above issue is just theoretical, as ext4_check_descriptors() didn't judge 'gpd' which get from ext4_get_group_desc(), if access to error handle ext4_get_group_desc() will return NULL, then will trigger null-ptr-deref in ext4_check_descriptors(). However, from the perspective of pure code, it is better to initialize resource that may need to be used first. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230119013711.86680-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-02-25Merge tag 'vfio-v6.3-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Remove redundant resource check in vfio-platform (Angus Chen) - Use GFP_KERNEL_ACCOUNT for persistent userspace allocations, allowing removal of arbitrary kernel limits in favor of cgroup control (Yishai Hadas) - mdev tidy-ups, including removing the module-only build restriction for sample drivers, Kconfig changes to select mdev support, documentation movement to keep sample driver usage instructions with sample drivers rather than with API docs, remove references to out-of-tree drivers in docs (Christoph Hellwig) - Fix collateral breakages from mdev Kconfig changes (Arnd Bergmann) - Make mlx5 migration support match device support, improve source and target flows to improve pre-copy support and reduce downtime (Yishai Hadas) - Convert additional mdev sysfs case to use sysfs_emit() (Bo Liu) - Resolve copy-paste error in mdev mbochs sample driver Kconfig (Ye Xingchen) - Avoid propagating missing reset error in vfio-platform if reset requirement is relaxed by module option (Tomasz Duszynski) - Range size fixes in mlx5 variant driver for missed last byte and stricter range calculation (Yishai Hadas) - Fixes to suspended vaddr support and locked_vm accounting, excluding mdev configurations from the former due to potential to indefinitely block kernel threads, fix underflow and restore locked_vm on new mm (Steve Sistare) - Update outdated vfio documentation due to new IOMMUFD interfaces in recent kernels (Yi Liu) - Resolve deadlock between group_lock and kvm_lock, finally (Matthew Rosato) - Fix NULL pointer in group initialization error path with IOMMUFD (Yan Zhao) * tag 'vfio-v6.3-rc1' of https://github.com/awilliam/linux-vfio: (32 commits) vfio: Fix NULL pointer dereference caused by uninitialized group->iommufd docs: vfio: Update vfio.rst per latest interfaces vfio: Update the kdoc for vfio_device_ops vfio/mlx5: Fix range size calculation upon tracker creation vfio: no need to pass kvm pointer during device open vfio: fix deadlock between group lock and kvm lock vfio: revert "iommu driver notify callback" vfio/type1: revert "implement notify callback" vfio/type1: revert "block on invalid vaddr" vfio/type1: restore locked_vm vfio/type1: track locked_vm per dma vfio/type1: prevent underflow of locked_vm via exec() vfio/type1: exclude mdevs from VFIO_UPDATE_VADDR vfio: platform: ignore missing reset if disabled at module init vfio/mlx5: Improve the target side flow to reduce downtime vfio/mlx5: Improve the source side flow upon pre_copy vfio/mlx5: Check whether VF is migratable samples: fix the prompt about SAMPLE_VFIO_MDEV_MBOCHS vfio/mdev: Use sysfs_emit() to instead of sprintf() vfio-mdev: add back CONFIG_VFIO dependency ...
2023-02-25Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - device feature provisioning in ifcvf, mlx5 - new SolidNET driver - support for zoned block device in virtio blk - numa support in virtio pmem - VIRTIO_F_RING_RESET support in vhost-net - more debugfs entries in mlx5 - resume support in vdpa - completion batching in virtio blk - cleanup of dma api use in vdpa - now simulating more features in vdpa-sim - documentation, features, fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits) vdpa/mlx5: support device features provisioning vdpa/mlx5: make MTU/STATUS presence conditional on feature bits vdpa: validate device feature provisioning against supported class vdpa: validate provisioned device features against specified attribute vdpa: conditionally read STATUS in config space vdpa: fix improper error message when adding vdpa dev vdpa/mlx5: Initialize CVQ iotlb spinlock vdpa/mlx5: Don't clear mr struct on destroy MR vdpa/mlx5: Directly assign memory key tools/virtio: enable to build with retpoline vringh: fix a typo in comments for vringh_kiov vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails scsi: virtio_scsi: fix handling of kmalloc failure vdpa: Fix a couple of spelling mistakes in some messages vhost-net: support VIRTIO_F_RING_RESET vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit vdpa: mlx5: support per virtqueue dma device vdpa: set dma mask for vDPA device virtio-vdpa: support per vq dma device vdpa: introduce get_vq_dma_device() ...
2023-02-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company) - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer RISC-V: - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE - Correctly place the guest in S-mode after redirecting a trap to the guest - Redirect illegal instruction traps to guest - SBI PMU support for guest s390: - Sort out confusion between virtual and physical addresses, which currently are the same on s390 - A new ioctl that performs cmpxchg on guest memory - A few fixes x86: - Change tdp_mmu to a read-only parameter - Separate TDP and shadow MMU page fault paths - Enable Hyper-V invariant TSC control - Fix a variety of APICv and AVIC bugs, some of them real-world, some of them affecting architecurally legal but unlikely to happen in practice - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Advertise support for Intel's new fast REP string features - Fix a double-shootdown issue in the emergency reboot code - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM similar treatment to VMX - Update Xen's TSC info CPUID sub-leaves as appropriate - Add support for Hyper-V's extended hypercalls, where "support" at this point is just forwarding the hypercalls to userspace - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and MSR filters - One-off fixes and cleanups - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - Add support for filtering PMU events using a mask. If userspace wants to restrict heavily what events the guest can use, it can now do so without needing an absurd number of filter entries - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel Sapphire Rapids - Fix a mostly benign overflow bug in SEV's send|receive_update_data() - Move several SVM-specific flags into vcpu_svm x86 Intel: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps Generic: - Clean up the hardware enable and initialization flow, which was scattered around multiple arch-specific hooks. Instead, just let the arch code call into generic code. Both x86 and ARM should benefit from not having to fight common KVM code's notion of how to do initialization - Account allocations in generic kvm_arch_alloc_vm() - Fix a memory leak if coalesced MMIO unregistration fails selftests: - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct hypercall instruction instead of relying on KVM to patch in VMMCALL - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits) KVM: SVM: hyper-v: placate modpost section mismatch error KVM: x86/mmu: Make tdp_mmu_allowed static KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Handle SMCs taken from virtual EL2 KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: Use the S2 MMU context to iterate over S2 table ...
2023-02-25Merge tag 'riscv-for-linus-6.3-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "There's a bunch of fixes/cleanups throughout the tree as usual, but we also have a handful of new features: - Various improvements to the extension detection and alternative patching infrastructure - Zbb-optimized string routines - Support for cpu-capacity in the RISC-V DT bindings - Zicbom no longer depends on toolchain support - Some performance and code size improvements to ftrace - Support for ARCH_WANT_LD_ORPHAN_WARN - Oops now contain the faulting instruction" * tag 'riscv-for-linus-6.3-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (67 commits) RISC-V: add a spin_shadow_stack declaration riscv: mm: hugetlb: Enable ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP riscv: Add header include guards to insn.h riscv: alternative: proceed one more instruction for auipc/jalr pair riscv: Avoid enabling interrupts in die() riscv, mm: Perform BPF exhandler fixup on page fault RISC-V: take text_mutex during alternative patching riscv: hwcap: Don't alphabetize ISA extension IDs RISC-V: fix ordering of Zbb extension riscv: jump_label: Fixup unaligned arch_static_branch function RISC-V: Only provide the single-letter extensions in HWCAP riscv: mm: fix regression due to update_mmu_cache change scripts/decodecode: Add support for RISC-V riscv: Add instruction dump to RISC-V splats riscv: select ARCH_WANT_LD_ORPHAN_WARN for !XIP_KERNEL riscv: vmlinux.lds.S: explicitly catch .init.bss sections from EFI stub riscv: vmlinux.lds.S: explicitly catch .riscv.attributes sections riscv: vmlinux.lds.S: explicitly catch .rela.dyn symbols riscv: lds: define RUNTIME_DISCARD_EXIT RISC-V: move some stray __RISCV_INSN_FUNCS definitions from kprobes ...
2023-02-25scripts: coccicheck: Use /usr/bin/envJarkko Sakkinen
If bash is not located under /bin, coccicheck fails to run. In the real world, this happens for instance when NixOS is used in the host. Instead, use /usr/bin/env to locate the executable binary for bash. Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Deepak R Varma <drv@mailo.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2023-02-25scripts: coccicheck: Avoid warning about spurious escapePeter Foley
e.g. grep: warning: stray \ before - Signed-off-by: Peter Foley <pefoley2@pefoley.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2023-02-25Merge tag 'powerpc-6.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Support for configuring secure boot with user-defined keys on PowerVM LPARs - Simplify the replay of soft-masked IRQs by making it non-recursive - Add support for KCSAN on 64-bit Book3S - Improvements to the API & code which interacts with RTAS (pseries firmware) - Change 32-bit powermac to assign PCI bus numbers per domain by default - Some improvements to the 32-bit BPF JIT - Various other small features and fixes Thanks to Anders Roxell, Andrew Donnellan, Andrew Jeffery, Benjamin Gray, Christophe Leroy, Frederic Barrat, Ganesh Goudar, Geoff Levand, Greg Kroah-Hartman, Jan-Benedict Glaw, Josh Poimboeuf, Kajol Jain, Laurent Dufour, Mahesh Salgaonkar, Mathieu Desnoyers, Mimi Zohar, Murphy Zhou, Nathan Chancellor, Nathan Lynch, Nayna Jain, Nicholas Piggin, Pali Rohár, Petr Mladek, Rohan McLure, Russell Currey, Sachin Sant, Sathvika Vasireddy, Sourabh Jain, Stefan Berger, Stephen Rothwell, and Sudhakar Kuppusamy. * tag 'powerpc-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (114 commits) powerpc/pseries: Avoid hcall in plpks_is_available() on non-pseries powerpc: dts: turris1x.dts: Set lower priority for CPLD syscon-reboot powerpc/e500: Add missing prototype for 'relocate_init' powerpc/64: Fix unannotated intra-function call warning powerpc/epapr: Don't use wrteei on non booke powerpc: Pass correct CPU reference to assembler powerpc/mm: Rearrange if-else block to avoid clang warning powerpc/nohash: Fix build with llvm-as powerpc/nohash: Fix build error with binutils >= 2.38 powerpc/pseries: Fix endianness issue when parsing PLPKS secvar flags macintosh: windfarm: Use unsigned type for 1-bit bitfields powerpc/kexec_file: print error string on usable memory property update failure powerpc/machdep: warn when machine_is() used too early powerpc/64: Replace -mcpu=e500mc64 by -mcpu=e5500 powerpc/eeh: Set channel state after notifying the drivers selftests/powerpc: Fix incorrect kernel headers search path powerpc/rtas: arch-wide function token lookup conversions powerpc/rtas: introduce rtas_function_token() API powerpc/pseries/lpar: convert to papr_sysparm API powerpc/pseries/hv-24x7: convert to papr_sysparm API ...
2023-02-25Merge tag 'mips_6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds
Pull MIPS updates from Thomas Bogendoerfer: "Just cleanups and fixes" * tag 'mips_6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: vpe-mt: drop physical_memsize mips: fix syscall_get_nr MIPS: SMP-CPS: fix build error when HOTPLUG_CPU not set MIPS: DTS: jz4780: add #clock-cells to rtc_dev MIPS: dts: Boston: Fix dtc 'pci_device_reg' warning mips: dts: ralink: mt7621: add port@5 as CPU port mips: dts: align LED node names with dtschema MIPS: ralink: Use devm_platform_get_and_ioremap_resource() MIPS: pci-mt7620: Use devm_platform_get_and_ioremap_resource() MIPS: pci: lantiq: Use devm_platform_get_and_ioremap_resource() MIPS: lantiq: xway: Use devm_platform_get_and_ioremap_resource() MIPS: BCM47XX: Add support for Linksys E2500 V3 mips: ralink: make SOC_MT7621 select PINCTRL_MT7621 and fix help section MIPS: DTS: CI20: fix otg power gpio MIPS: dts: lantiq: Remove bogus interrupt-parent; line MIPS: Fix a compilation issue MIPS: remove CONFIG_MIPS_LD_CAN_LINK_VDSO mips: Realtek RTL: select NO_EXCEPT_FILL MIPS: OCTEON: octeon-usb: Consolidate error messages
2023-02-25Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds
Pull Compute Express Link (CXL) updates from Dan Williams: "To date Linux has been dependent on platform-firmware to map CXL RAM regions and handle events / errors from devices. With this update we can now parse / update the CXL memory layout, and report events / errors from devices. This is a precursor for the CXL subsystem to handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that for DDR-attached-DRAM is handled by the EDAC driver where it maps system physical address events to a field-replaceable-unit (FRU / endpoint device). In general, CXL has the potential to standardize what has historically been a pile of memory-controller-specific error handling logic. Another change of note is the default policy for handling RAM-backed device-dax instances. Previously the default access mode was "device", mmap(2) a device special file to access memory. The new default is "kmem" where the address range is assigned to the core-mm via add_memory_driver_managed(). This saves typical users from wondering why their platform memory is not visible via free(1) and stuck behind a device-file. At the same time it allows expert users to deploy policy to, for example, get dedicated access to high performance memory, or hide low performance memory from general purpose kernel allocations. This affects not only CXL, but also systems with high-bandwidth-memory that platform-firmware tags with the EFI_MEMORY_SP (special purpose) designation. Summary: - CXL RAM region enumeration: instantiate 'struct cxl_region' objects for platform firmware created memory regions - CXL RAM region provisioning: complement the existing PMEM region creation support with RAM region support - "Soft Reservation" policy change: Online (memory hot-add) soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for setting aside such memory for dedicated access via device-dax. - CXL Events and Interrupts: Takeover CXL event handling from platform-firmware (ACPI calls this CXL Memory Error Reporting) and export CXL Events via Linux Trace Events. - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL subsystem interrogate the result of CXL _OSC negotiation. - Emulate CXL DVSEC Range Registers as "decoders": Allow for first-generation devices that pre-date the definition of the CXL HDM Decoder Capability to translate the CXL DVSEC Range Registers into 'struct cxl_decoder' objects. - Set timestamp: Per spec, set the device timestamp in case of hotplug, or if platform-firwmare failed to set it. - General fixups: linux-next build issues, non-urgent fixes for pre-production hardware, unit test fixes, spelling and debug message improvements" * tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits) dax/kmem: Fix leak of memory-hotplug resources cxl/mem: Add kdoc param for event log driver state cxl/trace: Add serial number to trace points cxl/trace: Add host output to trace points cxl/trace: Standardize device information output cxl/pci: Remove locked check for dvsec_range_allowed() cxl/hdm: Add emulation when HDM decoders are not committed cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders cxl/hdm: Emulate HDM decoder from DVSEC range registers cxl/pci: Refactor cxl_hdm_decode_init() cxl/port: Export cxl_dvsec_rr_decode() to cxl_port cxl/pci: Break out range register decoding from cxl_hdm_decode_init() cxl: add RAS status unmasking for CXL cxl: remove unnecessary calling of pci_enable_pcie_error_reporting() dax/hmem: build hmem device support as module if possible dax: cxl: add CXL_REGION dependency cxl: avoid returning uninitialized error code cxl/pmem: Fix nvdimm registration races cxl/mem: Fix UAPI command comment cxl/uapi: Tag commands from cxl_query_cmd() ...
2023-02-25Merge tag 'x86_tdx_for_6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel Trust Domain Extensions (TDX) updates from Dave Hansen: "Other than a minor fixup, the content here is to ensure that TDX guests never see virtualization exceptions (#VE's) that might be induced by the untrusted VMM. This is a highly desirable property. Without it, #VE exception handling would fall somewhere between NMIs, machine checks and total insanity. With it, #VE handling remains pretty mundane. Summary: - Fixup comment typo - Prevent unexpected #VE's from: - Hosts removing perfectly good guest mappings (SEPT_VE_DISABLE) - Excessive #VE notifications (NOTIFY_ENABLES) which are delivered via a #VE" * tag 'x86_tdx_for_6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tdx: Do not corrupt frame-pointer in __tdx_hypercall() x86/tdx: Disable NOTIFY_ENABLES x86/tdx: Relax SEPT_VE_DISABLE check for debug TD x86/tdx: Use ReportFatalError to report missing SEPT_VE_DISABLE x86/tdx: Expand __tdx_hypercall() to handle more arguments x86/tdx: Refactor __tdx_hypercall() to allow pass down more arguments x86/tdx: Add more registers to struct tdx_hypercall_args x86/tdx: Fix typo in comment in __tdx_hypercall()
2023-02-25selftests/ftrace: Add LoongArch kprobe args string tests supportQing Zhang
Before: [5] Kprobe event string type argument [UNTESTED] [7] Kprobe event argument syntax [UNTESTED] After: [5] Kprobe event string type argument [PASS] [7] Kprobe event argument syntax [PASS] Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25selftests/seccomp: Add LoongArch selftesting supportHuacai Chen
BPF for LoongArch is supported now, add the selftesting support in seccomp_bpf.c. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25tools: Add LoongArch build infrastructureHuacai Chen
We will add tools support for LoongArch (bpf, perf, objtool, etc.), add build infrastructure and common headers for preparation. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25samples/kprobes: Add LoongArch supportTiezhu Yang
Add LoongArch specific info in handler_pre() and handler_post(). Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Mark some assembler symbols as non-kprobe-ableTiezhu Yang
Some assembler symbols are not kprobe safe, such as handle_syscall (used as syscall exception handler), *memset*/*memcpy*/*memmove* (may cause recursive exceptions), they can not be instrumented, just blacklist them for kprobing. Here is a related problem and discussion: Link: https://lore.kernel.org/lkml/20230114143859.7ccc45c1c5d9ce302113ab0a@kernel.org/ Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add kprobes on ftrace supportTiezhu Yang
Add kprobe_ftrace_handler() and arch_prepare_kprobe_ftrace() to support kprobes on ftrace, the code is similar with x86 and riscv. Here is a simple example: # echo 'p:myprobe kernel_clone' > /sys/kernel/debug/tracing/kprobe_events # echo 'r:myretprobe kernel_clone $retval' >> /sys/kernel/debug/tracing/kprobe_events # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable # echo 1 > /sys/kernel/debug/tracing/tracing_on # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 2/2 #P:4 # # _-----=> irqs-off/BH-disabled # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> migrate-disable # |||| / delay # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | bash-488 [002] ..... 2041.190681: myprobe: (kernel_clone+0x0/0x40c) bash-488 [002] ..... 2041.190788: myretprobe: (__do_sys_clone+0x84/0xb8 <- kernel_clone) arg1=0x200 Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add kretprobes supportTiezhu Yang
Use the generic kretprobe trampoline handler to add kretprobes support for LoongArch. Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add kprobes supportTiezhu Yang
Kprobes allows you to trap at almost any kernel address and execute a callback function, this commit adds kprobes support for LoongArch. Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Simulate branch and PC* instructionsTiezhu Yang
According to LoongArch Reference Manual, simulate branch and PC* instructions, this is preparation for later patch. Link: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#branch-instructions Link: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#_pcaddi_pcaddu121_pcaddu18l_pcalau12i Tested-by: Jeff Xie <xiehuan09@gmail.com> Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: ptrace: Add hardware single step supportQing Zhang
Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT, PTRACE_KILL and PTRACE_SINGLESTEP handling. This implies defining arch_has_single_step() and implementing the user_enable_single_step() and user_disable_single_step() functions. LoongArch cannot do hardware single-stepping per se, the hardware single-stepping it is achieved by configuring the instruction fetch watchpoints (FWPS) and specifies that the next instruction must trigger the watch exception by setting the mask bit. In some scenarios CSR.FWPS.Skip is used to ignore the next hit result, avoid endless repeated triggering of the same watchpoint without canceling it. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: ptrace: Add function argument access APIQing Zhang
Add regs_get_argument() which returns N th argument of the function call, This enables ftrace kprobe events to access kernel function arguments via $argN syntax for later use. E.g.: echo 'p bio_add_page arg1=$arg1' > kprobe_events bash: echo: write error: Invalid argument Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: ptrace: Expose hardware breakpoints to debuggersQing Zhang
Implement the regset-based ptrace interface that exposes hardware breakpoints to user-space debuggers to query and set instruction and data breakpoints. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add hardware breakpoints/watchpoints supportQing Zhang
Use perf framework to manage hardware instruction and data breakpoints. LoongArch defines hardware watchpoint functions for instruction fetch and memory load/store operations. After the software configures hardware watchpoints, the processor hardware will monitor the access address of the instruction fetch and load/store operation, and trigger an exception of the watchpoint when it meets the conditions set by the watchpoint. The hardware monitoring points for instruction fetching and load/store operations each have a register for the overall configuration of all monitoring points, a register for recording the status of all monitoring points, and four registers required for configuration of each watchpoint individually. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: kdump: Add crashkernel=YM handlingYouling Tang
When the kernel crashkernel parameter is specified with just a size, we are supposed to allocate a region from RAM to store the crashkernel, "crashkernel=512M" would be recommended for kdump. Fix this by lifting similar code from x86, importing it to LoongArch with LoongArch specific parameters added. We allocate the crashkernel region from the first 4GB of physical memory (because SWIOTLB should be allocated below 4GB). However, LoongArch currently does not implement crashkernel_low and crashkernel_high the same as x86. When X is not specified, crash_base defaults to 0 (crashkernel=YM@XM). Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: kdump: Add single kernel image implementationYouling Tang
This feature depends on the kernel being relocatable. Enable using single kernel image for kdump, and then no longer need to build two kernels (production kernel and capture kernel share a single kernel image). Also enable CONFIG_CRASH_DUMP in loongson3_defconfig. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add support for kernel address space layout randomization (KASLR)Youling Tang
This patch adds support for relocating the kernel to a random address. Entropy is derived from the banner, which will change every build and random_get_entropy() which should provide additional runtime entropy. The kernel is relocated by up to RANDOMIZE_BASE_MAX_OFFSET bytes from its link address. Because relocation happens so early during the kernel booting, the amount of physical memory has not yet been determined. This means the only way to limit relocation within the available memory is via Kconfig. So we limit the maximum value of RANDOMIZE_BASE_MAX_OFFSET to 256M (0x10000000) because our memory layout has many holes. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Xi Ruoyao <xry111@xry111.site> # Fix compiler warnings Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add support for kernel relocationYouling Tang
This config allows to compile kernel as PIE and to relocate it at any virtual address at runtime: this paves the way to KASLR. Runtime relocation is possible since relocation metadata are embedded into the kernel. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Xi Ruoyao <xry111@xry111.site> # Use arch_initcall Signed-off-by: Jinyang He <hejinyang@loongson.cn> # Provide la_abs relocation code Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add la_abs macro implementationYouling Tang
Use the "la_abs macro" instead of the "la.abs pseudo instruction" to prepare for the subsequent PIE kernel. When PIE is not enabled, la_abs is equivalent to la.abs. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add JUMP_VIRT_ADDR macro implementation to avoid using la.absYouling Tang
Add JUMP_VIRT_ADDR macro implementation to avoid using la.abs directly. This is a preparation for subsequent patches. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Use la.pcrel instead of la.abs when it's trivially possibleXi Ruoyao
Let's start to kill la.abs in preparation for the subsequent support of the PIE kernel. BTW, Re-tab the indention in arch/loongarch/kernel/entry.S for alignment. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Make -mstrict-align configurableHuacai Chen
Introduce Kconfig option ARCH_STRICT_ALIGN to make -mstrict-align be configurable. Not all LoongArch cores support h/w unaligned access, we can use the -mstrict-align build parameter to prevent unaligned accesses. CPUs with h/w unaligned access support: Loongson-2K2000/2K3000/3A5000/3C5000/3D5000. CPUs without h/w unaligned access support: Loongson-2K500/2K1000. This option is enabled by default to make the kernel be able to run on all LoongArch systems. But you can disable it manually if you want to run kernel only on systems with h/w unaligned access support in order to optimise for performance. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Only call get_timer_irq() once in constant_clockevent_init()Tiezhu Yang
Under CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_DEBUG_PREEMPT=y, we can see the following messages on LoongArch, this is because using might_sleep() in preemption disable context. [ 0.001127] smp: Bringing up secondary CPUs ... [ 0.001222] Booting CPU#1... [ 0.001244] 64-bit Loongson Processor probed (LA464 Core) [ 0.001247] CPU1 revision is: 0014c012 (Loongson-64bit) [ 0.001250] FPU1 revision is: 00000000 [ 0.001252] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 [ 0.001255] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 [ 0.001257] preempt_count: 1, expected: 0 [ 0.001258] RCU nest depth: 0, expected: 0 [ 0.001259] Preemption disabled at: [ 0.001261] [<9000000000223800>] arch_dup_task_struct+0x20/0x110 [ 0.001272] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.2.0-rc7+ #43 [ 0.001275] Hardware name: Loongson Loongson-3A5000-7A1000-1w-A2101/Loongson-LS3A5000-7A1000-1w-A2101, BIOS vUDK2018-LoongArch-V4.0.05132-beta10 12/13/202 [ 0.001277] Stack : 0072617764726148 0000000000000000 9000000000222f1c 90000001001e0000 [ 0.001286] 90000001001e3be0 90000001001e3be8 0000000000000000 0000000000000000 [ 0.001292] 90000001001e3be8 0000000000000040 90000001001e3cb8 90000001001e3a50 [ 0.001297] 9000000001642000 90000001001e3be8 be694d10ce4139dd 9000000100174500 [ 0.001303] 0000000000000001 0000000000000001 00000000ffffe0a2 0000000000000020 [ 0.001309] 000000000000002f 9000000001354116 00000000056b0000 ffffffffffffffff [ 0.001314] 0000000000000000 0000000000000000 90000000014f6e90 9000000001642000 [ 0.001320] 900000000022b69c 0000000000000001 0000000000000000 9000000001736a90 [ 0.001325] 9000000100038000 0000000000000000 9000000000222f34 0000000000000000 [ 0.001331] 00000000000000b0 0000000000000004 0000000000000000 0000000000070000 [ 0.001337] ... [ 0.001339] Call Trace: [ 0.001342] [<9000000000222f34>] show_stack+0x5c/0x180 [ 0.001346] [<90000000010bdd80>] dump_stack_lvl+0x60/0x88 [ 0.001352] [<9000000000266418>] __might_resched+0x180/0x1cc [ 0.001356] [<90000000010c742c>] mutex_lock+0x20/0x64 [ 0.001359] [<90000000002a8ccc>] irq_find_matching_fwspec+0x48/0x124 [ 0.001364] [<90000000002259c4>] constant_clockevent_init+0x68/0x204 [ 0.001368] [<900000000022acf4>] start_secondary+0x40/0xa8 [ 0.001371] [<90000000010c0124>] smpboot_entry+0x60/0x64 Here are the complete call chains: smpboot_entry() start_secondary() constant_clockevent_init() get_timer_irq() irq_find_matching_fwnode() irq_find_matching_fwspec() mutex_lock() might_sleep() __might_sleep() __might_resched() In order to avoid the above issue, we should break the call chains, using timer_irq_installed variable as check condition to only call get_timer_irq() once in constant_clockevent_init() is a simple and proper way. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Fix Chinese comma in cpu.hJinyang He
Fix Chinese comma introduced by accident in cpu.h. Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25driver core: fw_devlink: Print full path and name of fwnodeSaravana Kannan
Some of the log messages were printing just the fwnode name. While it's short, it's not always uniquely identifiable in system. So print the full path and name to make debugging easier. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20230225065443.278284-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25driver core: fw_devlink: Avoid spurious error messageSaravana Kannan
fw_devlink can sometimes try to create a device link with the consumer and supplier as the same device. These attempts will fail (correctly), but are harmless. So, avoid printing an error for these cases. Also, add more detail to the error message. Fixes: 3fb16866b51d ("driver core: fw_devlink: Make cycle detection more robust") Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20230225064148.274376-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25driver core: bus: Handle early calls to bus_to_subsys()Geert Uytterhoeven
When calling soc_device_match() from early_initcall(), bus_kset is still NULL, causing a crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028 ... Call trace: __lock_acquire+0x530/0x20f0 lock_acquire.part.0+0xc8/0x210 lock_acquire+0x64/0x80 _raw_spin_lock+0x4c/0x60 bus_to_subsys+0x24/0xac bus_for_each_dev+0x30/0xcc soc_device_match+0x4c/0xe0 r8a7795_sysc_init+0x18/0x60 rcar_sysc_pd_init+0xb0/0x33c do_one_initcall+0x128/0x2bc Before, bus_for_each_dev() handled this gracefully by checking that the back-pointer to the private structure was valid. Fix this by adding a NULL check for bus_kset to bus_to_subsys(). Fixes: 83b9148df2c95e23 ("driver core: bus: bus iterator cleanups") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/0a92979f6e790737544638e8a4c19b0564e660a2.1676983596.git.geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25Merge 'pci/enumeration' into loongarch-nextHuacai Chen
LoongArch architecture changes for 6.3 depend on the pci changes to work well, so merge them to create a base.
2023-02-24alpha: in_irq() cleanupChangbin Du
Replace the obsolete and ambiguos macro in_irq() with new macro in_hardirq(). Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha: lazy FPU switchingAl Viro
On each context switch we save the FPU registers on stack of old process and restore FPU registers from the stack of new one. That allows us to avoid doing that each time we enter/leave the kernel mode; however, that can get suboptimal in some cases. For one thing, we don't need to bother saving anything for kernel threads. For another, if between entering and leaving the kernel a thread gives CPU up more than once, it will do useless work, saving the same values every time, only to discard the saved copy as soon as it returns from switch_to(). Alternative solution: * move the array we save into from switch_stack to thread_info * have a (thread-synchronous) flag set when we save them * have another flag set when they should be restored on return to userland. * do *NOT* save/restore them in do_switch_stack()/undo_switch_stack(). * restore on the exit to user mode if the restore flag had been set. Clear both flags. * on context switch, entry to fork/clone/vfork, before entry into do_signal() and on entry into straced syscall save the registers and set the 'saved' flag unless it had been already set. * on context switch set the 'restore' flag as well. * have copy_thread() set both flags for child, so the registers would be restored once the child returns to userland. * use the saved data in setup_sigcontext(); have restore_sigcontext() set both flags and copy from sigframe to save area. * teach ptrace to look for FPU registers in thread_info instead of switch_stack. * teach isolated accesses to FPU registers (rdfpcr, wrfpcr, etc.) to check the 'saved' flag (under preempt_disable()) and work with the save area if it's been set; if 'saved' flag is found upon write access, set 'restore' flag as well. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha/boot/misc: trim unused declarationsAl Viro
gzip_mark() and gzip_release() are gone; there used to be two forward declarations of each and the patch removing those suckers had left one of each behind... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha/boot/tools/objstrip: fix the check for ELF headerAl Viro
Just memcmp() with ELFMAG - that's the normal way to do it in userland code, which that thing is. Besides, that has the benefit of actually building - str_has_prefix() is *NOT* present in <string.h>. Fixes: 5f14596e55de "alpha: Replace strncmp with str_has_prefix" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha/boot: fix the breakage from -isystem series...Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha: fix FEN fault handlingAl Viro
Type 3 instruction fault (FPU insn with FPU disabled) is handled by quietly enabling FPU and returning. Which is fine, except that we need to do that both for fault in userland and in the kernel; the latter *can* legitimately happen - all it takes is this: .global _start _start: call_pal 0xae lda $0, 0 ldq $0, 0($0) - call_pal CLRFEN to clear "FPU enabled" flag and arrange for a signal delivery (SIGSEGV in this case). Fixed by moving the handling of type 3 into the common part of do_entIF(), before we check for kernel vs. user mode. Incidentally, check for kernel mode is unidiomatic; the normal way to do that is !user_mode(regs). The difference is that the open-coded variant treats any of bits 63..3 of regs->ps being set as "it's user mode" while the normal approach is to check just the bit 3. PS is a 4-bit register and regs->ps always will have bits 63..4 clear, so the open-code variant here is actually equivalent to !user_mode(regs). Harder to follow, though... Reproducer above will crash any box where CLRFEN is not ignored by PAL (== any actual hardware, AFAICS; PAL used in qemu doesn't bother implementing that crap). Cc: stable@vger.kernel.org # all way back... Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha: Avoid comma separated statementsJoe Perches
Use semicolons and braces. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24alpha: fixed a typo in core_cia.crj1
Signed-off-by: Matt Turner <mattst88@gmail.com>
2023-02-24Merge branch 'work.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted stuff that didn't fit anywhere else" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nsfs: repair kernel-doc for ns_match() nsfs: add compat ioctl handler fs/cramfs: Convert kmap() to kmap_local_data()
2023-02-24Merge branch 'work.namespace' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ipc namespace update from Al Viro: "Rik's patches reducing the amount of synchronize_rcu() triggered by ipc namespace destruction. I've some pending stuff reducing that on the normal umount side, but it's nowhere near ready and Rik's stuff shouldn't be held back due to conflicts - I'll just redo the parts of my series that stray into ipc/*" * 'work.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ipc,namespace: batch free ipc_namespace structures ipc,namespace: make ipc namespace allocation wait for pending free
2023-02-24Merge branch 'work.alpha' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull alpha updates from Al Viro: "FEN (floating-point enable) fault fix deals with a really old oopsable braino, the rest is alpha/boot compile fixes and minor cleaning up" * 'work.alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: alpha/boot/misc: trim unused declarations alpha/boot/tools/objstrip: fix the check for ELF header alpha/boot: fix the breakage from -isystem series... alpha: fix FEN fault handling
2023-02-24Merge branch 'work.sysv' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull sysv updates from Al Viro: "Fabio's 'switch to kmap_local_page()' patchset (originally after the ext2 counterpart, with a lot of cleaning up done to it; as the matter of fact, ext2 side is in need of similar cleanups - calling conventions there are bloody awful). Plus the equivalents of minix stuff..." * 'work.sysv' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sysv: fix handling of delete_entry and set_link failures fs/sysv: Replace kmap() with kmap_local_page() fs/sysv: Use dir_put_page() in sysv_rename() fs/sysv: Change the signature of dir_get_page() fs/sysv: Use the offset_in_page() helper sysv: don't flush page immediately for DIRSYNC directories