summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2023-05-02mm/ksm: move disabling KSM from s390/gmap code to KSM codeDavid Hildenbrand
Let's factor out actual disabling of KSM. The existing "mm->def_flags &= ~VM_MERGEABLE;" was essentially a NOP and can be dropped, because def_flags should never include VM_MERGEABLE. Note that we don't currently prevent re-enabling KSM. This should now be faster in case KSM was never enabled, because we only conditionally iterate all VMAs. Further, it certainly looks cleaner. Link: https://lkml.kernel.org/r/20230422210156.33630-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Stefan Roesch <shr@devkernel.io> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-02arm64: lds: move .got section out of .textFangrui Song
Currently, the .got section is placed within the output section .text. However, when .got is non-empty, the SHF_WRITE flag is set for .text when linked by lld. GNU ld recognizes .text as a special section and ignores the SHF_WRITE flag. By renaming .text, we can also get the SHF_WRITE flag. The kernel has performed R_AARCH64_RELATIVE resolving very early, and can then assume that .got is read-only. Let's move .got to the vmlinux_rodata pseudo-segment. As Ard Biesheuvel notes: "This matters to consumers of the vmlinux ELF representation of the kernel image, such as syzkaller, which disregards writable PT_LOAD segments when resolving code symbols. The kernel itself does not care about this distinction, but given that the GOT contains data and not code, it does not require executable permissions, and therefore does not belong in .text to begin with." Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Link: https://lore.kernel.org/r/20230502074105.1541926-1-maskray@google.com Signed-off-by: Will Deacon <will@kernel.org>
2023-05-02arm64: kernel: remove SHF_WRITE|SHF_EXECINSTR from .idmap.textndesaulniers@google.com
commit d54170812ef1 ("arm64: fix .idmap.text assertion for large kernels") modified some of the section assembler directives that declare .idmap.text to be SHF_ALLOC instead of SHF_ALLOC|SHF_WRITE|SHF_EXECINSTR. This patch fixes up the remaining stragglers that were left behind. Add Fixes tag so that this doesn't precede related change in stable. Fixes: d54170812ef1 ("arm64: fix .idmap.text assertion for large kernels") Reported-by: Greg Thelen <gthelen@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20230428-awx-v2-1-b197ffa16edc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2023-05-02arm64: cpufeature: Fix pointer auth hwcapsKristina Martsenko
The pointer auth hwcaps are not getting reported to userspace, as they are missing the .matches field. Add the field back. Fixes: 876e3c8efe79 ("arm64/cpufeature: Pull out helper for CPUID register definitions") Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230428132546.2513834-1-kristina.martsenko@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-05-01RISC-V: include cpufeature.h in cpufeature.cConor Dooley
Automation complains: warning: symbol '__pcpu_scope_misaligned_access_speed' was not declared. Should it be static? cpufeature.c doesn't actually include the header of the same name, as it had not previously used anything from it. The per-cpu variable is declared there, so include it to silence the complaints. Fixes: 62a31d6e38bd ("RISC-V: hwprobe: Support probing of misaligned access performance") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20230420-wound-gizzard-2b2b589d9bea@spud Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-05-01Merge tag 'input-for-v6.4-rc0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a new driver for Novatek touch controllers - a new driver for power button for NXP BBNSM - a skeleton KUnit tests for the input core - improvements to Xpad game controller driver to support more devices - improvements to edt-ft5x06, hideep and other drivers * tag 'input-for-v6.4-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (42 commits) Revert "Input: xpad - fix support for some third-party controllers" dt-bindings: input: pwm-beeper: convert to dt schema Input: xpad - fix PowerA EnWired Controller guide button Input: xpad - add constants for GIP interface numbers Input: synaptics-rmi4 - fix function name in kerneldoc Input: raspberrypi-ts - fix refcount leak in rpi_ts_probe Input: edt-ft5x06 - select REGMAP_I2C Input: melfas_mip4 - report palm touches Input: cma3000_d0x - remove unneeded code Input: edt-ft5x06 - calculate points data length only once Input: edt-ft5x06 - unify the crc check Input: edt-ft5x06 - convert to use regmap API Input: edt-ft5x06 - don't print error messages with dev_dbg() Input: edt-ft5x06 - remove code duplication Input: edt-ft5x06 - don't recalculate the CRC Input: edt-ft5x06 - add spaces to ensure format specification Input: edt-ft5x06 - remove unnecessary blank lines Input: edt-ft5x06 - fix indentation Input: tsc2007 - enable cansleep pendown GPIO Input: Add KUnit tests for some of the input core helper functions ...
2023-05-01riscv: Move .rela.dyn to the init sectionsAlexandre Ghiti
The recent introduction of relocatable kernels prepared the move of .rela.dyn to the init section, but actually forgot to do so, so do it here. Before this patch: "Freeing unused kernel image (initmem) memory: 2592K" After this patch: "Freeing unused kernel image (initmem) memory: 6288K" The difference corresponds to the size of the .rela.dyn section: "[42] .rela.dyn RELA ffffffff8197e798 0127f798 000000000039c660 0000000000000018 A 47 0 8" Fixes: 559d1e45a16d ("riscv: Use --emit-relocs in order to move .rela.dyn in init") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230428120932.22735-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-05-01riscv: compat_syscall_table: Fixup compile warningGuo Ren
../arch/riscv/kernel/compat_syscall_table.c:12:41: warning: initialized field overwritten [-Woverride-init] 12 | #define __SYSCALL(nr, call) [nr] = (call), | ^ ../include/uapi/asm-generic/unistd.h:567:1: note: in expansion of macro '__SYSCALL' 567 | __SYSCALL(__NR_semget, sys_semget) Fixes: 59c10c52f573 ("riscv: compat: syscall: Add compat_sys_call_table implementation") Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reported-by: kernel test robot <lkp@intel.com> Tested-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Drew Fustini <dfustini@baylibre.com> Link: https://lore.kernel.org/r/20230501223353.2833899-1-dfustini@baylibre.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "s390: - More phys_to_virt conversions - Improvement of AP management for VSIE (nested virtualization) ARM64: - Numerous fixes for the pathological lock inversion issue that plagued KVM/arm64 since... forever. - New framework allowing SMCCC-compliant hypercalls to be forwarded to userspace, hopefully paving the way for some more features being moved to VMMs rather than be implemented in the kernel. - Large rework of the timer code to allow a VM-wide offset to be applied to both virtual and physical counters as well as a per-timer, per-vcpu offset that complements the global one. This last part allows the NV timer code to be implemented on top. - A small set of fixes to make sure that we don't change anything affecting the EL1&0 translation regime just after having having taken an exception to EL2 until we have executed a DSB. This ensures that speculative walks started in EL1&0 have completed. - The usual selftest fixes and improvements. x86: - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled, and by giving the guest control of CR0.WP when EPT is enabled on VMX (VMX-only because SVM doesn't support per-bit controls) - Add CR0/CR4 helpers to query single bits, and clean up related code where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return as a bool - Move AMD_PSFD to cpufeatures.h and purge KVM's definition - Avoid unnecessary writes+flushes when the guest is only adding new PTEs - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations when emulating invalidations - Clean up the range-based flushing APIs - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle changed SPTE" overhead associated with writing the entire entry - Track the number of "tail" entries in a pte_list_desc to avoid having to walk (potentially) all descriptors during insertion and deletion, which gets quite expensive if the guest is spamming fork() - Disallow virtualizing legacy LBRs if architectural LBRs are available, the two are mutually exclusive in hardware - Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features - Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES - Apply PMU filters to emulated events and add test coverage to the pmu_event_filter selftest - AMD SVM: - Add support for virtual NMIs - Fixes for edge cases related to virtual interrupts - Intel AMX: - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is not being reported due to userspace not opting in via prctl() - Fix a bug in emulation of ENCLS in compatibility mode - Allow emulation of NOP and PAUSE for L2 - AMX selftests improvements - Misc cleanups MIPS: - Constify MIPS's internal callbacks (a leftover from the hardware enabling rework that landed in 6.3) Generic: - Drop unnecessary casts from "void *" throughout kvm_main.c - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct size by 8 bytes on 64-bit kernels by utilizing a padding hole Documentation: - Fix goof introduced by the conversion to rST" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits) KVM: s390: pci: fix virtual-physical confusion on module unload/load KVM: s390: vsie: clarifications on setting the APCB KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init() KVM: selftests: Test the PMU event "Instructions retired" KVM: selftests: Copy full counter values from guest in PMU event filter test KVM: selftests: Use error codes to signal errors in PMU event filter test KVM: selftests: Print detailed info in PMU event filter asserts KVM: selftests: Add helpers for PMC asserts in PMU event filter test KVM: selftests: Add a common helper for the PMU event filter guest code KVM: selftests: Fix spelling mistake "perrmited" -> "permitted" KVM: arm64: vhe: Drop extra isb() on guest exit KVM: arm64: vhe: Synchronise with page table walker on MMU update KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc() KVM: arm64: nvhe: Synchronise with page table walker on TLBI KVM: arm64: Handle 32bit CNTPCTSS traps KVM: arm64: nvhe: Synchronise with page table walker on vcpu run KVM: arm64: vgic: Don't acquire its_lock before config_lock KVM: selftests: Add test to verify KVM's supported XCR0 ...
2023-05-01Merge tag 'for-linus' of https://github.com/openrisc/linuxLinus Torvalds
Pull OpenRISC updates from Stafford Horne: "Two things for OpenRISC this cycle: - Small cleanup for device tree cpu iteration from Rob Herring - Add support for storing, restoring and accessing user space FPU state, to allow for libc to support the FPU on OpenRISC" * tag 'for-linus' of https://github.com/openrisc/linux: openrisc: Add floating point regset openrisc: Support floating point user api openrisc: Support storing and restoring fpu state openrisc: Properly store r31 to pt_regs on unhandled exceptions openrisc: Use common of_get_cpu_node() instead of open-coding
2023-05-01LoongArch: ftrace: Add direct call trampoline samples supportYouling Tang
The ftrace samples need per-architecture trampoline implementations to save and restore argument registers around the calls to my_direct_func* and to restore polluted registers (e.g: ra). Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: ftrace: Add direct call supportYouling Tang
Select the HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the register_ftrace_direct[_multi] interfaces allowing users to register the customed trampoline (direct_caller) as the mcount for one or more target functions. And modify_ftrace_direct[_multi] are also provided for modifying direct_caller. There are a few cases to distinguish: - If a direct call ops is the only one tracing a function AND the direct called trampoline is within the reach of a 'bl' instruction -> the ftrace patchsite jumps to the trampoline - Else -> the ftrace patchsite jumps to the ftrace_regs_caller trampoline points to ftrace_list_ops so it iterates over all registered ftrace ops, including the direct call ops and calls its call_direct_funcs handler which stores the direct called trampoline's address in the ftrace_regs and the ftrace_regs_caller trampoline will return to that address instead of returning to the traced function Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: ftrace: Implement ftrace_find_callable_addr() to simplify codeYouling Tang
In the module processing functions, the same logic can be reused by implementing ftrace_find_callable_addr(). Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: ftrace: Fix build error if DYNAMIC_FTRACE_WITH_REGS is not setYouling Tang
We can see the following build error if CONFIG_DYNAMIC_FTRACE_WITH_REGS is not set on LoongArch: arch/loongarch/kernel/ftrace_dyn.c: In function ‘ftrace_make_call’: arch/loongarch/kernel/ftrace_dyn.c:167:23: error: implicit declaration of function ‘__get_mod’ 167 | ret = __get_mod(&mod, pc); | ^~~~~~~~~ arch/loongarch/kernel/ftrace_dyn.c:171:24: error: implicit declaration of function ‘get_plt_addr’ 171 | addr = get_plt_addr(mod, addr); | ^~~~~~~~~~~~ The reason is that the __get_mod() and get_plt_addr() may be called in ftrace_make_{call,nop}. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: ftrace: Abstract DYNAMIC_FTRACE_WITH_ARGS accessesQing Zhang
Add new ftrace_regs_{get,set}_*() helpers which can be used to manipulate ftrace_regs. When CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y, these can always be used on any ftrace_regs, and when CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS =n these can be used when regs are available. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Add support for function error injectionTiezhu Yang
Inspired by the commit 42d038c4fb00f ("arm64: Add support for function error injection") and the commit ee55ff803b383 ("riscv: Add support for function error injection"), this patch supports function error injection for LoongArch. Mainly implement two functions: (1) regs_set_return_value() which is used to overwrite the return value, (2) override_function_with_return() which is used to override the probed function returning and jump to its caller. Here is a simple test under CONFIG_FUNCTION_ERROR_INJECTION and CONFIG_FAIL_FUNCTION: # echo sys_clone > /sys/kernel/debug/fail_function/inject # echo 100 > /sys/kernel/debug/fail_function/probability # dmesg bash: fork: Invalid argument # dmesg ... FAULT_INJECTION: forcing a failure. name fail_function, interval 1, probability 100, space 0, times 1 ... Call Trace: [<90000000002238f4>] show_stack+0x5c/0x180 [<90000000012e384c>] dump_stack_lvl+0x60/0x88 [<9000000000b1879c>] should_fail_ex+0x1b0/0x1f4 [<900000000032ead4>] fei_kprobe_handler+0x28/0x6c [<9000000000230970>] kprobe_breakpoint_handler+0xf0/0x118 [<90000000012e3e60>] do_bp+0x2c4/0x358 [<9000000002241924>] exception_handlers+0x1924/0x10000 [<900000000023b7d0>] sys_clone+0x0/0x4 [<90000000012e4744>] do_syscall+0x7c/0x94 [<9000000000221e44>] handle_syscall+0xc4/0x160 Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Add ARCH_HAS_FORTIFY_SOURCE selectionQing Zhang
FORTIFY_SOURCE could detect various overflows at compile and run time. ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and run with CONFIG_FORTIFY_SOURCE. So select it in LoongArch. See more about this feature from commit 6974f0c4555e285 ("include/linux/ string.h: add the option of fortified string.h functions"). Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: crypto: Add crc32 and crc32c hw accelerationMin Zhou
With a blatant copy of some MIPS bits we introduce the crc32 and crc32c hw accelerated module to LoongArch. LoongArch has provided these instructions to calculate crc32 and crc32c: * crc.w.b.w crcc.w.b.w * crc.w.h.w crcc.w.h.w * crc.w.w.w crcc.w.w.w * crc.w.d.w crcc.w.d.w So we can make use of these instructions to improve the performance of calculation for crc32(c) checksums. As can be seen from the following test results, crc32(c) instructions can improve the performance by 58%. Software implemention Hardware acceleration Buffer size time cost (seconds) time cost (seconds) Accel. 100 KB 0.000845 0.000534 59.1% 1 MB 0.007758 0.004836 59.4% 10 MB 0.076593 0.047682 59.4% 100 MB 0.756734 0.479126 58.5% 1000 MB 7.563841 4.778266 58.5% Signed-off-by: Min Zhou <zhoumin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Add checksum optimization for 64-bit systemBibo Mao
LoongArch platform is 64-bit system, which supports 8-bytes memory accessing, but generic checksum functions use 4-byte memory access. So add 8-bytes memory access optimization for checksum functions on LoongArch. And the code comes from arm64 system. When network hw checksum is disabled, iperf performance improves about 10% with this patch. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Optimize memory ops (memset/memcpy/memmove)WANG Rui
To optimize memset()/memcpy()/memmove() and so on, we use a jump table to dispatch cases for short data lengths; and for long data lengths, we split the destination into head part (first 8 bytes), tail part (last 8 bytes) and middle part. The head part and tail part may be at unaligned addresses, while the middle part is always aligned (the middle part is allowed to overlap the head/tail part). In this way, the first and last 8 bytes may be unaligned accesses, but we can make sure the data in the middle is processed at an aligned destination address. We have tested micro-bench[1] on a Loongson-3C5000 16-core machine (2.2GHz): 1. memset | length | src offset | dst offset | speed before | speed after | % | |--------|------------|------------|--------------|-------------|---------| | 8 | 0 | 0 | 696.191 | 1518.785 | 118.16% | | 8 | 0 | 1 | 696.325 | 1518.937 | 118.14% | | 50 | 0 | 0 | 969.976 | 8053.902 | 730.32% | | 50 | 0 | 1 | 970.034 | 8058.475 | 730.74% | | 300 | 0 | 0 | 5876.612 | 16544.703 | 181.53% | | 300 | 0 | 1 | 5030.849 | 16549.011 | 228.95% | | 1200 | 0 | 0 | 11797.077 | 16752.137 | 42.00% | | 1200 | 0 | 1 | 5687.141 | 16645.233 | 192.68% | | 4000 | 0 | 0 | 15723.27 | 16761.557 | 6.60% | | 4000 | 0 | 1 | 5906.114 | 16732.316 | 183.30% | | 8000 | 0 | 0 | 16751.403 | 16770.002 | 0.11% | | 8000 | 0 | 1 | 5995.449 | 16754.07 | 179.45% | 2. memcpy | length | src offset | dst offset | speed before | speed after | % | |--------|------------|------------|--------------|-------------|---------| | 8 | 0 | 0 | 696.2 | 1670.605 | 139.96% | | 8 | 0 | 1 | 696.325 | 1671.138 | 139.99% | | 50 | 0 | 0 | 969.974 | 8724.999 | 799.51% | | 50 | 0 | 1 | 970.032 | 8730.138 | 799.98% | | 300 | 0 | 0 | 5564.662 | 16272.652 | 192.43% | | 300 | 0 | 1 | 4670.436 | 14972.842 | 220.59% | | 1200 | 0 | 0 | 10740.23 | 16751.728 | 55.97% | | 1200 | 0 | 1 | 5027.741 | 14874.564 | 195.85% | | 4000 | 0 | 0 | 15122.367 | 16737.642 | 10.68% | | 4000 | 0 | 1 | 5536.918 | 14890.397 | 168.93% | | 8000 | 0 | 0 | 16505.453 | 16553.543 | 0.29% | | 8000 | 0 | 1 | 5821.619 | 14841.804 | 154.94% | 3. memmove | length | src offset | dst offset | speed before | speed after | % | |--------|------------|------------|--------------|-------------|---------| | 8 | 0 | 0 | 982.693 | 1670.568 | 70.00% | | 8 | 0 | 1 | 983.023 | 1671.174 | 70.00% | | 50 | 0 | 0 | 1230.87 | 8727.625 | 609.06% | | 50 | 0 | 1 | 1232.515 | 8730.138 | 608.32% | | 300 | 0 | 0 | 6490.375 | 16296.993 | 151.09% | | 300 | 0 | 1 | 4282.687 | 14972.842 | 249.61% | | 1200 | 0 | 0 | 11742.755 | 16752.546 | 42.66% | | 1200 | 0 | 1 | 5039.338 | 14872.951 | 195.14% | | 4000 | 0 | 0 | 15467.786 | 16737.09 | 8.21% | | 4000 | 0 | 1 | 5009.905 | 14890.542 | 197.22% | | 8000 | 0 | 0 | 16489.664 | 16553.273 | 0.39% | | 8000 | 0 | 1 | 5823.786 | 14858.646 | 155.14% | * speed: MB/s * length: byte [1] https://github.com/heiher/mem-bench Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Provide kernel fpu functionsHuacai Chen
Provide kernel_fpu_begin()/kernel_fpu_end() to allow the kernel itself to use fpu. They can be used by some other kernel components, e.g., the AMDGPU graphic driver for DCN. Reported-by: WANG Xuerui <kernel@xen0n.name> Tested-by: WANG Xuerui <kernel@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Relay BCE exceptions to userland as SIGSEGV with si_code=SEGV_BNDERRWANG Xuerui
SEGV_BNDERR was introduced initially for supporting the Intel MPX, but fell into disuse after the MPX support was removed. The LoongArch bounds-checking instructions behave very differently than MPX, but overall the interface is still kind of suitable for conveying the information to userland when bounds-checking assertions trigger, so we wouldn't have to invent more UAPI. Specifically, when the BCE triggers, a SEGV_BNDERR is sent to userland, with si_addr set to the out-of-bounds address or value (in asrt{gt,le}'s case), and one of si_lower or si_upper set to the configured bound depending on the faulting instruction. The other bound is set to either 0 or ULONG_MAX to resemble a range with both lower and upper bounds. Note that it is possible to have si_addr == si_lower in case of a failing asrtgt or {ld,st}gt, because those instructions test for strict greater-than relationship. This should not pose a problem for userland, though, because the faulting PC is available for the application to associate back to the exact instruction for figuring out the expectation. Example exception context generated by a faulting `asrtgt.d t0, t1` (assert t0 > t1 or BCE) with t0=100 and t1=200: > pc 00005555558206a4 ra 00007ffff2d854fc tp 00007ffff2f2f180 sp 00007ffffbf9fb80 > a0 0000000000000002 a1 00007ffffbf9fce8 a2 00007ffffbf9fd00 a3 00007ffff2ed4558 > a4 0000000000000000 a5 00007ffff2f044c8 a6 00007ffffbf9fce0 a7 fffffffffffff000 > t0 0000000000000064 t1 00000000000000c8 t2 00007ffffbfa2d5e t3 00007ffff2f12aa0 > t4 00007ffff2ed6158 t5 00007ffff2ed6158 t6 000000000000002e t7 0000000003d8f538 > t8 0000000000000005 u0 0000000000000000 s9 0000000000000000 s0 00007ffffbf9fce8 > s1 0000000000000002 s2 0000000000000000 s3 00007ffff2f2c038 s4 0000555555820610 > s5 00007ffff2ed5000 s6 0000555555827e38 s7 00007ffffbf9fd00 s8 0000555555827e38 > ra: 00007ffff2d854fc > ERA: 00005555558206a4 > CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) > PRMD: 00000007 (PPLV3 +PIE -PWE) > EUEN: 00000000 (-FPE -SXE -ASXE -BTE) > ECFG: 0007181c (LIE=2-4,11-12 VS=7) > ESTAT: 000a0000 [BCE] (IS= ECode=10 EsubCode=0) > PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Tweak the BADV and CPUCFG.PRID lines in show_regs()WANG Xuerui
Use ISA manual names for BADV and CPUCFG.PRID lines in show_regs(), for stylistic consistency with the other lines already touched. While at it, also include current CPU's full name in show_regs() output. It may be more helpful for developers looking at the resulting dumps, because multiple distinct CPU models may share the same PRID. Not having this info available may hide problems only found on some but not all of the models sharing one specific PRID. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the ESTAT line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] ESTAT: 00001000 [INT] (IS=12 ECode=0 EsubCode=0) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the ECFG line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the EUEN line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the PRMD line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] PRMD: 00000004 (PPLV0 +PIE -PWE) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the CRMD line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) Some initial machinery for this pretty-printing format has been included in this patch as well. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Fix format of CSR lines during show_regs()WANG Xuerui
Use uppercase CSR names throughout for consistency with the manual wording, and right-align the keys. The "CSR" part is inferrable from context, hence dropped for more horizontal space. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Print symbol info for $ra and CSR.ERA only for kernel-mode contextsWANG Xuerui
Otherwise the addresses wouldn't make sense at all. While at it, align the "map keys" to maintain right-alignment with the "estat:" line too; also swap the ERA and ra lines so all CSRs are shown together. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Print GPRs with ABI names when showing registersWANG Xuerui
Show PC (CSR.ERA) in place of $zero, and also show the syscall restart flag (conveniently stuffed in regs[0]) if non-zero. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Define regular names for BCE/WATCH/HVC/GSPR exceptionsWANG Xuerui
Define them according to the ISA manual, in order to enable matching the sub-exceptions for humanization purposes later. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Clean up the architectural interrupt definitionsWANG Xuerui
While interrupts are assigned ECodes `64 + interrupt number`, all existing use sites of interrupt numbers want the 64 subtracted. Re-arrange the definitions so that the actual interrupt number is used everywhere, and make EXCCODE_INT_END inclusive as it is more intuitive that way. While at it, according to the asm/loongarch.h definitions, the total number of architectural interrupts should be 14, but various other places indicate otherwise (13 or 15). Those places have been adjusted to 14 as well for consistency. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-04-30Merge tag 'iommu-updates-v6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Convert to platform remove callback returning void - Extend changing default domain to normal group - Intel VT-d updates: - Remove VT-d virtual command interface and IOASID - Allow the VT-d driver to support non-PRI IOPF - Remove PASID supervisor request support - Various small and misc cleanups - ARM SMMU updates: - Device-tree binding updates: * Allow Qualcomm GPU SMMUs to accept relevant clock properties * Document Qualcomm 8550 SoC as implementing an MMU-500 * Favour new "qcom,smmu-500" binding for Adreno SMMUs - Fix S2CR quirk detection on non-architectural Qualcomm SMMU implementations - Acknowledge SMMUv3 PRI queue overflow when consuming events - Document (in a comment) why ATS is disabled for bypass streams - AMD IOMMU updates: - 5-level page-table support - NUMA awareness for memory allocations - Unisoc driver: Support for reattaching an existing domain - Rockchip driver: Add missing set_platform_dma_ops callback - Mediatek driver: Adjust the dma-ranges - Various other small fixes and cleanups * tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (82 commits) iommu: Remove iommu_group_get_by_id() iommu: Make iommu_release_device() static iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope() iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn) iommu/vt-d: Remove BUG_ON in map/unmap() iommu/vt-d: Remove BUG_ON when domain->pgd is NULL iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation iommu/vt-d: Remove BUG_ON on checking valid pfn range iommu/vt-d: Make size of operands same in bitwise operations iommu/vt-d: Remove PASID supervisor request support iommu/vt-d: Use non-privileged mode for all PASIDs iommu/vt-d: Remove extern from function prototypes iommu/vt-d: Do not use GFP_ATOMIC when not needed iommu/vt-d: Remove unnecessary checks in iopf disabling path iommu/vt-d: Move PRI handling to IOPF feature path iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path iommu/vt-d: Move iopf code from SVA to IOPF enabling path iommu/vt-d: Allow SVA with device-specific IOPF dmaengine: idxd: Add enable/disable device IOPF feature arm64: dts: mt8186: Add dma-ranges for the parent "soc" node ...
2023-04-30Merge tag 's390-6.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for stackleak feature. Also allow specifying architecture-specific stackleak poison function to enable faster implementation. On s390, the mvc-based implementation helps decrease typical overhead from a factor of 3 to just 25% - Convert all assembler files to use SYM* style macros, deprecating the ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS - Improve KASLR to also randomize module and special amode31 code base load addresses - Rework decompressor memory tracking to support memory holes and improve error handling - Add support for protected virtualization AP binding - Add support for set_direct_map() calls - Implement set_memory_rox() and noexec module_alloc() - Remove obsolete overriding of mem*() functions for KASAN - Rework kexec/kdump to avoid using nodat_stack to call purgatory - Convert the rest of the s390 code to use flexible-array member instead of a zero-length array - Clean up uaccess inline asm - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable DEBUG_FORCE_FUNCTION_ALIGN_64B - Resolve last_break in userspace fault reports - Simplify one-level sysctl registration - Clean up branch prediction handling - Rework CPU counter facility to retrieve available counter sets just once - Other various small fixes and improvements all over the code * tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (118 commits) s390/stackleak: provide fast __stackleak_poison() implementation stackleak: allow to specify arch specific stackleak poison function s390: select ARCH_USE_SYM_ANNOTATIONS s390/mm: use VM_FLUSH_RESET_PERMS in module_alloc() s390: wire up memfd_secret system call s390/mm: enable ARCH_HAS_SET_DIRECT_MAP s390/mm: use BIT macro to generate SET_MEMORY bit masks s390/relocate_kernel: adjust indentation s390/relocate_kernel: use SYM* macros instead of ENTRY(), etc. s390/entry: use SYM* macros instead of ENTRY(), etc. s390/purgatory: use SYM* macros instead of ENTRY(), etc. s390/kprobes: use SYM* macros instead of ENTRY(), etc. s390/reipl: use SYM* macros instead of ENTRY(), etc. s390/head64: use SYM* macros instead of ENTRY(), etc. s390/earlypgm: use SYM* macros instead of ENTRY(), etc. s390/mcount: use SYM* macros instead of ENTRY(), etc. s390/crc32le: use SYM* macros instead of ENTRY(), etc. s390/crc32be: use SYM* macros instead of ENTRY(), etc. s390/crypto,chacha: use SYM* macros instead of ENTRY(), etc. s390/amode31: use SYM* macros instead of ENTRY(), etc. ...
2023-04-30Merge tag 'kbuild-v6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Refactor scripts/kallsyms to make it faster and easier to maintain - Clean up menuconfig - Provide Clang with hard-coded target triple instead of CROSS_COMPILE - Use -z pack-relative-relocs flags instead of --use-android-relr-tags for arm64 CONFIG_RELR - Add srcdeb-pkg target to build only a Debian source package - Add KDEB_SOURCE_COMPRESS option to specify the compression for a Debian source package - Misc cleanups and fixes * tag 'kbuild-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: deb-pkg: specify targets in debian/rules as .PHONY sparc: unify sparc32/sparc64 archhelp kbuild: rpm-pkg: remove kernel-drm PROVIDES kbuild: deb-pkg: add KDEB_SOURCE_COMPRESS to specify source compression kbuild: add srcdeb-pkg target Makefile: use -z pack-relative-relocs kbuild: clang: do not use CROSS_COMPILE for target triple kconfig: menuconfig: reorder functions to remove forward declarations kconfig: menuconfig: remove unused M_EVENT macro kconfig: menuconfig: remove OLD_NCURSES macro kbuild: builddeb: Eliminate debian/arch use scripts/kallsyms: update the usage in the comment block scripts/kallsyms: decrease expand_symbol() / cleanup_symbol_name() calls scripts/kallsyms: change the output order scripts/kallsyms: move compiler-generated symbol patterns to mksysmap scripts/kallsyms: exclude symbols generated by itself dynamically scripts/mksysmap: use sed with in-line comments scripts/mksysmap: remove comments described in nm(1) scripts/kallsyms: remove redundant code for omitting U and N kallsyms: expand symbol name into comment for debugging
2023-04-29Merge tag 'efi-next-for-v6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - relocate the LoongArch kernel if the preferred address is already occupied - implement BTI annotations for arm64 EFI stub and zboot images - clean up arm64 zboot Kbuild rules for injecting the kernel code size * tag 'efi-next-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi/zboot: arm64: Grab code size from ELF symbol in payload efi/zboot: arm64: Inject kernel code size symbol into the zboot payload efi/zboot: Set forward edge CFI compat header flag if supported efi/zboot: Add BSS padding before compression arm64: efi: Enable BTI codegen and add PE/COFF annotation efi/pe: Import new BTI/IBT header flags from the spec efi/loongarch: Reintroduce efi_relocate_kernel() to relocate kernel
2023-04-29Merge tag 'clk-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "Nothing looks out of the ordinary in this batch of clk driver updates. There are a couple patches to the core clk framework, but they're all basically cleanups or debugging aids. The driver updates and new additions are dominated in the diffstat by Qualcomm and MediaTek drivers. Qualcomm gained a handful of new drivers for various SoCs, and MediaTek gained a bunch of drivers for MT8188. The MediaTek drivers are being modernized as well, so there are updates all over that vendor's clk drivers. There's also a couple other new clk drivers in here, for example the Starfive JH7110 SoC support is added. Outside of the two major SoC vendors though, we have the usual collection of non-critical fixes and cleanups to various clk drivers. It's good to see that we're getting more cleanups and modernization patches. Maybe one day we'll be able to properly split clk providers from clk consumers. Core: - Print an informational message before disabling unused clks New Drivers: - BCM63268 timer clock and reset controller - Frequency Hopping (FHCTL) on MediaTek MT6795, MT8173, MT8192 and MT8195 SoCs - Mediatek MT8188 SoC clk drivers - Clock driver for Sunplus SP7021 SoC - Clk driver support for Loongson-2 SoCs - Clock driver for Skyworks Si521xx I2C PCIe clock generators - Initial Starfive JH7110 clk/reset support - Global clock controller drivers for Qualcomm SM7150, IPQ9574, MSM8917 and IPQ5332 SoCs - GPU clock controller drivers for SM6115, SM6125, SM6375 and SA8775P SoCs Updates: - Shrink size of clk_fractional_divider a little - Convert various clk drivers to devm_of_clk_add_hw_provider() - Convert platform clk drivers to remove_new() - Converted most Mediatek clock drivers to struct platform_driver - MediaTek clock drivers can be built as modules - Reimplement Loongson-1 clk driver with DT support - Migrate socfpga clk driver to of_clk_add_hw_provider() - Support for i3c clks on Aspeed ast2600 SoCs - Add clock generic devm_clk_hw_register_gate_parent_data - Add audiomix block control for i.MX8MP - Add support for determine_rate to i.MX composite-8m - Let the LCDIF Pixel clock of i.MX8MM and i.MX8MN set parent rate - Provide clock name in error message for clk-gpr-mux on get parent failure - Drop duplicate imx_clk_mux_flags macro - Register the i.MX8MP Media Disp2 Pix clock as bus clock - Add Media LDB root clock to i.MX8MP - Make i.MX8MP nand_usdhc_bus clock as non-critical - Fix the rate table for i.MX fracn-gppll - Disable HW control for the fracn-gppll in order to be controlled by register write - Add support for interger PLL in fracn-gppll - Add mcore_booted module parameter to i.MX93 provider - Add NIC, A55 and ARM PLL clocks to i.MX93 - Fix i.MX8ULP XBAR_DIVBUS and AD_SLOW clock parents - Use "divider closest" clock type for PLL4_PFD dividers on i.MX8ULP to get more accurate clock rates - Mark the MU0_Bi and TPM5 clocks on i.MX8ULP as critical - Update some of the i.MX critical clocks flags to allow glitchless on-the-fly rate change. - Add I2C5 clock on Renesas R-Car V3H - Exynos850: Add CMU_G3D clock controller for the Mali GPU - Extract Exynos5433 (ARM64) clock controller power management code to common driver parts - Exynos850: make PMU_ALIVE_PCLK clock critical - Add Audio, thermal, camera (CSI-2), Image Signal Processor/Channel Selector (ISPCS), and video capture (VIN) clocks on Renesas R-Car V4H - Add video capture (VIN) clocks on Renesas R-Car V3H - Add Cortex-A53 System CPU (Z2) clocks on Renesas R-Car V3M and V3H - Support for Stromer Plus PLL on Qualcomm IPQ5332 - Add a missing reset to Qualcomm QCM2290 - Migrate Qualcomm IPQ4019 to clk_parent_data - Make USB GDSCs enter retention state when disabled on Qualcomm SM6375, MSM8996 and MSM8998 SoCs - Set floor rounding clk_ops for Qualcomm QCM2290 SDCC2 clk - Add two EMAC GDSCs on Qualcomm SC8280XP - Use shared rcg clk ops in Qualcomm SM6115 GCC - Park Qualcomm SM8350 PCIe PIPE clks when disabled - Add GDSCs to Qualcomm SC7280 LPASS audio clock controller - Add missing XO clocks to Qualcomm MSM8226 and MSM8974 - Convert some Qualcomm clk DT bindings to YAML - Reparenting fix for the clock supplying camera modules on Rockchip rk3399 - Mark more critical (bus-)clocks on Rockchip rk3588" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (290 commits) clk: qcom: gcc-sc8280xp: Add EMAC GDSCs clk: starfive: Delete the redundant dev_set_drvdata() in JH7110 clock drivers clk: rockchip: rk3588: make gate linked clocks critical clk: qcom: dispcc-qcm2290: Remove inexistent DSI1PHY clk clk: qcom: add the GPUCC driver for sa8775p dt-bindings: clock: qcom: describe the GPUCC clock for SA8775P clk: qcom: gcc-sm8350: fix PCIe PIPE clocks handling clk: qcom: lpassaudiocc-sc7280: Add required gdsc power domain clks in lpass_cc_sc7280_desc clk: qcom: lpasscc-sc7280: Skip qdsp6ss clock registration dt-bindings: clock: qcom,sc7280-lpasscc: Add qcom,adsp-pil-mode property clk: starfive: Avoid casting iomem pointers clk: microchip: fix potential UAF in auxdev release callback clk: qcom: rpm: Use managed `of_clk_add_hw_provider()` clk: mediatek: fhctl: Mark local variables static clk: sifive: make SiFive clk drivers depend on ARCH_ symbols clk: uniphier: Use managed `of_clk_add_hw_provider()` clk: si5351: Use managed `of_clk_add_hw_provider()` clk: si570: Use managed `of_clk_add_hw_provider()` clk: si514: Use managed `of_clk_add_hw_provider()` clk: lmk04832: Use managed `of_clk_add_hw_provider()` ...
2023-04-29RISC-V: fixup in-flight collision with ARCH_WANT_OPTIMIZE_VMEMMAP renameConor Dooley
Lukas warned that ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP had been renamed in the mm tree & that RISC-V would need a fixup as part of the merge. The warning was missed however, and RISC-V is selecting the orphaned Kconfig option. Fixes: 89d77f71f493 ("Merge tag 'riscv-for-linus-6.4-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux") Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>. Link: https://lore.kernel.org/linux-riscv/CAKXUXMyVeg2kQK_edKHtMD3eADrDK_PKhCSVkMrLDdYgTQQ5rg@mail.gmail.com/ Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230429-trilogy-jolly-12bf5c53d62d@spud Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29RISC-V: fix sifive and thead section mismatches in errataRandy Dunlap
When CONFIG_MODULES is set, __init_or_module becomes <empty>, but when CONFIG_MODULES is not set, __init_or_module becomes __init. In the latter case, it causes section mismatch warnings: WARNING: modpost: vmlinux.o: section mismatch in reference: riscv_fill_cpu_mfr_info (section: .text) -> sifive_errata_patch_func (section: .init.text) WARNING: modpost: vmlinux.o: section mismatch in reference: riscv_fill_cpu_mfr_info (section: .text) -> thead_errata_patch_func (section: .init.text) Fixes: bb3f89487fd9 ("RISC-V: hwprobe: Remove __init on probe_vendor_features()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Evan Green <evan@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230429155247.12131-1-rdunlap@infradead.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29RISC-V: Align SBI probe implementation with specAndrew Jones
sbi_probe_extension() is specified with "Returns 0 if the given SBI extension ID (EID) is not available, or 1 if it is available unless defined as any other non-zero value by the implementation." Additionally, sbiret.value is a long. Fix the implementation to ensure any nonzero long value is considered a success, rather than only positive int values. Fixes: b9dcd9e41587 ("RISC-V: Add basic support for SBI v0.2") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230427163626.101042-1-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29riscv: mm: remove redundant parameter of create_fdt_early_page_tableSong Shuai
create_fdt_early_page_table() explicitly uses early_pg_dir for 32-bit fdt mapping and the pgdir parameter is redundant here. So remove it and its caller. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Fixes: ef69d2559fe9 ("riscv: Move early dtb mapping into the fixmap region") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230426100009.685435-1-suagrfillet@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29Merge patch series "RISC-V Hibernation Support"Palmer Dabbelt
Sia Jee Heng <jeeheng.sia@starfivetech.com> says: This series adds RISC-V Hibernation/suspend to disk support. Low level Arch functions were created to support hibernation. swsusp_arch_suspend() relies code from __cpu_suspend_enter() to write cpu state onto the stack, then calling swsusp_save() to save the memory image. Arch specific hibernation header is implemented and is utilized by the arch_hibernation_header_restore() and arch_hibernation_header_save() functions. The arch specific hibernation header consists of satp, hartid, and the cpu_resume address. The kernel built version is also need to be saved into the hibernation image header to making sure only the same kernel is restore when resume. swsusp_arch_resume() creates a temporary page table that covering only the linear map. It copies the restore code to a 'safe' page, then start to restore the memory image. Once completed, it restores the original kernel's page table. It then calls into __hibernate_cpu_resume() to restore the CPU context. Finally, it follows the normal hibernation path back to the hibernation core. To enable hibernation/suspend to disk into RISCV, the below config need to be enabled: - CONFIG_HIBERNATION - CONFIG_ARCH_HIBERNATION_HEADER - CONFIG_ARCH_HIBERNATION_POSSIBLE At high-level, this series includes the following changes: 1) Change suspend_save_csrs() and suspend_restore_csrs() to public function as these functions are common to suspend/hibernation. (patch 1) 2) Refactor the common code in the __cpu_resume_enter() function and __hibernate_cpu_resume() function. The common code are used by hibernation and suspend. (patch 2) 3) Enhance kernel_page_present() function to support huge page. (patch 3) 4) Add arch/riscv low level functions to support hibernation/suspend to disk. (patch 4) * b4-shazam-merge: RISC-V: Add arch functions to support hibernation/suspend-to-disk RISC-V: mm: Enable huge page support to kernel_page_present() function RISC-V: Factor out common code of __cpu_resume_enter() RISC-V: Change suspend_save_csrs and suspend_restore_csrs to public function Link: https://lore.kernel.org/r/20230330064321.1008373-1-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29riscv: Adjust dependencies of HAVE_DYNAMIC_FTRACE selectionNathan Chancellor
When building allmodconfig with clang and its integrated assembler and linking with a version of GNU ld prior to 2.36, the following link error occurs: riscv64-linux-gnu-ld: .init.data has both ordered [`__patchable_function_entries' in init/main.o] and unordered [`.init_array.0' in kernel/trace/trace_benchmark.o] sections riscv64-linux-gnu-ld: final link failed: bad value This is the same error addressed by commit 45bd8951806e ("arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang") for arm64. See that changelog for a full description of why this error occurs with this combination of tools. In a similar manner as that change, restrict the CONFIG_HAVE_DYNAMIC_FTRACE selection to combinations of tools known to work so that there are no errors. Link: https://github.com/ClangBuiltLinux/linux/issues/1817 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230404-riscv-dynamic-ftrace-checks-clang-v1-1-0ce296b7d423@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29RISC-V: Add arch functions to support hibernation/suspend-to-diskSia Jee Heng
Low level Arch functions were created to support hibernation. swsusp_arch_suspend() relies code from __cpu_suspend_enter() to write cpu state onto the stack, then calling swsusp_save() to save the memory image. Arch specific hibernation header is implemented and is utilized by the arch_hibernation_header_restore() and arch_hibernation_header_save() functions. The arch specific hibernation header consists of satp, hartid, and the cpu_resume address. The kernel built version is also need to be saved into the hibernation image header to making sure only the same kernel is restore when resume. swsusp_arch_resume() creates a temporary page table that covering only the linear map. It copies the restore code to a 'safe' page, then start to restore the memory image. Once completed, it restores the original kernel's page table. It then calls into __hibernate_cpu_resume() to restore the CPU context. Finally, it follows the normal hibernation path back to the hibernation core. To enable hibernation/suspend to disk into RISCV, the below config need to be enabled: - CONFIG_HIBERNATION - CONFIG_ARCH_HIBERNATION_HEADER - CONFIG_ARCH_HIBERNATION_POSSIBLE Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com> Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Mason Huo <mason.huo@starfivetech.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230330064321.1008373-5-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29RISC-V: mm: Enable huge page support to kernel_page_present() functionSia Jee Heng
Currently kernel_page_present() function doesn't support huge page detection causes the function to mistakenly return false to the hibernation core. Add huge page detection to the function to solve the problem. Fixes: 9e953cda5cdf ("riscv: Introduce huge page support for 32/64bit kernel") Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com> Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Mason Huo <mason.huo@starfivetech.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230330064321.1008373-4-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29RISC-V: Factor out common code of __cpu_resume_enter()Sia Jee Heng
The cpu_resume() function is very similar for the suspend to disk and suspend to ram cases. Factor out the common code into suspend_restore_csrs macro and suspend_restore_regs macro. Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230330064321.1008373-3-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29RISC-V: Change suspend_save_csrs and suspend_restore_csrs to public functionSia Jee Heng
Currently suspend_save_csrs() and suspend_restore_csrs() functions are statically defined in the suspend.c. Change the function's attribute to public so that the functions can be used by hibernation as well. Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com> Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Mason Huo <mason.huo@starfivetech.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230330064321.1008373-2-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-29Merge tag 'dma-mapping-6.4-2023-04-28' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - fix a PageHighMem check in dma-coherent initialization (Doug Berger) - clean up the coherency defaul initialiation (Jiaxun Yang) - add cacheline to user/kernel dma-debug space dump messages (Desnes Nunes, Geert Uytterhoeve) - swiotlb statistics improvements (Michael Kelley) - misc cleanups (Petr Tesarik) * tag 'dma-mapping-6.4-2023-04-28' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: Omit total_used and used_hiwater if !CONFIG_DEBUG_FS swiotlb: track and report io_tlb_used high water marks in debugfs swiotlb: fix debugfs reporting of reserved memory pools swiotlb: relocate PageHighMem test away from rmem_swiotlb_setup of: address: always use dma_default_coherent for default coherency dma-mapping: provide CONFIG_ARCH_DMA_DEFAULT_COHERENT dma-mapping: provide a fallback dma_default_coherent dma-debug: Use %pa to format phys_addr_t dma-debug: add cacheline to user/kernel space dump messages dma-debug: small dma_debug_entry's comment and variable name updates dma-direct: cleanup parameters to dma_direct_optimal_gfp_mask
2023-04-29locking/x86: Define arch_try_cmpxchg_local()Uros Bizjak
Define target specific arch_try_cmpxchg_local(). This definition overrides the generic arch_try_cmpxchg_local() fallback definition and enables target-specific implementation of try_cmpxchg_local(). Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230405141710.3551-5-ubizjak@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org>