path: root/arch/riscv/kernel
Age | Commit message | Author
2025-05-08 | riscv: Add SiFive xsfvfnrclipxfqf vendor extension | Cyan Yang
Add SiFive vendor extension "xsfvfnrclipxfqf" support to the kernel. Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Link: https://lore.kernel.org/r/20250418053239.4351-7-cyan.yang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 | riscv: hwprobe: Add SiFive vendor extension support and probe for xsfqmaccdod and xsfqmaccqoq | Cyan Yang
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0" which allows userspace to probe for the new vendor extensions from SiFive. Also, add hwprobe support for the SiFive "xsfvqmaccdod" and "xsfvqmaccqoq" vendor extensions. Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Link: https://lore.kernel.org/r/20250418053239.4351-5-cyan.yang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 | riscv: Add SiFive xsfvqmaccdod and xsfvqmaccqoq vendor extensions | Cyan Yang
Add SiFive vendor extension support to the kernel, targeting "xsfvqmaccdod" and "xsfvqmaccqoq". Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Link: https://lore.kernel.org/r/20250418053239.4351-3-cyan.yang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 | riscv: vDSO: Remove --hash-style=both | Xi Ruoyao
When RISC-V was born, DT_GNU_HASH had already become the de facto standard, so DT_HASH just wastes storage space. Remove the explicit --hash-style=both setting and rely on the distro toolchain default, which is most likely "gnu" (i.e. generating only DT_GNU_HASH, no DT_HASH). This follows the logic of commit 48f6430505c0 ("arm64/vdso: Remove --hash-style=sysv"). Signed-off-by: Xi Ruoyao <xry111@xry111.site> Link: https://lore.kernel.org/r/20250224112042.60282-2-xry111@xry111.site Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 | Merge patch series "riscv: uaccess: optimisations" | Palmer Dabbelt
Cyril Bur <cyrilbur@tenstorrent.com> says:

This series tries to optimize riscv uaccess by allowing the use of user_access_begin() and user_access_end(), which permits grouping user accesses and avoiding the CSR write penalty for each access. The error path can also be optimised using asm goto, which patches 3 and 4 achieve. This speeds up jumping to labels by avoiding the need for an intermediary error type variable within the uaccess macros.

I did read the discussion this series generated. It isn't clear to me which direction to take the patches, if any.

* b4-shazam-merge:
  riscv: uaccess: use 'asm_goto_output' for get_user()
  riscv: uaccess: use 'asm goto' for put_user()
  riscv: uaccess: use input constraints for ptr of __put_user()
  riscv: implement user_access_begin() and families
  riscv: save the SR_SUM status over switches

Link: https://lore.kernel.org/r/20250410070526.3160847-1-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 | riscv: save the SR_SUM status over switches | Ben Dooks
When threads/tasks are switched we need to ensure the old execution's SR_SUM state is saved and the new thread has the old SR_SUM state restored.

The issue was seen under heavy load, especially with the syz-stress tool running, with crashes as follows in schedule_tail:

    Unable to handle kernel access to user memory without uaccess routines at virtual address 000000002749f0d0
    Oops [#1]
    Modules linked in:
    CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted 5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
    Hardware name: riscv-virtio,qemu (DT)
    epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
     ra : task_pid_vnr include/linux/sched.h:1421 [inline]
     ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
    epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
     gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
     t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
     s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
     a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
     a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
     s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
     s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
     s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
     s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
     t5 : ffffffc4043cafba t6 : 0000000000040000
    status: 0000000000000120 badaddr: 000000002749f0d0 cause: 000000000000000f
    Call Trace:
    [<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
    [<ffffffe000005570>] ret_from_exception+0x0/0x14
    Dumping ftrace buffer:
       (ftrace buffer empty)
    ---[ end trace b5f8f9231dc87dda ]---

The issue comes from the put_user() in schedule_tail (kernel/sched/core.c) doing the following:

    asmlinkage __visible void schedule_tail(struct task_struct *prev)
    {
    	...
    	if (current->set_child_tid)
    		put_user(task_pid_vnr(current), current->set_child_tid);
    	...
    }

The put_user() macro causes the code sequence to come out as follows:

    1: __enable_user_access()
    2: reg = task_pid_vnr(current);
    3: *current->set_child_tid = reg;
    4: __disable_user_access()

The problem is that we may have a sleeping function as argument which could clear SR_SUM, causing the panic above. This was fixed by evaluating the argument of the put_user() macro outside the user-enabled section in commit 285a76bb2cf5 ("riscv: evaluate put_user() arg before enabling user access").

In order for riscv to take advantage of unsafe_get/put_XXX() macros and to avoid the same issue we had with put_user() and sleeping functions, we must ensure code flow can go through switch_to() from within a region of code with SR_SUM enabled and come back with SR_SUM still enabled. This patch addresses the problem, allowing future work to enable full use of unsafe_get/put_XXX() macros without needing to take a CSR bit flip cost on every access. Make switch_to() save and restore SR_SUM.

Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Link: https://lore.kernel.org/r/20250410070526.3160847-2-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
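The ordering problem described in this message can be modelled in plain userspace C. The sketch below is only an illustration under stated assumptions: a single boolean stands in for the SR_SUM status bit, and every name in it (`model_sr_sum`, `model_switch_lossy`, `model_put_user_arg_first`, and so on) is hypothetical and not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SR_SUM bit of the status CSR. */
bool model_sr_sum;

/* Models a context switch that loses SR_SUM (behaviour before the fix). */
void model_switch_lossy(void)
{
	model_sr_sum = false;
}

/* Models switch_to() saving and restoring SR_SUM around another task. */
void model_switch_saving(void)
{
	bool saved = model_sr_sum;

	model_sr_sum = false;	/* the other task runs with its own state */
	model_sr_sum = saved;	/* restored when this task is switched back in */
}

/* Models an argument such as task_pid_vnr(current) that may sleep. */
int model_sleepy_value(void (*do_switch)(void))
{
	do_switch();
	return 42;
}

/* Models the user store: it only succeeds while SR_SUM is set. */
bool model_user_store(int *dst, int val)
{
	assert(dst != NULL);
	if (!model_sr_sum)
		return false;	/* the real kernel would oops here */
	*dst = val;
	return true;
}

/* Argument evaluated inside the user-access window (the original bug). */
bool model_put_user_arg_inside(int *dst, void (*do_switch)(void))
{
	bool ok;

	model_sr_sum = true;
	ok = model_user_store(dst, model_sleepy_value(do_switch));
	model_sr_sum = false;
	return ok;
}

/* Argument evaluated before the window opens (the 285a76bb2cf5 ordering). */
bool model_put_user_arg_first(int *dst, void (*do_switch)(void))
{
	int val = model_sleepy_value(do_switch);
	bool ok;

	model_sr_sum = true;
	ok = model_user_store(dst, val);
	model_sr_sum = false;
	return ok;
}
```

In this model, `model_put_user_arg_inside()` fails with the lossy switch and succeeds with the saving one, which mirrors why saving SR_SUM in switch_to() lets a sleeping call sit inside a user-access region.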
2025-05-08 | riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm | Samuel Holland
When the prctl() interface for pointer masking was added, it did not check that the pointer masking ISA extension was supported, only the individual submodes. Userspace could still attempt to disable pointer masking and query the pointer masking state. commit 81de1afb2dd1 ("riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL") disallowed the former, as the senvcfg write could crash on older systems. PR_GET_TAGGED_ADDR_CTRL state does not crash, because it reads only kernel-internal state and not senvcfg, but it should still be disallowed for consistency. Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20250507145230.2272871-1-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
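From userspace, the query discussed here goes through prctl(2). The following is a minimal sketch of such a caller, assuming a Linux system; the helper name `query_tagged_addr_ctrl` is hypothetical, and the fallback constant is only there for older libc headers. The commit message does not say which error code is returned without Supm, so the sketch just passes the errno value through.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Fallback for older headers; 56 is the value used by <linux/prctl.h>. */
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_GET_TAGGED_ADDR_CTRL 56
#endif

/*
 * Hypothetical helper: returns the tagged-address control word (>= 0),
 * or a negative errno value when the kernel rejects the query.
 */
int query_tagged_addr_ctrl(void)
{
	int ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);

	if (ret < 0) {
		assert(errno > 0);
		return -errno;
	}
	return ret;
}
```

A caller would treat any negative return as "pointer masking state is not available" rather than assuming a particular error code.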
2025-05-08 | riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL | Nam Cao
When userspace does PR_SET_TAGGED_ADDR_CTRL, but the Supm extension is not available, the kernel crashes:

    Oops - illegal instruction [#1]
        [snip]
    epc : set_tagged_addr_ctrl+0x112/0x15a
     ra : set_tagged_addr_ctrl+0x74/0x15a
    epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
        [snip]
    status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
        set_tagged_addr_ctrl+0x112/0x15a
        __riscv_sys_prctl+0x352/0x73c
        do_trap_ecall_u+0x17c/0x20c
        handle_exception+0x150/0x15c

Fix it by checking if Supm is available.

Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20250504101920.3393053-1-namcao@linutronix.de
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 | riscv: misaligned: use get_user() instead of __get_user() | Clément Léger
Now that we can safely handle user memory accesses while in the misaligned access handlers, use get_user() instead of __get_user() to have user memory access checks. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-4-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 | riscv: misaligned: enable IRQs while handling misaligned accesses | Clément Léger
We can safely reenable IRQs if coming from userspace. This allows access to user memory, which could potentially trigger a page fault. Fixes: b686ecdeacf6 ("riscv: misaligned: Restrict user access to kernel memory") Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 | riscv: misaligned: factorize trap handling | Clément Léger
Since both load/store and user/kernel should use almost the same path, and we are going to add some code around that, factorize it. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-05 | riscv: misaligned: Add handling for ZCB instructions | Nylon Chen
Add support for the Zcb extension's compressed half-word instructions (C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler. Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Nylon Chen <nylon.chen@sifive.com> Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-29 | riscv: entry: Split ret_from_fork() into user and kernel | Charlie Jenkins
This function was unified into a single function in commit ab9164dae273 ("riscv: entry: Consolidate ret_from_kernel_thread into ret_from_fork"). However, that imposed a performance degradation. Partially reverting that commit so that ret_from_fork() is split again results in a 1% increase in the number of times fork can be called per second. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/all/20250320-riscv_optimize_entry-v6-2-63e187e26041@rivosinc.com
2025-04-29 | riscv: entry: Convert ret_from_fork() to C | Charlie Jenkins
Move the main section of ret_from_fork() to C to allow inlining of syscall_exit_to_user_mode(). Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/all/20250320-riscv_optimize_entry-v6-1-63e187e26041@rivosinc.com
2025-04-25 | Merge tag 'riscv-for-linus-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux | Linus Torvalds
Pull RISC-V fixes from Palmer Dabbelt:
 - A fix for a missing icache flush in uprobes, which manifests as at least a BPF selftest failure on the Spacemit X1
 - A workaround for build warnings in flush_icache_range()
* tag 'riscv-for-linus-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: uprobes: Add missing fence.i after building the XOL buffer
  riscv: Replace function-like macro by static inline function
2025-04-24 | riscv: uprobes: Add missing fence.i after building the XOL buffer | Björn Töpel
The XOL (execute out-of-line) buffer is used to single-step the replaced instruction(s) for uprobes. The RISC-V port was missing a proper fence.i (i$ flushing) after constructing the XOL buffer, which can result in incorrect execution of stale/broken instructions. This was found running the BPF selftests "test_progs: uprobe_autoattach, attach_probe" on the Spacemit K1/X60, where the uprobes tests randomly blew up. Reviewed-by: Guo Ren <guoren@kernel.org> Fixes: 74784081aac8 ("riscv: Add uprobes supported") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250419111402.1660267-2-bjorn@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-23 | Fix mis-uses of 'cc-option' for warning disablement | Linus Torvalds
This was triggered by one of my mis-uses causing odd build warnings on sparc in linux-next, but while figuring out why the "obviously correct" use of cc-option caused such odd breakage, I found eight other cases of the same thing in the tree. The root cause is that 'cc-option' doesn't work for checking negative warning options (ie things like '-Wno-stringop-overflow') because gcc will silently accept options it doesn't recognize, and so 'cc-option' ends up thinking they are perfectly fine. And it all works, until you have a situation where _another_ warning is emitted. At that point the compiler will go "Hmm, maybe the user intended to disable this warning but used that wrong option that I didn't recognize", and generate a warning for the unrecognized negative option. Which explains why we have several cases of this in the tree: the 'cc-option' test really doesn't work for this situation, but most of the time it simply doesn't matter that it doesn't work. The reason my recently added case caused problems on sparc was pointed out by Thomas Weißschuh: the sparc build had a previous explicit warning that then triggered the new one. I think the best fix for this would be to make 'cc-option' a bit smarter about this situation, possibly by adding an intentional warning to the test case that then triggers the unrecognized option warning reliably. But the short-term fix is to replace 'cc-option' with an existing helper designed for this exact case: 'cc-disable-warning', which picks the negative warning but uses the positive form for testing the compiler support. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/all/20250422204718.0b4e3f81@canb.auug.org.au/ Explained-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-18 | Merge patch series "riscv: misaligned: Add ZCB handling and fix sleeping function" | Palmer Dabbelt
Nylon Chen <nylon.chen@sifive.com> says:
1. Adds support for ZCB compressed instructions (C.LHU, C.LH, C.SH).
2. Fixes a bug where copy_from/to_user() calls in non-sleepable contexts triggered attempts to sleep.
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Nylon Chen (2):
  riscv: misaligned: Add handling for ZCB instructions
  riscv: misaligned: fix sleeping function called during misaligned access handling
* b4-shazam-merge:
  riscv: misaligned: fix sleeping function called during misaligned access handling
  riscv: misaligned: Add handling for ZCB instructions
Link: https://lore.kernel.org/r/20250411073850.3699180-1-nylon.chen@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-18 | riscv: misaligned: Add handling for ZCB instructions | Nylon Chen
Add support for the Zcb extension's compressed half-word instructions (C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler. Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Nylon Chen <nylon.chen@sifive.com> Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-18 | riscv: misaligned: fix sleeping function called during misaligned access handling | Nylon Chen
Use copy_from_user_nofault() and copy_to_user_nofault() instead of copy_from/to_user functions in the misaligned access trap handlers.

The following bug report was found when executing misaligned memory accesses:

    BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:162
    in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 115, name: two
    preempt_count: 0, expected: 0
    CPU: 0 UID: 0 PID: 115 Comm: two Not tainted 6.14.0-rc5 #24
    Hardware name: riscv-virtio,qemu (DT)
    Call Trace:
     [<ffffffff800160ea>] dump_backtrace+0x1c/0x24
     [<ffffffff80002304>] show_stack+0x28/0x34
     [<ffffffff80010fae>] dump_stack_lvl+0x4a/0x68
     [<ffffffff80010fe0>] dump_stack+0x14/0x1c
     [<ffffffff8004e44e>] __might_resched+0xfa/0x104
     [<ffffffff8004e496>] __might_sleep+0x3e/0x62
     [<ffffffff801963c4>] __might_fault+0x1c/0x24
     [<ffffffff80425352>] _copy_from_user+0x28/0xaa
     [<ffffffff8000296c>] handle_misaligned_store+0x204/0x254
     [<ffffffff809eae82>] do_trap_store_misaligned+0x24/0xee
     [<ffffffff809f4f1a>] handle_exception+0x146/0x152

Fixes: b686ecdeacf6 ("riscv: misaligned: Restrict user access to kernel memory")
Fixes: 441381506ba7 ("riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code")
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Link: https://lore.kernel.org/r/20250411073850.3699180-3-nylon.chen@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 | Merge tag 'riscv-fixes-6.15-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into fixes | Palmer Dabbelt
riscv fixes for 6.15-rc3
 - A couple of fixes regarding module relocations
 - Fix a build error by implementing missing alternative macros
 - Another fix for kexec by fixing /proc/iomem
* tag 'riscv-fixes-6.15-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux:
  riscv: Avoid fortify warning in syscall_get_arguments()
  riscv: Provide all alternative macros all the time
  riscv: module: Allocate PLT entries for R_RISCV_PLT32
  riscv: module: Fix out-of-bounds relocation access
  riscv: Properly export reserved regions in /proc/iomem
  riscv: Fix unaligned access info messages
2025-04-16 | Merge patch series "riscv: Rework the arch_kgdb_breakpoint() implementation" | Palmer Dabbelt
WangYuli <wangyuli@uniontech.com> says:
1. The arch_kgdb_breakpoint() function defines the kgdb_compiled_break symbol using inline assembly. There's a potential issue where the compiler might inline arch_kgdb_breakpoint(), which would then define the kgdb_compiled_break symbol multiple times, causing vmlinux.o to fail to link. This isn't merely a potential compilation problem. The intent here is to determine the global symbol address of kgdb_compiled_break, and if this function is inlined multiple times, it would logically be a grave error.
2. Remove ".option norvc/.option rvc" to fix a bug where the C extension would be unconditionally enabled even if the kernel is being built with CONFIG_RISCV_ISA_C=n.
* b4-shazam-merge:
  riscv: KGDB: Remove ".option norvc/.option rvc" for kgdb_compiled_break
  riscv: KGDB: Do not inline arch_kgdb_breakpoint()
Link: https://lore.kernel.org/r/D5A83DF3A06E1DF9+20250411072905.55134-1-wangyuli@uniontech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 | riscv: KGDB: Remove ".option norvc/.option rvc" for kgdb_compiled_break | WangYuli
[ Quoting Samuel Holland: ] This is a separate issue, but using ".option rvc" here is a bug. It will unconditionally enable the C extension for the rest of the file, even if the kernel is being built with CONFIG_RISCV_ISA_C=n. [ Quoting Palmer Dabbelt: ] We're just looking at the address of kgdb_compiled_break, so it's fine if it ends up as a c.ebreak. [ Quoting Alexandre Ghiti: ] .option norvc is used to prevent the assembler from using compressed instructions, but it's generally used when we need to ensure the size of the instructions that are used, which is not the case here as noted by Palmer since we only care about the address. So yes it will work fine with C enabled :) So let's just remove them all. Link: https://lore.kernel.org/all/4b4187c1-77e5-44b7-885f-d6826723dd9a@sifive.com/ Link: https://lore.kernel.org/all/mhng-69513841-5068-441d-be8f-2aeebdc56a08@palmer-ri-x1c9a/ Link: https://lore.kernel.org/all/23693e7f-4fff-40f3-a437-e06d827278a5@ghiti.fr/ Fixes: fe89bd2be866 ("riscv: Add KGDB support") Cc: Samuel Holland <samuel.holland@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: WangYuli <wangyuli@uniontech.com> Link: https://lore.kernel.org/r/8B431C6A4626225C+20250411073222.56820-2-wangyuli@uniontech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 | riscv: KGDB: Do not inline arch_kgdb_breakpoint() | WangYuli
The arch_kgdb_breakpoint() function defines the kgdb_compiled_break symbol using inline assembly. There's a potential issue where the compiler might inline arch_kgdb_breakpoint(), which would then define the kgdb_compiled_break symbol multiple times, causing vmlinux.o to fail to link. This isn't merely a potential compilation problem. The intent here is to determine the global symbol address of kgdb_compiled_break, and if this function is inlined multiple times, it would logically be a grave error. Link: https://lore.kernel.org/all/4b4187c1-77e5-44b7-885f-d6826723dd9a@sifive.com/ Link: https://lore.kernel.org/all/5b0adf9b-2b22-43fe-ab74-68df94115b9a@ghiti.fr/ Link: https://lore.kernel.org/all/23693e7f-4fff-40f3-a437-e06d827278a5@ghiti.fr/ Fixes: fe89bd2be866 ("riscv: Add KGDB support") Co-developed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: WangYuli <wangyuli@uniontech.com> Link: https://lore.kernel.org/r/F22359AFB6FF9FD8+20250411073222.56820-1-wangyuli@uniontech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 | RISC-V: vDSO: Wire up getrandom() vDSO implementation | Xi Ruoyao
Hook up the generic vDSO implementation to the generic vDSO getrandom implementation by providing the required __arch_chacha20_blocks_nostack and getrandom_syscall implementations. Also wire up the selftests.

The benchmark result:

    vdso: 25000000 times in 2.466341333 seconds
    libc: 25000000 times in 41.447720005 seconds
    syscall: 25000000 times in 41.043926672 seconds

    vdso: 25000000 x 256 times in 162.286219353 seconds
    libc: 25000000 x 256 times in 2953.855018685 seconds
    syscall: 25000000 x 256 times in 2796.268546000 seconds

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Link: https://lore.kernel.org/r/20250411024600.16045-1-xry111@xry111.site
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
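To put the benchmark figures above in proportion, the speedup can be worked out as a plain ratio of elapsed times. The sketch below does only that arithmetic; the function and variable names are hypothetical, and the timings are copied from the commit message.

```c
#include <assert.h>

/* Ratio of baseline time to candidate time; > 1 means the candidate is faster. */
double speedup(double baseline_seconds, double candidate_seconds)
{
	assert(candidate_seconds > 0.0);
	return baseline_seconds / candidate_seconds;
}

/* Timings quoted in the commit message: 25000000 single calls. */
const double vdso_single = 2.466341333;
const double libc_single = 41.447720005;
const double syscall_single = 41.043926672;

/* Timings quoted in the commit message: 25000000 x 256 calls. */
const double vdso_bulk = 162.286219353;
const double libc_bulk = 2953.855018685;
const double syscall_bulk = 2796.268546000;
```

With these inputs, `speedup(libc_single, vdso_single)` comes out at about 16.8 and `speedup(libc_bulk, vdso_bulk)` at about 18.2, so the vDSO path is roughly 17 to 18 times faster than the libc and syscall paths in this benchmark.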
2025-04-14 | riscv: module: Allocate PLT entries for R_RISCV_PLT32 | Samuel Holland
apply_r_riscv_plt32_rela() may need to emit a PLT entry for the referenced symbol, so there must be space allocated in the PLT. Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250409171526.862481-2-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 | riscv: module: Fix out-of-bounds relocation access | Samuel Holland
The current code allows rel[j] to access one element past the end of the relocation section. Simplify to num_relocations which is equivalent to the existing size expression. Fixes: 080c4324fa5e ("riscv: optimize ELF relocation function in riscv") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250409171526.862481-1-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
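The class of bug described here, an index that is allowed to equal the element count, can be shown with a small circular scan. The commit message does not quote the affected loop, so the following is an assumed shape for illustration only; `next_index_buggy`, `next_index_fixed` and `circular_find` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrap helper with the off-by-one: j == count slips through. */
size_t next_index_buggy(size_t j, size_t count)
{
	j++;
	if (j > count)
		j = 0;
	return j;
}

/* Corrected helper: wrap as soon as the index reaches count. */
size_t next_index_fixed(size_t j, size_t count)
{
	assert(count > 0 && j < count);
	j++;
	if (j == count)
		j = 0;
	return j;
}

/* Circular search for key starting at start; returns count if absent. */
size_t circular_find(const unsigned long *table, size_t count,
		     size_t start, unsigned long key)
{
	size_t j = start;

	assert(table != NULL && count > 0 && start < count);
	do {
		if (table[j] == key)
			return j;
		j = next_index_fixed(j, count);
	} while (j != start);
	return count;
}
```

With four elements, `next_index_buggy(3, 4)` returns 4, which would index one element past the end, while `next_index_fixed(3, 4)` wraps to 0.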
2025-04-14 | riscv: Properly export reserved regions in /proc/iomem | Björn Töpel
/proc/iomem represents the kernel's memory map. Regions marked "Reserved" tell the user that the range should not be tampered with. Kexec-tools, when using the older kexec_load syscall, relies on the "Reserved" regions to build the memory segments that will be the target of the new kexec'd kernel. The RISC-V port tries to expose all reserved regions to userland, but some regions were not properly exposed: regions that resided in both the "regular" and reserved memory block, e.g. the EFI Memory Map. A missing entry could result in reserved memory being overwritten. It turns out that arm64 and loongarch had a similar issue a while back: commit d91680e687f4 ("arm64: Fix /proc/iomem for reserved but not memory regions") commit 50d7ba36b916 ("arm64: export memblock_reserve()d regions via /proc/iomem") Similar to the other ports, resolve the issue by splitting the regions in an arch initcall, since we need a working allocator. Fixes: ffe0e5261268 ("RISC-V: Improve init_resources()") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250409182129.634415-1-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 | riscv: Fix unaligned access info messages | Andrew Jones
Ensure we only print messages about command line parameters when the parameters are actually in use. Also complain about the use of the vector parameter when vector support isn't available. Fixes: aecb09e091dc ("riscv: Add parameter for skipping access speed tests") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/all/CAMuHMdVEp2_ho51gkpLLJG2HimqZ1gZ0fa=JA4uNNZjFFqaPMg@mail.gmail.com/ Closes: https://lore.kernel.org/all/CAMuHMdWVMP0MYCLFq+b7H_uz-2omdFiDDUZq0t_gw0L9rrJtkQ@mail.gmail.com/ Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250409153650.84433-2-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-07 | riscv: Use kvmalloc_array on relocation_hashtable | Will Pierce
The number of relocations may be a huge value that is unallocatable by kmalloc. Use kvmalloc instead so that it does not fail. Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations") Suggested-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Will Pierce <wgpierce17@gmail.com> Link: https://lore.kernel.org/r/20250402081426.5197-1-wgpierce17@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-04 | Merge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux | Linus Torvalds
Pull RISC-V updates from Palmer Dabbelt:
 - The sub-architecture selection Kconfig system has been cleaned up, the documentation has been improved, and various detections have been fixed
 - The vector-related extensions dependencies are now validated when parsing from device tree and in the DT bindings
 - Misaligned access probing can be overridden via a kernel command-line parameter, along with various fixes to misaligned access handling
 - Support for relocatable !MMU kernel builds
 - Support for huge pfnmaps, which should improve TLB utilization
 - Support for runtime constants, which improves the d_hash() performance
 - Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm
 - Various fixes, including:
    - We were missing a secondary mmu notifier call when flushing the tlb which is required for IOMMU
    - Fix ftrace panics by saving the registers as expected by ftrace
    - Fix a couple of stimecmp usage related to cpu hotplug
    - purgatory_start is now aligned as per the STVEC requirements
    - A fix for hugetlb when calculating the size of non-present PTEs
* tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits)
  riscv: Add norvc after .option arch in runtime const
  riscv: Make sure toolchain supports zba before using zba instructions
  riscv/purgatory: 4B align purgatory_start
  riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
  selftests: riscv: fix v_exec_initval_nolibc.c
  riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
  riscv: print hartid on bringup
  riscv: Add norvc after .option arch in runtime const
  riscv: Remove CONFIG_PAGE_OFFSET
  riscv: Support CONFIG_RELOCATABLE on riscv32
  asm-generic: Always define Elf_Rel and Elf_Rela
  riscv: Support CONFIG_RELOCATABLE on NOMMU
  riscv: Allow NOMMU kernels to access all of RAM
  riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
  RISC-V: errata: Use medany for relocatable builds
  dt-bindings: riscv: document vector crypto requirements
  dt-bindings: riscv: add vector sub-extension dependencies
  dt-bindings: riscv: d requires f
  RISC-V: add f & d extension validation checks
  RISC-V: add vector crypto extension validation checks
  ...
2025-04-01Merge tag 'mm-stable-2025-03-30-16-52' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - The series "Enable strict percpu address space checks" from Uros Bizjak uses x86 named address space qualifiers to provide compile-time checking of percpu area accesses. This has caused a small amount of fallout - two or three issues were reported. In all cases the calling code was found to be incorrect. - The series "Some cleanup for memcg" from Chen Ridong implements some relatively monir cleanups for the memcontrol code. - The series "mm: fixes for device-exclusive entries (hmm)" from David Hildenbrand fixes a boatload of issues which David found then using device-exclusive PTE entries when THP is enabled. More work is needed, but this makes thins better - our own HMM selftests now succeed. - The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed remove the z3fold and zbud implementations. They have been deprecated for half a year and nobody has complained. - The series "mm: further simplify VMA merge operation" from Lorenzo Stoakes implements numerous simplifications in this area. No runtime effects are anticipated. - The series "mm/madvise: remove redundant mmap_lock operations from process_madvise()" from SeongJae Park rationalizes the locking in the madvise() implementation. Performance gains of 20-25% were observed in one MADV_DONTNEED microbenchmark. - The series "Tiny cleanup and improvements about SWAP code" from Baoquan He contains a number of touchups to issues which Baoquan noticed when working on the swap code. - The series "mm: kmemleak: Usability improvements" from Catalin Marinas implements a couple of improvements to the kmemleak user-visible output. - The series "mm/damon/paddr: fix large folios access and schemes handling" from Usama Arif provides a couple of fixes for DAMON's handling of large folios. 
- The series "mm/damon/core: fix wrong and/or useless damos_walk() behaviors" from SeongJae Park fixes a few issues with the accuracy of kdamond's walking of DAMON regions. - The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo Stoakes changes the interaction between framebuffer deferred-io and core MM. No functional changes are anticipated - this is preparatory work for the future removal of page structure fields. - The series "mm/damon: add support for hugepage_size DAMOS filter" from Usama Arif adds a DAMOS filter which permits the filtering by huge page sizes. - The series "mm: permit guard regions for file-backed/shmem mappings" from Lorenzo Stoakes extends the guard region feature from its present "anon mappings only" state. The feature now covers shmem and file-backed mappings. - The series "mm: batched unmap lazyfree large folios during reclamation" from Barry Song cleans up and speeds up the unmapping for pte-mapped large folios. - The series "reimplement per-vma lock as a refcount" from Suren Baghdasaryan puts the vm_lock back into the vma. Our reasons for pulling it out were largely bogus and that change made the code more messy. This patchset provides small (0-10%) improvements on one microbenchmark. - The series "Docs/mm/damon: misc DAMOS filters documentation fixes and improves" from SeongJae Park does some maintenance work on the DAMON docs. - The series "hugetlb/CMA improvements for large systems" from Frank van der Linden addresses a pile of issues which have been observed when using CMA on large machines. - The series "mm/damon: introduce DAMOS filter type for unmapped pages" from SeongJae Park enables users of DMAON/DAMOS to filter my the page's mapped/unmapped status. - The series "zsmalloc/zram: there be preemption" from Sergey Senozhatsky teaches zram to run its compression and decompression operations preemptibly. 
- The series "selftests/mm: Some cleanups from trying to run them" from Brendan Jackman fixes a pile of unrelated issues which Brendan encountered while running our selftests. - The series "fs/proc/task_mmu: add guard region bit to pagemap" from Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to determine whether a particular page is a guard page. - The series "mm, swap: remove swap slot cache" from Kairui Song removes the swap slot cache from the allocation path - it simply wasn't being effective. - The series "mm: cleanups for device-exclusive entries (hmm)" from David Hildenbrand implements a number of unrelated cleanups in this code. - The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual implements a number of preparatory cleanups to the GENERIC_PTDUMP Kconfig logic. - The series "mm/damon: auto-tune aggregation interval" from SeongJae Park implements feedback-driven automatic tuning of DAMON's aggregation interval. - The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in powerpc, sparc and x86 lazy MMU implementations. Ryan did this in preparation for implementing lazy mmu mode for arm64 to optimize vmalloc. - The series "mm/page_alloc: Some clarifications for migratetype fallback" from Brendan Jackman reworks some commentary to make the code easier to follow. - The series "page_counter cleanup and size reduction" from Shakeel Butt cleans up the page_counter code and fixes a size increase which we accidentally added late last year. - The series "Add a command line option that enables control of how many threads should be used to allocate huge pages" from Thomas Prescher does that. It allows the careful operator to significantly reduce boot time by tuning the parallelization of huge page initialization. - The series "Fix calculations in trace_balance_dirty_pages() for cgwb" from Tang Yizhou fixes the tracing output from the dirty page balancing code. 
- The series "mm/damon: make allow filters after reject filters useful and intuitive" from SeongJae Park improves the handling of allow and reject filters. Behaviour is made more consistent and the documentation is updated accordingly. - The series "Switch zswap to object read/write APIs" from Yosry Ahmed updates zswap to the new object read/write APIs and thus permits the removal of some legacy code from zpool and zsmalloc. - The series "Some trivial cleanups for shmem" from Baolin Wang does as it claims. - The series "fs/dax: Fix ZONE_DEVICE page reference counts" from Alistair Popple regularizes the weird ZONE_DEVICE page refcount handling in DAX, permitting the removal of a number of special-case checks. - The series "refactor mremap and fix bug" from Lorenzo Stoakes is a preparatory refactoring and cleanup of the mremap() code. - The series "mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in which we determine whether a large folio is known to be mapped exclusively into a single MM. - The series "mm/damon: add sysfs dirs for managing DAMOS filters based on handling layers" from SeongJae Park adds a couple of new sysfs directories to ease the management of DAMON/DAMOS filters. - The series "arch, mm: reduce code duplication in mem_init()" from Mike Rapoport consolidates many per-arch implementations of mem_init() into generic code, where that is practical. - The series "mm/damon/sysfs: commit parameters online via damon_call()" from SeongJae Park continues the cleaning up of sysfs access to DAMON internal data. - The series "mm: page_ext: Introduce new iteration API" from Luiz Capitulino reworks the page_ext initialization to fix a boot-time crash which was observed with an unusual combination of compile and cmdline options. - The series "Buddy allocator like (or non-uniform) folio split" from Zi Yan reworks the code to split a folio into smaller folios. 
The main benefit is lessened memory consumption: fewer post-split folios are generated. - The series "Minimize xa_node allocation during xarry split" from Zi Yan reduces the number of xarray xa_nodes which are generated during an xarray split. - The series "drivers/base/memory: Two cleanups" from Gavin Shan performs some maintenance work on the drivers/base/memory code. - The series "Add tracepoints for lowmem reserves, watermarks and totalreserve_pages" from Martin Liu adds some more tracepoints to the page allocator code. - The series "mm/madvise: cleanup requests validations and classifications" from SeongJae Park cleans up some warts which SeongJae observed during his earlier madvise work. - The series "mm/hwpoison: Fix regressions in memory failure handling" from Shuai Xue addresses two quite serious regressions which Shuai has observed in the memory-failure implementation. - The series "mm: reliable huge page allocator" from Johannes Weiner makes huge page allocations cheaper and more reliable by reducing fragmentation. - The series "Minor memcg cleanups & prep for memdescs" from Matthew Wilcox is preparatory work for the future implementation of memdescs. - The series "track memory used by balloon drivers" from Nico Pache introduces a way to track memory used by our various balloon drivers. - The series "mm/damon: introduce DAMOS filter type for active pages" from Nhat Pham permits users to filter for active/inactive pages, separately for file and anon pages. - The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia separates the proactive reclaim statistics from the direct reclaim statistics. - The series "mm/vmscan: don't try to reclaim hwpoison folio" from Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim code. 
* tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits) mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex() x86/mm: restore early initialization of high_memory for 32-bits mm/vmscan: don't try to reclaim hwpoison folio mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper cgroup: docs: add pswpin and pswpout items in cgroup v2 doc mm: vmscan: split proactive reclaim statistics from direct reclaim statistics selftests/mm: speed up split_huge_page_test selftests/mm: uffd-unit-tests support for hugepages > 2M docs/mm/damon/design: document active DAMOS filter type mm/damon: implement a new DAMOS filter type for active pages fs/dax: don't disassociate zero page entries MM documentation: add "Unaccepted" meminfo entry selftests/mm: add commentary about 9pfs bugs fork: use __vmalloc_node() for stack allocation docs/mm: Physical Memory: Populate the "Zones" section xen: balloon: update the NR_BALLOON_PAGES state hv_balloon: update the NR_BALLOON_PAGES state balloon_compaction: update the NR_BALLOON_PAGES state meminfo: add a per node counter for balloon drivers mm: remove references to folio in __memcg_kmem_uncharge_page() ...
2025-04-01riscv/kexec_file: Handle R_RISCV_64 in purgatory relocatorYao Zi
Commit 58ff537109ac ("riscv: Omit optimized string routines when using KASAN") introduced calls to EXPORT_SYMBOL() in assembly string routines, which result in R_RISCV_64 relocations against the .export_symbol section. As these routines are reused by the RISC-V purgatory and our relocator doesn't recognize these relocations, kexec-file-load fails with dmesg output like [ 11.344251] kexec_image: Unknown rela relocation: 2 [ 11.345972] kexec_image: Error loading purgatory ret=-8 Let's support the R_RISCV_64 relocation to fix kexec on 64-bit RISC-V. The 32-bit variant isn't covered since KEXEC_FILE and KEXEC_PURGATORY aren't available there. Fixes: 58ff537109ac ("riscv: Omit optimized string routines when using KASAN") Signed-off-by: Yao Zi <ziyao@disroot.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250326051445.55131-2-ziyao@disroot.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
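The relocation handling described above can be sketched in plain C. This is a minimal userspace illustration, not the kernel's relocator: `apply_rela()` and its return values are hypothetical, and the only facts taken from outside the commit message are that R_RISCV_64 has type number 2 (matching the "Unknown rela relocation: 2" log line) and that it stores the 64-bit sum of symbol value and addend.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ELF relocation type number for R_RISCV_64 in the RISC-V psABI. */
#define R_RISCV_64 2

/*
 * Hypothetical helper (not a kernel function): apply one RELA entry of
 * the given type at `loc`. `sym_value` is the resolved symbol address (S)
 * and `addend` is the entry's r_addend (A).
 * Returns 0 on success, -1 for a relocation type it does not handle.
 */
int apply_rela(unsigned int type, void *loc, uint64_t sym_value, int64_t addend)
{
	uint64_t val = sym_value + (uint64_t)addend;

	assert(loc != NULL);

	switch (type) {
	case R_RISCV_64:
		/* Store the full 64-bit S + A; memcpy avoids alignment assumptions. */
		memcpy(loc, &val, sizeof(val));
		return 0;
	default:
		/* Unknown type: leave the target untouched and report failure. */
		return -1;
	}
}
```

A relocator that lacks the `R_RISCV_64` case falls through to the default branch, which is the failure mode the dmesg lines show.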
2025-04-01Merge patch series "Add some validation for vector, vector crypto and fp stuff"Alexandre Ghiti
Conor Dooley <conor@kernel.org> says: From: Conor Dooley <conor.dooley@microchip.com> Yo, This series is partly leveraging Clement's work adding a validate callback in the extension detection code so that things like checking for whether a vector crypto extension is usable can be done like: has_extension(<vector crypto>) rather than has_vector() && has_extension(<vector crypto>) which Eric pointed out was a poor design some months ago. The rest of this is adding some requirements to the bindings that prevent combinations of extensions disallowed by the ISA. There's a bunch of over-long lines in here, but I thought that the over-long lines were clearer than breaking them up. Cheers, Conor. * patches from https://lore.kernel.org/r/20250312-abide-pancreas-3576b8c44d2c@spud: dt-bindings: riscv: document vector crypto requirements dt-bindings: riscv: add vector sub-extension dependencies dt-bindings: riscv: d requires f RISC-V: add f & d extension validation checks RISC-V: add vector crypto extension validation checks RISC-V: add vector extension validation checks Link: https://lore.kernel.org/r/20250312-abide-pancreas-3576b8c44d2c@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01riscv: print hartid on bringupYunhui Cui
Firmware randomly releases cores, so CPU numbers don't linearly map to hartids. When the system has an exception, we care more about hartids. Adding "dyndbg="file smpboot.c +p" loglevel=8" to the cmdline can output the hartid. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250303083424.14309-1-cuiyunhui@bytedance.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25Merge tag 'timers-vdso-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull VDSO infrastructure updates from Thomas Gleixner: - Consolidate the VDSO storage The VDSO data storage and data layout has been largely architecture specific for historical reasons. That increases the maintenance effort and causes inconsistencies over and over. There is no real technical reason for architecture specific layouts and implementations. The architecture specific details can easily be integrated into a generic layout, which also reduces the amount of duplicated code for managing the mappings. Convert all architectures over to a unified layout and common mapping infrastructure. This splits the VDSO data layout into subsystem specific blocks, timekeeping, random and architecture parts, which provides a better structure and allows to improve and update the functionalities without conflict and interaction. - Rework the timekeeping data storage The current implementation is designed for exposing system timekeeping accessors, which was good enough at the time when it was designed. PTP and Time Sensitive Networking (TSN) change that as there are requirements to expose independent PTP clocks, which are not related to system timekeeping. Replace the monolithic data storage by a structured layout, which allows to add support for independent PTP clocks on top while reusing both the data structures and the time accessor implementations. 
* tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) sparc/vdso: Always reject undefined references during linking x86/vdso: Always reject undefined references during linking vdso: Rework struct vdso_time_data and introduce struct vdso_clock vdso: Move architecture related data before basetime data powerpc/vdso: Prepare introduction of struct vdso_clock arm64/vdso: Prepare introduction of struct vdso_clock x86/vdso: Prepare introduction of struct vdso_clock time/namespace: Prepare introduction of struct vdso_clock vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct vdso/vsyscall: Prepare introduction of struct vdso_clock vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock vdso/gettimeofday: Prepare introduction of struct vdso_clock vdso/helpers: Prepare introduction of struct vdso_clock vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks vdso: Make vdso_time_data cacheline aligned arm64: Make asm/cache.h compatible with vDSO ...
2025-03-25RISC-V: add f & d extension validation checksConor Dooley
Using Clement's new validation callbacks, support checking that dependencies have been satisfied for the floating point extensions. The check for "d" is, perhaps confusingly, shorter than that for "f", despite "d" depending on "f". This is because the requirement that a hart supporting double precision must also support single precision should be validated by dt-bindings etc., not the kernel, whereas the lack of support for single-precision-only harts is a limitation of the kernel. Tested-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250312-reptile-platinum-62ee0f444a32@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25RISC-V: add vector crypto extension validation checksConor Dooley
Using Clement's new validation callbacks, support checking that dependencies have been satisfied for the vector crypto extensions. Currently riscv_isa_extension_available(<vector crypto>) will return true on systems that support the extensions but where vector itself has been disabled by the kernel. Adding validation callbacks will prevent such a scenario from occurring and make the behaviour of the extension detection functions more consistent with user expectations - it's not expected to have to check for vector AND the specific crypto extension. The Unpriv spec states: | The Zvknhb and Zvbc Vector Crypto Extensions --and accordingly the | composite extensions Zvkn, Zvknc, Zvkng, and Zvksc-- require a Zve64x | base, or application ("V") base Vector Extension. All of the other | Vector Crypto Extensions can be built on any embedded (Zve*) or | application ("V") base Vector Extension. While this could be used as the basis for checking that the correct base is present for individual crypto extensions, that's not really the kernel's job in my opinion and it is sufficient to leave that sort of precision to the dt-bindings. The kernel only needs to make sure that vector, in some form, is available. Link: https://github.com/riscv/riscv-isa-manual/blob/main/src/vector-crypto.adoc#extensions-overview Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20250312-entertain-shaking-b664142c2f99@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
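The effect of a validation callback can be modelled in a few lines of standalone C. Everything here is hypothetical (the `ext_state` struct, the `validate_fn` type, the two-entry extension table); it is a sketch of the idea that a single `has_extension()` query already accounts for vector being disabled, and does not mirror the kernel's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extension IDs; the kernel's real IDs and tables differ. */
enum ext_id { EXT_V, EXT_ZVKNED, EXT_COUNT };

struct ext_state {
	bool reported[EXT_COUNT]; /* what firmware/devicetree reports */
	bool vector_enabled;      /* false if the kernel disabled vector */
};

/* A validate callback returns true when the extension is usable. */
typedef bool (*validate_fn)(const struct ext_state *st);

static bool validate_vector(const struct ext_state *st)
{
	return st->vector_enabled;
}

/* Vector crypto only needs vector, in some form, to be available. */
static bool validate_vector_crypto(const struct ext_state *st)
{
	return st->reported[EXT_V] && validate_vector(st);
}

static const validate_fn validators[EXT_COUNT] = {
	[EXT_V] = validate_vector,
	[EXT_ZVKNED] = validate_vector_crypto,
};

/* One query covers both "reported" and "dependencies satisfied". */
bool has_extension(const struct ext_state *st, enum ext_id id)
{
	assert(st != NULL && id < EXT_COUNT);

	if (!st->reported[id])
		return false;
	return validators[id] == NULL || validators[id](st);
}
```

With this shape a caller no longer writes a separate vector check in front of the crypto check; a reported crypto extension simply reads as unavailable while vector is disabled.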
2025-03-25RISC-V: add vector extension validation checksConor Dooley
Using Clement's new validation callbacks, support checking that dependencies have been satisfied for the vector extensions. From the kernel's perspective, it's not required to differentiate between the conditions for all the various vector subsets - it's the firmware's job to not report impossible combinations. Instead, the kernel only has to check that the correct config options are enabled and to enforce its requirement of the d extension being present for FPU support. Since vector will now be disabled proactively, there's no need to clear the bit in elf_hwcap in riscv_fill_hwcap() any longer. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250312-eclair-affluent-55b098c3602b@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20Merge patch series "riscv: Add runtime constant support"Alexandre Ghiti
Charlie Jenkins <charlie@rivosinc.com> says: Ard brought this to my attention in this patch [1]. I benchmarked this patch on the Nezha D1 (which does not contain Zba or Zbkb so it uses the default algorithm) by navigating through a large directory structure. I created a 1000-deep directory structure and then cd and ls through it. With this patch there was a 0.57% performance improvement. [1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/ * patches from https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-0-745b31a11d65@rivosinc.com: riscv: Add runtime constant support riscv: Move nop definition to insn-def.h Link: https://lore.kernel.org/linux-riscv/20250319-runtime_const_riscv-v10-0-745b31a11d65@rivosinc.com/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20riscv: Add runtime constant supportCharlie Jenkins
Implement the runtime constant infrastructure for riscv. Use this infrastructure to generate constants to be used by the d_hash() function. This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime constant' support") and commit e3c92e81711d ("runtime constants: add x86 architecture support"). [ alex: Remove trailing whitespace ] Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-2-745b31a11d65@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
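As a rough illustration of why d_hash() is a candidate for this infrastructure, the sketch below shows a hash-table lookup whose two operands, a table base and a shift amount, are fixed once at start-up and never change afterwards. The names `hash_init()` and `hash_bucket()` are hypothetical, and the values here live in ordinary variables; the point of a runtime-constant mechanism is to avoid exactly those memory loads by patching the values into the instructions, which this userspace sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for values that are fixed once during boot. */
static unsigned int hash_shift;
static const int *hash_table;

/* Record the table and derive the shift from its size (2^table_bits buckets). */
void hash_init(const int *table, unsigned int table_bits)
{
	/* Keep the shift in 1..31 so the 32-bit shift below is well defined. */
	assert(table != 0 && table_bits >= 1 && table_bits <= 31);

	hash_table = table;
	hash_shift = 32 - table_bits;
}

/* Select a bucket from the top bits of a 32-bit hash. */
const int *hash_bucket(uint32_t hash)
{
	return &hash_table[hash >> hash_shift];
}
```

Every call to `hash_bucket()` reloads `hash_table` and `hash_shift` from memory in this version, which is the per-lookup cost the commit's approach is meant to remove on a hot path.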
2025-03-20riscv: Move nop definition to insn-def.hCharlie Jenkins
We have duplicated the definition of the nop instruction in ftrace.h and in jump_label.c. Move this definition into the generic file insn-def.h so that they can share the definition with each other and with future files. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-1-745b31a11d65@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20Merge patch series "riscv: Unaligned access speed probing fixes and skipping"Alexandre Ghiti
Andrew Jones <ajones@ventanamicro.com> says: The first six patches of this series are fixes and cleanups of the unaligned access speed probing code. The next patch introduces a kernel command line option that allows the probing to be skipped. This command line option is a different approach than Jesse's [1]. [1] takes a cpu-list for a particular speed, supporting heterogeneous platforms. With this approach, the kernel command line should only be used for homogeneous platforms. [1] also only allowed 'fast' and 'slow' to be selected. This parameter also supports 'unsupported', which could be useful for testing code paths gated on that. The final patch adds the documentation. [1] https://lore.kernel.org/linux-riscv/20240805173816.3722002-1-jesse@rivosinc.com/ * patches from https://lore.kernel.org/r/20250304120014.143628-10-ajones@ventanamicro.com: Documentation/kernel-parameters: Add riscv unaligned speed parameters riscv: Add parameter for skipping access speed tests riscv: Fix set up of vector cpu hotplug callback riscv: Fix set up of cpu hotplug callbacks riscv: Change check_unaligned_access_speed_all_cpus to void riscv: Fix check_unaligned_access_all_cpus riscv: Fix riscv_online_cpu_vec riscv: Annotate unaligned access init functions Link: https://lore.kernel.org/r/20250304120014.143628-10-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19riscv: Add parameter for skipping access speed testsAndrew Jones
Allow skipping scalar and vector unaligned access speed tests. This is useful for testing alternative code paths and to skip the tests in environments where they run too slowly. All CPUs must have the same unaligned access speed. The code movement is because we now need the scalar cpu hotplug callback to always run, so we need to bring it and its supporting functions out of CONFIG_RISCV_PROBE_UNALIGNED_ACCESS. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-17-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
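A minimal sketch of how such a parameter value might be interpreted follows. The accepted strings 'slow', 'fast' and 'unsupported' come from the series cover letter; the enum, `parse_speed()` and `should_probe()` are hypothetical names for illustration, and treating an unrecognised value as "still probe" is an assumption of this sketch rather than documented kernel behaviour.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical speed states; SPEED_UNKNOWN means "nothing was selected". */
enum access_speed { SPEED_UNKNOWN, SPEED_SLOW, SPEED_FAST, SPEED_UNSUPPORTED };

/* Map a command-line value to a speed; unrecognised input stays unknown. */
enum access_speed parse_speed(const char *value)
{
	assert(value != NULL);

	if (strcmp(value, "slow") == 0)
		return SPEED_SLOW;
	if (strcmp(value, "fast") == 0)
		return SPEED_FAST;
	if (strcmp(value, "unsupported") == 0)
		return SPEED_UNSUPPORTED;
	return SPEED_UNKNOWN;
}

/* Probing is skipped only when the command line supplied a valid speed. */
int should_probe(enum access_speed from_cmdline)
{
	return from_cmdline == SPEED_UNKNOWN;
}
```

Because one value is applied to the whole system, this approach only makes sense when all CPUs share the same unaligned access speed, as the commit message notes.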
2025-03-19riscv: Fix set up of vector cpu hotplug callbackAndrew Jones
Whether or not we have RISCV_PROBE_VECTOR_UNALIGNED_ACCESS we need to set up a cpu hotplug callback to check if we have vector at all, since, when we don't have vector, we need to set vector_misaligned_access to unsupported rather than leave it the default of unknown. Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-16-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19riscv: Fix set up of cpu hotplug callbacksAndrew Jones
CPU hotplug callbacks should be set up even if we detected that all current cpus emulate misaligned accesses, since we want to ensure our expectation of all cpus emulating is maintained. Fixes: 6e5ce7f2eae3 ("riscv: Decouple emulated unaligned accesses from access speed") Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe") Reviewed-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-15-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19riscv: Change check_unaligned_access_speed_all_cpus to voidAndrew Jones
The return value of check_unaligned_access_speed_all_cpus() is always zero, so make the function void so we don't need to concern ourselves with it. The change also allows us to tidy up check_unaligned_access_all_cpus() a bit. Reviewed-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-14-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19riscv: Fix check_unaligned_access_all_cpusAndrew Jones
check_vector_unaligned_access_emulated_all_cpus(), like its name suggests, will return true when all cpus emulate unaligned vector accesses. If the function returned false it may have been because vector isn't supported at all (!has_vector()) or because at least one cpu doesn't emulate unaligned vector accesses. Since false may be returned for two cases, checking for it isn't sufficient when attempting to determine if we should proceed with the vector speed check. Move the !has_vector() functionality to check_unaligned_access_all_cpus() in order for check_vector_unaligned_access_emulated_all_cpus() to return false for a single case. Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-13-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19riscv: Fix riscv_online_cpu_vecAndrew Jones
We shouldn't probe when we already know vector is unsupported and we should probe when we see we don't yet know whether it's supported. Furthermore, we should ensure we've set the access type to unsupported when we don't have vector at all. Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-12-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
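The three rules in this message can be written as a small decision function. This is a hypothetical model (`online_cpu_vec()`, the enums and the action values are invented for illustration) and it assumes that any already-known state, not just "unsupported", means no further probing is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu vector unaligned access states. */
enum vec_access { VEC_UNKNOWN, VEC_SLOW, VEC_FAST, VEC_UNSUPPORTED };

/* What the hotplug callback should do next. */
enum vec_action { ACTION_NONE, ACTION_PROBE };

enum vec_action online_cpu_vec(enum vec_access *state, bool has_vector)
{
	assert(state != NULL);

	/* No vector at all: record "unsupported" rather than leaving "unknown". */
	if (!has_vector) {
		*state = VEC_UNSUPPORTED;
		return ACTION_NONE;
	}

	/* Already decided (assumed to include unsupported): do not probe again. */
	if (*state != VEC_UNKNOWN)
		return ACTION_NONE;

	/* Vector is present and the speed is still unknown: probe. */
	return ACTION_PROBE;
}
```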
2025-03-19riscv: Annotate unaligned access init functionsAndrew Jones
Several functions used in unaligned access probing are only run at init time. Annotate them appropriately. Fixes: f413aae96cda ("riscv: Set unaligned access speed at compile time") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250304120014.143628-11-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>