path: root/arch/riscv/lib
2025-06-06  Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux  (Linus Torvalds)

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the FWFT SBI extension, which is part of SBI 3.0 and a
   dependency for many new SBI and ISA extensions

 - Support for getrandom() in the VDSO

 - Support for mseal

 - Optimized routines for raid6 syndrome and recovery calculations

 - kexec_file() supports loading Image-formatted kernel binaries

 - Improvements to the instruction patching framework to allow for
   atomic instruction patching, along with rules as to how systems need
   to behave in order to function correctly

 - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
   some SiFive vendor extensions

 - Various fixes and cleanups, including: misaligned access handling,
   perf symbol mangling, module loading, PUD THPs, and improved uaccess
   routines

* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
  riscv: uaccess: Only restore the CSR_STATUS SUM bit
  RISC-V: vDSO: Wire up getrandom() vDSO implementation
  riscv: enable mseal sysmap for RV64
  raid6: Add RISC-V SIMD syndrome and recovery calculations
  riscv: mm: Add support for Svinval extension
  RISC-V: Documentation: Add enough title underlines to CMODX
  riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
  MAINTAINERS: Update Atish's email address
  riscv: uaccess: do not do misaligned accesses in get/put_user()
  riscv: process: use unsigned int instead of unsigned long for put_user()
  riscv: make unsafe user copy routines use existing assembly routines
  riscv: hwprobe: export Zabha extension
  riscv: Make regs_irqs_disabled() more clear
  perf symbols: Ignore mapping symbols on riscv
  RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
  riscv: module: Optimize PLT/GOT entry counting
  riscv: Add support for PUD THP
  riscv: xchg: Prefetch the destination word for sc.w
  riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
  riscv: Add support for Zicbop
  ...
2025-06-05  riscv: make unsafe user copy routines use existing assembly routines  (Alexandre Ghiti)

The current implementation underperforms and, in addition, triggers misaligned access traps on platforms which do not handle misaligned accesses in hardware. Use the existing assembly routines to solve both problems at once.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250602193918.868962-2-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
2025-05-12  crypto: lib/chacha - add array bounds to function prototypes  (Eric Biggers)

Add explicit array bounds to the function prototypes for the parameters that didn't already get handled by the conversion to use chacha_state:

- chacha_block_*(): Change 'u8 *out' or 'u8 *stream' to u8 out[CHACHA_BLOCK_SIZE].
- hchacha_block_*(): Change 'u32 *out' or 'u32 *stream' to u32 out[HCHACHA_OUT_WORDS].
- chacha_init(): Change 'const u32 *key' to 'const u32 key[CHACHA_KEY_WORDS]'. Change 'const u8 *iv' to 'const u8 iv[CHACHA_IV_SIZE]'.

No functional changes. This just makes it clear when fixed-size arrays are expected.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
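For reference, the resulting prototypes look roughly like this; a sketch reconstructed from this commit message and the struct chacha_state introduced in the entry below, not copied from the tree:

    void chacha_block_generic(struct chacha_state *state,
                              u8 out[CHACHA_BLOCK_SIZE], int nrounds);
    void hchacha_block_generic(const struct chacha_state *state,
                               u32 out[HCHACHA_OUT_WORDS], int nrounds);
    void chacha_init(struct chacha_state *state,
                     const u32 key[CHACHA_KEY_WORDS],
                     const u8 iv[CHACHA_IV_SIZE]);

The bounds are not enforced by the C compiler, but they document the expected sizes at each call site and let static analysis catch obvious misuse.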
2025-05-12  crypto: lib/chacha - strongly type the ChaCha state  (Eric Biggers)

The ChaCha state matrix is 16 32-bit words. Currently it is represented in the code as a raw u32 array, or even just a pointer to u32. This weak typing is error-prone. Instead, introduce struct chacha_state:

    struct chacha_state {
        u32 x[16];
    };

Convert all ChaCha and HChaCha functions to use struct chacha_state. No functional changes.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05  crypto: riscv/sha256 - Add simd block function  (Herbert Xu)

Add CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD and a SIMD block function so that the caller can decide whether to use SIMD.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05  crypto: arch/sha256 - Export block functions as GPL only  (Herbert Xu)

Export the block functions as GPL only; there is no reason to let arbitrary modules use these internal functions.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05  Revert "crypto: run initcalls for generic implementations earlier"  (Herbert Xu)

This reverts commit c4741b23059794bd99beef0f700103b0d983b3fd.

Crypto API self-tests no longer run at registration time and now occur either at late_initcall or upon the first use. Therefore the premise of the above commit no longer exists. Revert it and subsequent additions of subsys_initcall and arch_initcall.

Note that lib/crypto calls will stay at subsys_initcall (or rather downgraded from arch_initcall) because they may need to occur before Crypto API registration.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05  crypto: riscv/sha256 - implement library instead of shash  (Eric Biggers)

Instead of providing crypto_shash algorithms for the arch-optimized SHA-256 code, implement the SHA-256 library. This is much simpler, it makes the SHA-256 library functions be arch-optimized, and it fixes the longstanding issue where the arch-optimized SHA-256 was disabled by default. SHA-256 still remains available through crypto_shash, but individual architectures no longer need to handle it.

To match sha256_blocks_arch(), change the type of the nblocks parameter of the assembly function from int to size_t. The assembly function actually already treated it as size_t.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
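The block-function contract the commit describes is, in sketch form (the SHA256_STATE_WORDS spelling is an assumption from memory, not quoted from the patch):

    /* Hash 'nblocks' 64-byte blocks into the 8-word state. */
    void sha256_blocks_arch(u32 state[SHA256_STATE_WORDS],
                            const u8 *data, size_t nblocks);

Using size_t for nblocks matches what the assembly already assumed and avoids silent truncation on 64-bit.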
2025-04-28  crypto: lib/chacha - remove INTERNAL symbol and selection of CRYPTO  (Eric Biggers)

Now that the architecture-optimized ChaCha kconfig symbols are defined regardless of CRYPTO, there is no need for CRYPTO_LIB_CHACHA to select CRYPTO. So, remove that.

This makes the indirection through the CRYPTO_LIB_CHACHA_INTERNAL symbol unnecessary, so get rid of that and just use CRYPTO_LIB_CHACHA directly.

Finally, make the fallback to the generic implementation use a default value instead of a select; this makes it consistent with how the arch-optimized code gets enabled and also with how CRYPTO_LIB_BLAKE2S_GENERIC gets enabled.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-28  crypto: riscv - move library functions to arch/riscv/lib/crypto/  (Eric Biggers)

Continue disentangling the crypto library functions from the generic crypto infrastructure by moving the riscv ChaCha library functions into a new directory arch/riscv/lib/crypto/ that does not depend on CRYPTO. This mirrors the distinction between crypto/ and lib/crypto/.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-04  Merge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux  (Linus Torvalds)

Pull RISC-V updates from Palmer Dabbelt:

 - The sub-architecture selection Kconfig system has been cleaned up,
   the documentation has been improved, and various detections have
   been fixed

 - The vector-related extensions dependencies are now validated when
   parsing from device tree and in the DT bindings

 - Misaligned access probing can be overridden via a kernel
   command-line parameter, along with various fixes to misaligned
   access handling

 - Support for relocatable !MMU kernel builds

 - Support for huge pfnmaps, which should improve TLB utilization

 - Support for runtime constants, which improves the d_hash()
   performance

 - Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm

 - Various fixes, including:
    - We were missing a secondary mmu notifier call when flushing the
      tlb which is required for IOMMU
    - Fix ftrace panics by saving the registers as expected by ftrace
    - Fix a couple of stimecmp usages related to cpu hotplug
    - purgatory_start is now aligned as per the STVEC requirements
    - A fix for hugetlb when calculating the size of non-present PTEs

* tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits)
  riscv: Add norvc after .option arch in runtime const
  riscv: Make sure toolchain supports zba before using zba instructions
  riscv/purgatory: 4B align purgatory_start
  riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
  selftests: riscv: fix v_exec_initval_nolibc.c
  riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
  riscv: print hartid on bringup
  riscv: Add norvc after .option arch in runtime const
  riscv: Remove CONFIG_PAGE_OFFSET
  riscv: Support CONFIG_RELOCATABLE on riscv32
  asm-generic: Always define Elf_Rel and Elf_Rela
  riscv: Support CONFIG_RELOCATABLE on NOMMU
  riscv: Allow NOMMU kernels to access all of RAM
  riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
  RISC-V: errata: Use medany for relocatable builds
  dt-bindings: riscv: document vector crypto requirements
  dt-bindings: riscv: add vector sub-extension dependencies
  dt-bindings: riscv: d requires f
  RISC-V: add f & d extension validation checks
  RISC-V: add vector crypto extension validation checks
  ...
2025-03-18  Merge patch series "RISC-V: clarify what some RISCV_ISA* config options do & redo Zbb toolchain dependency"  (Alexandre Ghiti)

Conor Dooley <conor@kernel.org> says:

Since one depends on the other, albeit trivially, here's a v4 of the Zbb toolchain dep removal alongside the rewording of Kconfig options I'd sent out before the merge window. I think I like this implementation better than v1, but I couldn't think of a good name for a "public" version of __ALTERNATIVE(), so I used it here directly. Unfortunately "ALTERNATIVE_2_CFG" already exists, and I couldn't think of a good way to name an alternative macro that allows for several config options that made the distinction sufficiently clear. Yell if you have better suggestions than I did.

I am a wee bit "worried" that this makes the Kconfig option confusing, as it isn't immediately obvious whether someone is or is not going to get the toolchain-based optimisations.

Cheers, Conor.

* patches from https://lore.kernel.org/r/20241024-aspire-rectify-9982da6943e5@spud:
  RISC-V: separate Zbb optimisations requiring and not requiring toolchain support
  RISC-V: clarify what some RISCV_ISA* config options do

Link: https://lore.kernel.org/r/20241024-aspire-rectify-9982da6943e5@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18  RISC-V: separate Zbb optimisations requiring and not requiring toolchain support  (Conor Dooley)

It seems a bit ridiculous to require toolchain support for BPF to assemble Zbb instructions, so move the dependency on toolchain support for Zbb optimisations out of the Kconfig option and to the callsites.

Zbb support has always depended on alternatives, so while adjusting the config options guarding optimisations, remove any checks for whether or not alternatives are enabled.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241024-chump-freebase-d26b6d81af33@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-10  riscv/crc64: add Zbc optimized CRC64 functions  (Eric Biggers)

Wire up crc64_be_arch() and crc64_nvme_arch() for 64-bit RISC-V using crc-clmul-template.h. This greatly improves the performance of these CRCs on Zbc-capable CPUs in 64-bit kernels.

These optimized CRC64 functions are not yet supported in 32-bit kernels, since crc-clmul-template.h assumes that the CRC fits in an unsigned long. That implementation limitation could be addressed, but it would add a fair bit of complexity, so it has been omitted for now.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10  riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function  (Eric Biggers)

Wire up crc_t10dif_arch() for RISC-V using crc-clmul-template.h. This greatly improves CRC-T10DIF performance on Zbc-capable CPUs.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10  riscv/crc32: reimplement the CRC32 functions using new template  (Eric Biggers)

Delete the previous Zbc optimized CRC32 code, and re-implement it using the new template. The new implementation is more optimized and shares more code among CRC variants.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10  riscv/crc: add "template" for Zbc optimized CRC functions  (Eric Biggers)

Add a "template" crc-clmul-template.h that can generate RISC-V Zbc optimized CRC functions. Each generated CRC function is parameterized by CRC length and bit order, and it accepts a pointer to the constants struct required for the specific CRC polynomial desired. Update gen-crc-consts.py to support generating the needed constants structs. This makes it possible to easily wire up a Zbc optimized implementation of almost any CRC.

The design generally follows what I did for x86, but it is simplified by using RISC-V's scalar carryless multiplication Zbc, which has no equivalent on x86. RISC-V's clmulr instruction is also helpful.

A potential switch to Zvbc (or support for Zvbc alongside Zbc) is left for future work. For long messages Zvbc should be fastest, but it would need to be shown to be worthwhile over just using Zbc, which is significantly more convenient to use, especially in the kernel context.

Compared to the existing Zbc-optimized CRC32 code and the earlier proposed Zbc-optimized CRC-T10DIF code (https://lore.kernel.org/r/20250211071101.181652-1-zhihang.shao.iscas@gmail.com), this submission deduplicates the code among CRC variants and is significantly more optimized. It uses "folding" to take better advantage of instruction-level parallelism (to a more limited extent than x86 for now, but it could be extended to more), it reworks the Barrett reduction to eliminate unnecessary instructions, and it documents all the math used and makes all the constants reproducible.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
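For orientation, the math behind all of these template-generated functions is ordinary polynomial arithmetic over GF(2); the following summary is standard background, not quoted from the patch. A CRC with degree-n generator polynomial G is

    \mathrm{CRC}_G(M) = M(x) \cdot x^{n} \bmod G(x) \quad \text{over } \mathrm{GF}(2)

A carryless multiply is exactly a GF(2) polynomial product, so "folding" multiplies message chunks by precomputed constants of the form x^k mod G(x) to reduce a long message to a short congruent one, and Barrett reduction performs the final mod G using a precomputed floor(x^{2n} / G(x)) in place of a division. This is why a small constants struct per polynomial is all the template needs.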
2025-02-08  lib/crc32: remove "_le" from crc32c base and arch functions  (Eric Biggers)

Following the standardization on crc32c() as the lib entry point for the Castagnoli CRC32 instead of the previous mix of crc32c(), crc32c_le(), and __crc32c_le(), make the same change to the underlying base and arch functions that implement it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250208024911.14936-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08  lib/crc32: don't bother with pure and const function attributes  (Eric Biggers)

Drop the use of __pure and __attribute_const__ from the CRC32 library functions that had them. Both of these are unusual optimizations that don't help properly written code. They seem more likely to cause problems than have any real benefit.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250208024911.14936-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
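For readers unfamiliar with the two attributes, their promises differ slightly; a minimal illustration (the function names are made up):

    /* "const": the result depends only on the argument values, so the
     * compiler may merge or hoist calls. */
    static u32 __attribute_const__ bitrev8(u32 byte);

    /* "pure": no side effects, but the result may also depend on global
     * memory (e.g. a lookup table), a weaker promise than "const". */
    static u32 __pure crc32_table_entry(u32 index);

Dropping them costs little, because CRC calls are rarely repeated in a way these attributes could optimize away, while a mis-applied attribute can make the compiler elide calls it must not.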
2024-12-01  lib/crc32: expose whether the lib is really optimized at runtime  (Eric Biggers)

Make the CRC32 library export a function crc32_optimizations() which returns flags that indicate which CRC32 functions are actually executing optimized code at runtime.

This will be used to determine whether the crc32[c]-$arch shash algorithms should be registered in the crypto API. btrfs could also start using these flags instead of the hack that it currently uses where it parses the crypto_shash_driver_name.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
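A sketch of the intended usage; the flag name here is as I recall it landing upstream, so treat it as an assumption:

    #include <linux/crc32.h>

    static bool crc32c_is_accelerated(void)
    {
        /* true only when optimized code will actually execute */
        return crc32_optimizations() & CRC32C_OPTIMIZATION;
    }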
2024-12-01  lib/crc32: improve support for arch-specific overrides  (Eric Biggers)

Currently the CRC32 library functions are defined as weak symbols, and the arm64 and riscv architectures override them. This method of arch-specific overrides has the limitation that it only works when both the base and arch code are built-in. Also, it causes the arch-specific code to be silently unused if it is accidentally built with lib-y instead of obj-y; unfortunately the RISC-V code does this.

This commit reorganizes the code to have explicit *_arch() functions that are called when they are enabled, similar to how some of the crypto library code works (e.g. chacha_crypt() calls chacha_crypt_arch()).

Make the existing kconfig choice for the CRC32 implementation also control whether the arch-optimized implementation (if one is available) is enabled or not. Make it enabled by default if CRC32 is also enabled.

The result is that arch-optimized CRC32 library functions will be included automatically when appropriate, but it is now possible to disable them. They can also now be built as a loadable module if the CRC32 library functions happen to be used only by loadable modules, in which case the arch and base CRC32 modules will be automatically loaded via direct symbol dependency when appropriate.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
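The override pattern being described reduces to an explicit compile-time dispatch instead of weak-symbol replacement; roughly (a sketch, with the kconfig symbol name assumed):

    static inline u32 crc32_le(u32 crc, const u8 *p, size_t len)
    {
        if (IS_ENABLED(CONFIG_CRC32_ARCH))
            return crc32_le_arch(crc, p, len);
        return crc32_le_base(crc, p, len);
    }

Because crc32_le_arch() is now an ordinary symbol reference, module loading pulls in the arch implementation automatically, which is what enables the loadable-module configuration mentioned above.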
2024-12-01  lib/crc32: drop leading underscores from __crc32c_le_base  (Eric Biggers)

Remove the leading underscores from __crc32c_le_base(). This is in preparation for adding crc32c_le_arch() and eventually renaming __crc32c_le() to crc32c_le().

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-09-19  Merge patch series "riscv: Improve KASAN coverage to fix unit tests"  (Palmer Dabbelt)

Samuel Holland <samuel.holland@sifive.com> says:

This series fixes two areas where uninstrumented assembly routines caused gaps in KASAN coverage on RISC-V, which were caught by KUnit tests. The KASAN KUnit test suite passes after applying this series.

This series fixes the following test failures:

  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1520
  KASAN failure expected in "kasan_int_result = strcmp(ptr, "2")", but none occurred
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1524
  KASAN failure expected in "kasan_int_result = strlen(ptr)", but none occurred
  not ok 60 kasan_strings
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1531
  KASAN failure expected in "set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1533
  KASAN failure expected in "clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1535
  KASAN failure expected in "clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1536
  KASAN failure expected in "__clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1537
  KASAN failure expected in "change_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1543
  KASAN failure expected in "test_and_set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1545
  KASAN failure expected in "test_and_set_bit_lock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1546
  KASAN failure expected in "test_and_clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1548
  KASAN failure expected in "test_and_change_bit(nr, addr)", but none occurred
  not ok 61 kasan_bitops_generic

Samuel Holland (2):
  riscv: Omit optimized string routines when using KASAN
  riscv: Enable bitops instrumentation

 arch/riscv/include/asm/bitops.h | 43 ++++++++++++++++++---------------
 arch/riscv/include/asm/string.h |  2 ++
 arch/riscv/kernel/riscv_ksyms.c |  3 ---
 arch/riscv/lib/Makefile         |  2 ++
 arch/riscv/lib/strcmp.S         |  1 +
 arch/riscv/lib/strlen.S         |  1 +
 arch/riscv/lib/strncmp.S        |  1 +
 arch/riscv/purgatory/Makefile   |  2 ++
 8 files changed, 32 insertions(+), 23 deletions(-)

* b4-shazam-merge:
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN

Link: https://lore.kernel.org/r/20240801033725.28816-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19  riscv: Omit optimized string routines when using KASAN  (Samuel Holland)

The optimized string routines are implemented in assembly, so they are not instrumented for use with KASAN. Fall back to the C version of the routines in order to improve KASAN coverage. This fixes the kasan_strings() unit test.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240801033725.28816-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05  Merge patch series "RISC-V: Parse DT for Zkr to seed KASLR"  (Palmer Dabbelt)

Jesse Taube <jesse@rivosinc.com> says:

Add functions to pi/fdt_early.c to help parse the FDT to check if the isa string has the Zkr extension. Then use the Zkr extension to seed the KASLR base address. The first two patches fix the visibility of symbols.

* b4-shazam-merge:
  RISC-V: Use Zkr to seed KASLR base address
  RISC-V: pi: Add kernel/pi/pi.h
  RISC-V: lib: Add pi aliases for string functions
  RISC-V: pi: Force hidden visibility for all symbol references

Link: https://lore.kernel.org/r/20240709173937.510084-1-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05  RISC-V: lib: Add pi aliases for string functions  (Jesse Taube)

memset, strcmp, and strncmp are all used in the __pi_ section; add SYM_FUNC_ALIAS entries for them. When KASAN is enabled in <asm/string.h>, __pi___memset is also needed.

Suggested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240709173937.510084-3-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10  riscv: Optimize crc32 with Zbc extension  (Xiao Wang)

As suggested by the B-ext spec, the Zbc (carry-less multiplication) instructions can be used to accelerate CRC calculations. Currently, crc32 is the most widely used CRC function inside the kernel, so this patch focuses on the optimization of just the crc32 APIs.

Compared with the current table-lookup based optimization, the Zbc based optimization can also achieve a large stride in the CRC calculation loop; meanwhile, it avoids the memory access latency of the table-lookup based implementation and reduces the memory footprint. If the Zbc feature is not supported in a runtime environment, then the table-lookup based implementation serves as the fallback via the alternative mechanism.

By inspecting the vmlinux built by gcc v12.2.0 with the default optimization level (-O2), we can see the below instruction count change for each 8-byte stride in the CRC32 loop:

  rv64: crc32_be (54->31), crc32_le (54->13), __crc32c_le (54->13)
  rv32: crc32_be (50->32), crc32_le (50->16), __crc32c_le (50->16)

The compile target CPU is little endian, so extra effort is needed for byte swapping in the crc32_be API; thus the instruction count change is not as significant as that in the *_le cases.

This patch is tested on a QEMU VM with the kernel CRC32 selftest for both rv64 and rv32. Running the CRC32 selftest on real hardware (SpacemiT K1) with the Zbc extension shows 65% and 125% performance improvements respectively on crc32_test() and crc32c_test().

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240621054707.1847548-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30  riscv: vector: adjust minimum Vector requirement to ZVE32X  (Andy Chiu)

Make has_vector() check for ZVE32X. Every in-kernel usage of V that requires a more complicated version of V must then call that out explicitly. Also, change riscv_v_first_use_handler(), and the boot code that calls riscv_v_setup_vsize(), to accept ZVE32X.

Most kernel/user interfaces require a minimum of ZVE32X. Thus, programs compiled and run with ZVE32X should be supported by the kernel in most aspects. This includes context switch, signal, ptrace, prctl, and hwprobe. One exception is that ELF_HWCAP returns 'V' only if full V is supported on the platform. This means that a system without full V must not rely on ELF_HWCAP to tell whether it is allowable to execute Vector without first invoking a prctl() check.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Joel Granados <j.granados@samsung.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-7-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22  riscv: uaccess: Relax the threshold for fast path  (Xiao Wang)

The byte copy for the unaligned head covers at most SZREG-1 bytes, so it's better to set the threshold as >= (SZREG-1 + the 8*SZREG word_copy stride), which equals 9*SZREG-1.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313091929.4029960-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22  riscv: uaccess: Allow the last potential unrolled copy  (Xiao Wang)

When the dst buffer pointer points to the last accessible aligned addr, we could still run another iteration of the unrolled copy.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313103334.4036554-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-22  Merge tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux  (Linus Torvalds)

Pull RISC-V updates from Palmer Dabbelt:

 - Support for various vector-accelerated crypto routines

 - Hibernation is now enabled for portable kernel builds

 - mmap_rnd_bits_max is larger on systems with larger VAs

 - Support for fast GUP

 - Support for membarrier-based instruction cache synchronization

 - Support for the Andes hart-level interrupt controller and PMU

 - Some cleanups around unaligned access speed probing and Kconfig
   settings

 - Support for ACPI LPI and CPPC

 - Various cleanups related to barriers

 - A handful of fixes

* tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
  riscv: Fix syscall wrapper for >word-size arguments
  crypto: riscv - add vector crypto accelerated AES-CBC-CTS
  crypto: riscv - parallelize AES-CBC decryption
  riscv: Only flush the mm icache when setting an exec pte
  riscv: Use kcalloc() instead of kzalloc()
  riscv/barrier: Add missing space after ','
  riscv/barrier: Consolidate fence definitions
  riscv/barrier: Define RISCV_FULL_BARRIER
  riscv/barrier: Define __{mb,rmb,wmb}
  RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
  cpufreq: Move CPPC configs to common Kconfig and add RISC-V
  ACPI: RISC-V: Add CPPC driver
  ACPI: Enable ACPI_PROCESSOR for RISC-V
  ACPI: RISC-V: Add LPI driver
  cpuidle: RISC-V: Move few functions to arch/riscv
  riscv: Introduce set_compat_task() in asm/compat.h
  riscv: Introduce is_compat_thread() into compat.h
  riscv: add compile-time test into is_compat_task()
  riscv: Replace direct thread flag check with is_compat_task()
  riscv: Improve arch_get_mmap_end() macro
  ...
2024-03-13  Merge patch series "riscv: Use Kconfig to set unaligned access speed"  (Palmer Dabbelt)

Charlie Jenkins <charlie@rivosinc.com> says:

If the hardware unaligned access speed is known at compile time, it is possible to avoid running the unaligned access speed probe, which speeds up boot.

* b4-shazam-merge:
  riscv: Set unaligned access speed at compile time
  riscv: Decouple emulated unaligned accesses from access speed
  riscv: Only check online cpus for emulated accesses
  riscv: lib: Introduce has_fast_unaligned_access()

Link: https://lore.kernel.org/r/20240308-disable_misaligned_probe_config-v9-0-a388770ba0ce@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-13  riscv: lib: Introduce has_fast_unaligned_access()  (Charlie Jenkins)

Create has_fast_unaligned_access() to avoid needing to explicitly check the fast_misaligned_access_speed_key static key.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240308-disable_misaligned_probe_config-v9-1-a388770ba0ce@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
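A minimal sketch of the helper, assuming the static key named in the commit message:

    static __always_inline bool has_fast_unaligned_access(void)
    {
        return static_branch_likely(&fast_misaligned_access_speed_key);
    }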
2024-02-09  work around gcc bugs with 'asm goto' with outputs  (Linus Torvalds)

We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional").

Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around.

Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case.

It looks like there are at least two separate issues that all hit in this area:

 (a) some versions of gcc don't mark the asm goto as 'volatile' when it
     has outputs:

       https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
       https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420

     which is easy to work around by just adding the 'volatile' by hand.

 (b) Internal compiler errors:

       https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

     which are worked around by adding the extra empty 'asm' as a
     barrier, as in the original workaround.

but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'.

but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue.

Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
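The re-applied workaround, as I recall it landing in include/linux/compiler-gcc.h (shown here as a sketch): the hand-added 'volatile' addresses (a), and the trailing empty asm acts as the barrier for (b):

    #define asm_goto_output(x...) \
        do { asm volatile goto(x); asm (""); } while (0)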
2024-01-21  riscv: remove unneeded #include <asm-generic/export.h>  (Masahiro Yamada)

Commit 62694797f56b ("use linux/export.h rather than asm-generic/export.h") replaced deprecated <asm-generic/export.h> inclusions. Commit c2a658d41924 ("riscv: lib: vectorize copy_to_user/copy_from_user") introduced a new instance of #include <asm-generic/export.h>.

arch/riscv/lib/uaccess_vector.S does not use EXPORT_SYMBOL, hence this include directive is unneeded.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240120213312.3033528-1-masahiroy@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-18  riscv: lib: Check if output in asm goto supported  (Charlie Jenkins)

The output field of an asm goto statement is not supported by all compilers. If it is not supported, fall back to the non-optimized code.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: a04c192eabfb ("riscv: Add checksum library")
Link: https://lore.kernel.org/r/20240118-csum_remove_output_operands_asm_goto-v2-1-5d1b73cf93d4@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17  Merge patch series "riscv: Add fine-tuned checksum functions"  (Palmer Dabbelt)

Charlie Jenkins <charlie@rivosinc.com> says:

Each architecture generally implements fine-tuned checksum functions to leverage the instruction set. This patch adds the main checksum functions that are used in networking. Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete an average of 50.9% faster.

This patch makes heavy use of the Zbb extension using alternatives patching. To test this patch, enable the configs for KUNIT, then CHECKSUM_KUNIT.

I have attempted to make these functions as optimal as possible, but I have not run anything on actual riscv hardware. My performance testing has been limited to inspecting the assembly, running the algorithms on x86 hardware, and running in QEMU.

ip_fast_csum is a relatively small function, so even though it is possible to read 64 bits at a time on compatible hardware, the bottleneck becomes the cleanup and setup code, so loading 32 bits at a time is actually faster.

* b4-shazam-merge:
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold

Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-0-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
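The csum_fold improvement mentioned in the shortlog folds a 32-bit one's-complement sum down to 16 bits with a rotate instead of two add/shift rounds; a sketch from memory:

    static inline __sum16 csum_fold(__wsum csum)
    {
        u32 sum = (__force u32)csum;

        /* ror32(sum, 16) swaps the halves, so the high half of
         * (sum + ror32(sum, 16)) is low + high with the end-around
         * carry already folded in; the identity ~(a + b) == ~a - b
         * then gives the branch-free form below. */
        return (__force __sum16)((~sum - ror32(sum, 16)) >> 16);
    }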
2024-01-17  riscv: Add checksum library  (Charlie Jenkins)

Provide a 32 and 64 bit version of do_csum. When compiled for 32-bit, it will load from the buffer in groups of 32 bits, and when compiled for 64-bit, it will load in groups of 64 bits.

Additionally, provide a riscv optimized implementation of csum_ipv6_magic.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-4-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16  Merge patch series "riscv: support kernel-mode Vector"  (Palmer Dabbelt)

Andy Chiu <andy.chiu@sifive.com> says:

This series provides support for running Vector in kernel mode. Additionally, kernel-mode Vector can be configured to run without turning off preemption on a CONFIG_PREEMPT kernel. Along with the support, we add Vector-optimized copy_{to,from}_user. And we provide a simple threshold to decide when to run the vectorized functions.

We decided to drop vectorized memcpy/memset/memmove for the moment due to the concern of memory side effects in kernel_vector_begin(). The detailed description can be found at v9[0].

This series is composed of 4 parts:
  patch 1-4: adds basic support for kernel-mode Vector
  patch 5: includes vectorized copy_{to,from}_user into the kernel
  patch 6: refactor context switch code in fpu [1]
  patch 7-10: provides some code refactors and support for preemptible kernel-mode Vector

This series can be merged if we feel any part of {1~4, 5, 6, 7~10} is mature enough.

This patch is tested on a QEMU with V and verified that booting and normal userspace operations all work as usual with thresholds set to 0. Also, we test by launching multiple kernel threads which continuously execute and verify Vector operations in the background. The module that tests these operations is expected to be upstreamed later.

* b4-shazam-merge:
  riscv: vector: allow kernel-mode Vector with preemption
  riscv: vector: use kmem_cache to manage vector context
  riscv: vector: use a mask to write vstate_ctrl
  riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}()
  riscv: fpu: drop SR_SD bit checking
  riscv: lib: vectorize copy_to_user/copy_from_user
  riscv: sched: defer restoring Vector context for user
  riscv: Add vector extension XOR implementation
  riscv: vector: make Vector always available for softirq context
  riscv: Add support for kernel mode vector

Link: https://lore.kernel.org/r/20240115055929.4736-1-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16  riscv: lib: vectorize copy_to_user/copy_from_user  (Andy Chiu)

This patch utilizes Vector to perform copy_to_user/copy_from_user. If Vector is available and the size of the copy is large enough for Vector to perform better than scalar, then direct the kernel to do Vector copies for userspace. Though the best programming practice for users is to reduce the copy, this provides a faster variant when copies are inevitable.

The optimal size for using Vector, copy_to_user_thres, is only a heuristic for now. We can add DT parsing if people feel the need to customize it.

The exception fixup code of __asm_vector_usercopy must fall back to the scalar one because accessing user pages might fault, and it must be sleepable. Current kernel-mode Vector does not allow tasks to be preemptible, so we must deactivate Vector and perform a scalar fallback in such cases.

The original implementation of Vector operations comes from https://github.com/sifive/sifive-libc, which we agree to contribute to the Linux kernel.

Co-developed-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Co-developed-by: Nick Knight <nick.knight@sifive.com>
Signed-off-by: Nick Knight <nick.knight@sifive.com>
Suggested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-6-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
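A rough sketch of the size-threshold dispatch this describes; the threshold and entry-point names follow the commit text but are illustrative, not the actual code:

    static inline unsigned long
    copy_from_user_vector(void *dst, const void __user *src, unsigned long n)
    {
        /* use Vector only when the copy is big enough to beat scalar */
        if (has_vector() && n >= copy_to_user_thres)
            return __asm_vector_usercopy(dst, src, n);
        return __asm_copy_from_user(dst, src, n);
    }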
2024-01-16  riscv: Add vector extension XOR implementation  (Greentime Hu)

This patch adds support for vector optimized XOR, and it is tested in qemu.

Co-developed-by: Han-Kuan Chen <hankuan.chen@sifive.com>
Signed-off-by: Han-Kuan Chen <hankuan.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-4-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09  use linux/export.h rather than asm-generic/export.h  (Al Viro)

asm-generic/export.h is a wrapper for linux/export.h, with an explicit request to use linux/export.h directly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20231214191922.GQ1674809@ZenIV
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06  riscv: Use SYM_*() assembly macros instead of deprecated ones  (Clément Léger)

ENTRY()/END()/WEAK() macros are deprecated and we should make use of the new SYM_*() macros [1] for better annotation of symbols. Replace the deprecated ones with the new ones and fix wrong usage of END()/ENDPROC() to correctly describe the symbols.

[1] https://docs.kernel.org/core-api/asm-annotations.html

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-3-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06  riscv: use ".L" local labels in assembly when applicable  (Clément Léger)

For the sake of coherency, use local labels in assembly when applicable. This also avoids kprobes being confused when applying a kprobe, since the size of a function is computed by checking where the next visible symbol is located. This might end up computing some function sizes to be way shorter than expected and thus failing to apply kprobes to the specified offset.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-2-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05  RISC-V: capitalise CMO op macros  (Conor Dooley)

The CMO op macros initially used lower case, as the original iteration of the ALT_CMO_OP alternative stringified the first parameter to finalise the assembly for the standard variant. As a knock-on, the T-Head versions of these CMOs had to use mixed case defines. Commit dd23e9535889 ("RISC-V: replace cbom instructions with an insn-def") removed the asm construction with stringify, replacing it with an insn-def macro, rendering the lower case surplus to requirements.

As far as I can tell from a brief check, CBO_zero does not see similar use and didn't require the mixed case define in the first place. Replace the lower case characters now for consistency with other insn-def macros in the standard and T-Head forms, and adjust the callsites.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230915-aloe-dollar-994937477776@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-16  riscv: uaccess: Return the number of bytes effectively not copied  (Alexandre Ghiti)

It was reported that the riscv kernel hangs while executing the test in [1]. Indeed, the test hangs when trying to write a buffer to a file. The problem is that the riscv implementation of raw_copy_from_user() does not return the correct number of bytes not written when an exception happens and is fixed up; instead it always returns the initial size to copy, even if some bytes were actually copied.

generic_perform_write() pre-faults the user pages and bails out if nothing can be written; otherwise it will access the userspace buffer. Here the riscv implementation keeps returning that it was not able to copy any byte, though the pre-faulting indicates otherwise. So generic_perform_write() keeps retrying to access the user memory and ends up in an infinite loop.

Note that before the commit mentioned in [1] that introduced this regression, it worked because generic_perform_write() would bail out if only one byte could not be written.

So fix this by returning the number of bytes effectively not written in __asm_copy_[to|from]_user() and __clear_user(), as is expected.

Link: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/ [1]
Fixes: ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code")
Reported-by: Bo YU <tsu.yubo@gmail.com>
Closes: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/#t
Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Closes: https://lore.kernel.org/linux-riscv/ZNOnCakhwIeue3yr@aurel32.net/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Link: https://lore.kernel.org/r/20230811150604.1621784-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
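The contract being fixed, in caller terms (a sketch): raw_copy_from_user() must return the number of bytes not copied, so callers can make partial progress:

    size_t left = raw_copy_from_user(dst, src, n);  /* 0 on a full copy */
    size_t copied = n - left;  /* > 0 whenever pre-faulting succeeded */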
2023-04-26  riscv: Allow to downgrade paging mode from the command line  (Alexandre Ghiti)

Add 2 early command line parameters that allow downgrading the satp mode (using the same naming as x86):

- "no5lvl": use a 4-level page table (down from sv57 to sv48)
- "no4lvl": use a 3-level page table (down from sv57/sv48 to sv39)

Note that going through the device tree to get the kernel command line works with ACPI too, since the efi stub creates a device tree anyway with the command line.

In KASAN kernels, we can't use the libfdt that early in the boot process since we are not ready to execute instrumented functions. So instead of using the "generic" libfdt, we compile our own versions of those functions that are not instrumented and that are prefixed so that they do not conflict with the generic ones. We also need the non-instrumented versions of the string functions and the prefixed versions of memcpy/memmove.

This is largely inspired by commit aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR") from which I removed compilation flags that were not relevant to RISC-V at the moment (LTO, SCS). Also note that we have to link with -z norelro to avoid ld.lld throwing a warning with the new .got sections, like in commit 311bea3cb9ee ("arm64: link with -z norelro for LLD or aarch64-elf").

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230424092313.178699-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-14  RISC-V: Use Zicboz in clear_page when available  (Andrew Jones)

Using memset() to zero a 4K page takes 563 total instructions, where 20 are branches. clear_page(), with Zicboz and a 64 byte block size, takes 169 total instructions, where 4 are branches and 33 are nops.

Even though the block size is a variable, thanks to alternatives, we can still implement a Duff device without having to do any preliminary calculations. This is achieved by using the alternatives' cpufeature value (the upper 16 bits of patch_id). The value used is the maximum zicboz block size order accepted at the patch site. This enables us to stop patching / unrolling when 4K bytes have been zeroed (we would loop and continue after 4K if the page size were larger).

For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to only loop a few times and larger block sizes to not loop at all. Since cbo.zero doesn't take an offset, we also need an 'add' after each instruction, making the loop body 112 to 160 bytes. Hopefully this is small enough to not cause icache misses.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230224162631.405473-7-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
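An illustrative C analogue of the patched assembly; the real code is an alternatives-patched Duff device, and cbo_zero() is a hypothetical helper standing in for the cbo.zero instruction plus the following 'add':

    static void clear_page_zicboz(void *page, size_t blksz)
    {
        void *end = page + 4096;  /* one 4K page */

        do {
            /* 16x unroll: 64-byte blocks clear 1K per pass and loop
             * four times; a 256-byte block finishes in one pass. */
            for (int i = 0; i < 16; i++, page += blksz)
                cbo_zero(page);
        } while (page < end);
    }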
2023-03-14  riscv: lib: Include hwcap.h directly  (Andrew Jones)

When using alternatives for cpufeatures we should include hwcap.h directly, rather than through errata_list.h. Opportunistically drop an unused include too.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230224154601.88163-6-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-28  riscv, lib: Fix Zbb strncmp  (Björn Töpel)

The Zbb optimized strncmp has two parts: a fast path that does XLEN/8 bytes per iteration, and a slow path that does one byte per iteration. The idea is to compare aligned XLEN-sized chunks for most of the string, and do the remainder tail in the slow path.

The Zbb strncmp has two issues in the fast path:

Incorrect remainder handling (wrong compare): Assume that the string length is 9. On 64b systems, the fast path should do one iteration, and one iteration in the slow path. Instead, both were done in the fast path, which led to incorrect results. An example:

  strncmp("/dev/vda", "/dev/", 5);

Correct by changing "bgt" to "bge".

Missing NULL checks in the second string: This could lead to incorrect results for:

  strncmp("/dev/vda", "/dev/vda\0", 8);

Correct by adding an additional check.

Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230228184211.1585641-1-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>