summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2025-03-14scanf: implicate test line in failure messagesTamir Duberstein
This improves the failure output by pointing to the failing line at the top level of the test. Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-1-b98820fa39ff@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-13printf: implicate test line in failure messagesTamir Duberstein
This improves the failure output by pointing to the failing line at the top level of the test, e.g.: # test_number: EXPECTATION FAILED at lib/printf_kunit.c:103 lib/printf_kunit.c:167: vsnprintf(buf, 256, "%#-12x", ...) wrote '0x1234abcd ', expected '0x1234abce ' # test_number: EXPECTATION FAILED at lib/printf_kunit.c:142 lib/printf_kunit.c:167: kvasprintf(..., "%#-12x", ...) returned '0x1234abcd ', expected '0x1234abce ' Signed-off-by: Tamir Duberstein <tamird@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-3-4d85c361c241@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-13printf: break kunit into test casesTamir Duberstein
Move all tests into `printf_test_cases`. This gives us nicer output in the event of a failure. Combine `plain_format` and `plain_hash` into `hash_pointer` since they're testing the same scenario. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-2-4d85c361c241@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-13printf: convert self-test to KUnitTamir Duberstein
Convert the printf() self-test to a KUnit test. In the interest of keeping the patch reasonably-sized this doesn't refactor the tests into proper parameterized tests - it's all one big test case. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-1-4d85c361c241@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-12kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()Kees Cook
It does seem that using "volatile" isn't going to be sane compared to using OPTIMIZER_HIDE_VAR() going forward. Some strange interactions[1] with the sanitizers have been observed in the self-test code, so replace the logic. Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1] Link: https://lore.kernel.org/r/20250312000439.work.112-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-12kunit/fortify: Expand testing of __compiletime_strlen()Kees Cook
It seems that Clang thinks __builtin_constant_p() of undefined variables should return true[1]. This is being fixed separately[2], but in the meantime, expand the fortify tests to help track this kind of thing down faster in the future. Link: https://github.com/ClangBuiltLinux/linux/issues/2073 [1] Link: https://github.com/llvm/llvm-project/pull/130713 [2] Link: https://lore.kernel.org/r/20250312000349.work.786-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-05kunit/stackinit: Use fill byte different from Clang i386 patternKees Cook
The byte initialization values used with -ftrivial-auto-var-init=pattern (CONFIG_INIT_STACK_ALL_PATTERN=y) depends on the compiler, architecture, and byte position relative to struct member types. On i386 with Clang, this includes the 0xFF value, which means it looks like nothing changes between the leaf byte filling pass and the expected "stack wiping" pass of the stackinit test. Use the byte fill value of 0x99 instead, fixing the test for i386 Clang builds. Reported-by: ernsteiswuerfel Closes: https://github.com/ClangBuiltLinux/linux/issues/2071 Fixes: 8c30d32b1a32 ("lib/test_stackinit: Handle Clang auto-initialization pattern") Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250304225606.work.030-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-04kunit/overflow: Fix DEFINE_FLEX tests for counted_byKees Cook
Unfortunately, __builtin_dynamic_object_size() does not take into account flexible array sizes, even when they are sized by __counted_by. As a result, the size tests for the flexible arrays need to be separated to get an accurate check of the compiler's behavior. While at it, fully test sizeof, __struct_size (bdos(..., 0)), and __member_size (bdos(..., 1)). I still think this is a compiler design issue, but there's not much to be done about it currently beyond adjusting these tests. GCC and Clang agree on this behavior at least. :) Reported-by: "Thomas Weißschuh" <linux@weissschuh.net> Closes: https://lore.kernel.org/lkml/e1a1531d-6968-4ae8-a3b5-5ea0547ec4b3@t-8ch.de/ Fixes: 9dd5134c6158 ("kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()") Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-12lib/prime_numbers: convert self-test to KUnitTamir Duberstein
Extract a private header and convert the prime_numbers self-test to a KUnit test. I considered parameterizing the test using `KUNIT_CASE_PARAM` but didn't see how it was possible since the test logic is entangled with the test parameter generation logic. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250208-prime_numbers-kunit-convert-v5-2-b0cb82ae7c7d@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-12lib/math: Add Kunit test suite for gcd()Yu-Chun Lin
Add a KUnit test suite for the gcd() function. This test suite verifies the correctness of gcd() across various scenarios, including edge cases. Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com> Link: https://lore.kernel.org/r/20250203075400.3431330-1-eleanor15x@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10lib/tests/kfifo_kunit.c: add tests for the kfifo structureDiego Vieira
Add KUnit tests for the kfifo data structure. They test the vast majority of macros defined in the kfifo header (include/linux/kfifo.h). These are inspired by the existing tests for the doubly linked list in lib/tests/list-test.c (previously at lib/list-test.c) [1]. Note that this patch depends on the patch that moves the KUnit tests on lib/ into lib/tests/ [2]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/list-test.c?h=v6.11-rc6 [2] https://lore.kernel.org/all/20240720181025.work.002-kees@kernel.org/ Signed-off-by: Diego Vieira <diego.daniel.professional@gmail.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Link: https://lore.kernel.org/r/20241202075545.3648096-5-davidgow@google.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10lib: Move KUnit tests into tests/ subdirectoryKees Cook
Following from the recent KUnit file naming discussion[1], move all KUnit tests in lib/ into lib/tests/. Link: https://lore.kernel.org/lkml/20240720165441.it.320-kees@kernel.org/ [1] Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: David Gow <davidgow@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Rae Moar <rmoar@google.com> Link: https://lore.kernel.org/r/20241202075545.3648096-4-davidgow@google.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10lib/math: Add int_log test suiteBruno Sobreira França
This commit introduces KUnit tests for the intlog2 and intlog10 functions, which compute logarithms in base 2 and base 10, respectively. The tests cover a range of inputs to ensure the correctness of these functions across common and edge cases. Signed-off-by: Bruno Sobreira França <brunofrancadevsec@gmail.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Link: https://lore.kernel.org/r/20241202075545.3648096-3-davidgow@google.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10lib: math: Move KUnit tests into tests/ subdirLuis Felipe Hernandez
This patch is a follow-up task from a discussion stemming from point 3 in a recent patch introducing the int_pow kunit test [1] and documentation regarding kunit test style and nomenclature [2]. Colocate all kunit test suites in lib/math/tests/ and follow recommended naming convention for files <suite>_kunit.c and kconfig entries CONFIG_<name>_KUNIT_TEST. Link: https://lore.kernel.org/all/CABVgOS=-vh5TqHFCq_jo=ffq8v_nGgr6JsPnOZag3e6+19ysxQ@mail.gmail.com/ [1] Link: https://docs.kernel.org/dev-tools/kunit/style.html [2] Signed-off-by: Luis Felipe Hernandez <luis.hernandez093@gmail.com> Acked-by: Nicolas Pitre <npitre@baylibre.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Link: https://lore.kernel.org/r/20241202075545.3648096-2-davidgow@google.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-08Merge tag 'hardening-v6.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: "Address a KUnit stack initialization regression that got tickled on m68k, and solve a Clang(v14 and earlier) bug found by 0day: - Fix stackinit KUnit regression on m68k - Use ARRAY_SIZE() for memtostr*()/strtomem*()" * tag 'hardening-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: string.h: Use ARRAY_SIZE() for memtostr*()/strtomem*() compiler.h: Introduce __must_be_byte_array() compiler.h: Move C string helpers into C-only kernel section stackinit: Fix comment for test_small_end stackinit: Keep selftest union size small on m68k
2025-02-06stackinit: Fix comment for test_small_endGeert Uytterhoeven
In union test_small_end, the small members are three and four. Fixes: e71a29db79da1946 ("stackinit: Add union initialization to selftests") Closes: https://lore.kernel.org/CAMuHMdWvcKOc6v5o3-9-SqP_4oh5-GZQjZZb=-krhY=mVRED_Q@mail.gmail.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/3f8faa2d7d0d6b36571093ab0fb1fd5157abd7bb.1738593178.git.geert+renesas@glider.be Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-06stackinit: Keep selftest union size small on m68kKees Cook
The stack frame on m68k is very sensitive to the size of what needs to be stored. Like done for long string testing, reduce the size of the large trailing struct in the union initialization testing. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/all/CAMuHMdXW8VbtOAixO7w+aDOG70aZtZ50j1Ybcr8B3eYnRUcrcA@mail.gmail.com Fixes: e71a29db79da ("stackinit: Add union initialization to selftests") Link: https://lore.kernel.org/r/20250204174509.work.711-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
2025-02-01Merge tag 'mm-hotfixes-stable-2025-02-01-03-56' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "21 hotfixes. 8 are cc:stable and the remainder address post-6.13 issues. 13 are for MM and 8 are for non-MM. All are singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2025-02-01-03-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits) MAINTAINERS: include linux-mm for xarray maintenance revert "xarray: port tests to kunit" MAINTAINERS: add lib/test_xarray.c mailmap, MAINTAINERS, docs: update Carlos's email address mm/hugetlb: fix hugepage allocation for interleaved memory nodes mm: gup: fix infinite loop within __get_longterm_locked mm, swap: fix reclaim offset calculation error during allocation .mailmap: update email address for Christopher Obbard kfence: skip __GFP_THISNODE allocations on NUMA systems nilfs2: fix possible int overflows in nilfs_fiemap() mm: compaction: use the proper flag to determine watermarks kernel: be more careful about dup_mmap() failures and uprobe registering mm/fake-numa: handle cases with no SRAT info mm: kmemleak: fix upper boundary check for physical address objects mailmap: add an entry for Hamza Mahfooz MAINTAINERS: mailmap: update Yosry Ahmed's email address scripts/gdb: fix aarch64 userspace detection in get_current_task mm/vmscan: accumulate nr_demoted for accurate demotion statistics ocfs2: fix incorrect CPU endianness conversion causing mount failure mm/zsmalloc: add __maybe_unused attribute for is_first_zpdesc() ...
2025-02-01revert "xarray: port tests to kunit"Andrew Morton
Revert c7bb5cf9fc4e ("xarray: port tests to kunit"). It broke the build when compiing the xarray userspace test harness code. Reported-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Closes: https://lkml.kernel.org/r/07cf896e-adf8-414f-a629-a808fc26014a@oracle.com Cc: David Gow <davidgow@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Tamir Duberstein <tamird@gmail.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-31Merge tag 'hardening-v6.14-rc1-fix1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: "This is a fix for the soon to be released GCC 15 which has regressed its initialization of unions when performing explicit initialization (i.e. a general problem, not specifically a hardening problem; we're just carrying the fix). Details in the final patch, Acked by Masahiro, with updated selftests to validate the fix" * tag 'hardening-v6.14-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kbuild: Use -fzero-init-padding-bits=all stackinit: Add union initialization to selftests stackinit: Add old-style zero-init syntax to struct tests
2025-01-30stackinit: Add union initialization to selftestsKees Cook
The stack initialization selftests were checking scalars, strings, and structs, but not unions. Add union tests (which are mostly identical setup to structs). This catches the recent union initialization behavioral changes seen in GCC 15. Before GCC 15, this new test passes: ok 18 test_small_start_old_zero With GCC 15, it fails: not ok 18 test_small_start_old_zero Specifically, a union with a larger member where a smaller member is initialized with the older "= { 0 }" syntax: union test_small_start { char one:1; char two; short three; unsigned long four; struct big_struct { unsigned long array[8]; } big; }; This is a regression in compiler behavior that Linux has depended on. GCC does not seem likely to fix it, instead suggesting that affected projects start using -fzero-init-padding-bits=unions: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118403 Link: https://lore.kernel.org/r/20250127191031.245214-2-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-30stackinit: Add old-style zero-init syntax to struct testsKees Cook
The deprecated way to do a full zero init of a structure is with "= { 0 }", but we weren't testing this style. Add it. Link: https://lore.kernel.org/r/20250127191031.245214-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-29Merge tag 'crc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC cleanups from Eric Biggers: "Simplify the kconfig options for controlling which CRC implementations are built into the kernel, as was requested by Linus. This means making the option to disable the arch code visible only when CONFIG_EXPERT=y, and standardizing on a single generic implementation of CRC32" * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crc32: remove other generic implementations lib/crc: simplify the kconfig options for CRC implementations
2025-01-29Merge tag 'constfy-sysctl-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl table constification from Joel Granados: "All ctl_table declared outside of functions and that remain unmodified after initialization are const qualified. This prevents unintended modifications to proc_handler function pointers by placing them in the .rodata section. This is a continuation of the tree-wide effort started a few releases ago with the constification of the ctl_table struct arguments in the sysctl API done in 78eb4ea25cd5 ("sysctl: treewide: constify the ctl_table argument of proc_handlers")" * tag 'constfy-sysctl-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: treewide: const qualify ctl_tables where applicable
2025-01-29lib/crc32: remove other generic implementationsEric Biggers
Now that we've standardized on the byte-by-byte implementation of CRC32 as the only generic implementation (see previous commit for the rationale), remove the code for the other implementations. Tested with crc_kunit. Link: https://lore.kernel.org/r/20250123212904.118683-3-ebiggers@kernel.org Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-01-29lib/crc: simplify the kconfig options for CRC implementationsEric Biggers
Make the following simplifications to the kconfig options for choosing CRC implementations for CRC32 and CRC_T10DIF: 1. Make the option to disable the arch-optimized code be visible only when CONFIG_EXPERT=y. 2. Make a single option control the inclusion of the arch-optimized code for all enabled CRC variants. 3. Make CRC32_SARWATE (a.k.a. slice-by-1 or byte-by-byte) be the only generic CRC32 implementation. The result is there is now just one option, CRC_OPTIMIZATIONS, which is default y and can be disabled only when CONFIG_EXPERT=y. Rationale: 1. Enabling the arch-optimized code is nearly always the right choice. However, people trying to build the tiniest kernel possible would find some use in disabling it. Anything we add to CRC32 is de facto unconditional, given that CRC32 gets selected by something in nearly all kernels. And unfortunately enabling the arch CRC code does not eliminate the need to build the generic CRC code into the kernel too, due to CPU feature dependencies. The size of the arch CRC code will also increase slightly over time as more CRC variants get added and more implementations targeting different instruction set extensions get added. Thus, it seems worthwhile to still provide an option to disable it, but it should be considered an expert-level tweak. 2. Considering the use case described in (1), there doesn't seem to be sufficient value in making the arch-optimized CRC code be independently configurable for different CRC variants. Note also that multiple variants were already grouped together, e.g. CONFIG_CRC32 actually enables three different variants of CRC32. 3. The bit-by-bit implementation is uselessly slow, whereas slice-by-n for n=4 and n=8 use tables that are inconveniently large: 4096 bytes and 8192 bytes respectively, compared to 1024 bytes for n=1. Higher n gives higher instruction-level parallelism, so higher n easily wins on traditional microbenchmarks on most CPUs. However, the larger tables, which are accessed randomly, can be harmful in real-world situations where the dcache may be cold or useful data may need be evicted from the dcache. Meanwhile, today most architectures have much faster CRC32 implementations using dedicated CRC32 instructions or carryless multiplication instructions anyway, which make the generic code obsolete in most cases especially on long messages. Another reason for going with n=1 is that this is already what is used by all the other CRC variants in the kernel. CRC32 was unique in having support for larger tables. But as per the above this can be considered an outdated optimization. The standardization on slice-by-1 a.k.a. CRC32_SARWATE makes much of the code in lib/crc32.c unused. A later patch will clean that up. Link: https://lore.kernel.org/r/20250123212904.118683-2-ebiggers@kernel.org Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-01-28Merge tag 'driver-core-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the big set of driver core and debugfs updates for 6.14-rc1. Included in here is a bunch of driver core, PCI, OF, and platform rust bindings (all acked by the different subsystem maintainers), hence the merge conflict with the rust tree, and some driver core api updates to mark things as const, which will also require some fixups due to new stuff coming in through other trees in this merge window. There are also a bunch of debugfs updates from Al, and there is at least one user that does have a regression with these, but Al is working on tracking down the fix for it. In my use (and everyone else's linux-next use), it does not seem like a big issue at the moment. Here's a short list of the things in here: - driver core rust bindings for PCI, platform, OF, and some i/o functions. We are almost at the "write a real driver in rust" stage now, depending on what you want to do. - misc device rust bindings and a sample driver to show how to use them - debugfs cleanups in the fs as well as the users of the fs api for places where drivers got it wrong or were unnecessarily doing things in complex ways. - driver core const work, making more of the api take const * for different parameters to make the rust bindings easier overall. - other small fixes and updates All of these have been in linux-next with all of the aforementioned merge conflicts, and the one debugfs issue, which looks to be resolved "soon"" * tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) rust: device: Use as_char_ptr() to avoid explicit cast rust: device: Replace CString with CStr in property_present() devcoredump: Constify 'struct bin_attribute' devcoredump: Define 'struct bin_attribute' through macro rust: device: Add property_present() saner replacement for debugfs_rename() orangefs-debugfs: don't mess with ->d_name octeontx2: don't mess with ->d_parent or ->d_parent->d_name arm_scmi: don't mess with ->d_parent->d_name slub: don't mess with ->d_name sof-client-ipc-flood-test: don't mess with ->d_name qat: don't mess with ->d_name xhci: don't mess with ->d_iname mtu3: don't mess wiht ->d_iname greybus/camera - stop messing with ->d_iname mediatek: stop messing with ->d_iname netdevsim: don't embed file_operations into your structs b43legacy: make use of debugfs_get_aux() b43: stop embedding struct file_operations into their objects carl9170: stop embedding file_operations into their objects ...
2025-01-28treewide: const qualify ctl_tables where applicableJoel Granados
Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/inifiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function. Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers. Created this by running an spatch followed by a sed command: Spatch: virtual patch @ depends on !(file in "net") disable optional_qualifier @ identifier table_name != { watchdog_hardlockup_sysctl, iwcm_ctl_table, ucma_ctl_table, memory_allocation_profiling_sysctls, loadpin_sysctl_table }; @@ + const struct ctl_table table_name [] = { ... }; sed: sed --in-place \ -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \ kernel/utsname_sysctl.c Reviewed-by: Song Liu <song@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/ Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-01-27Merge tag 'char-misc-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull Char/Misc/IIO driver updates from Greg KH: "Here is the "big" set of char/misc/iio and other smaller driver subsystem updates for 6.14-rc1. Loads of different things in here this development cycle, highlights are: - ntsync "driver" to handle Windows locking types enabling Wine to work much better on many workloads (i.e. games). The driver framework was in 6.13, but now it's enabled and fully working properly. Should make many SteamOS users happy. Even comes with tests! - Large IIO driver updates and bugfixes - FPGA driver updates - Coresight driver updates - MHI driver updates - PPS driver updatesa - const bin_attribute reworking for many drivers - binder driver updates - smaller driver updates and fixes All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (311 commits) ntsync: Fix reference leaks in the remaining create ioctls. spmi: hisi-spmi-controller: Drop duplicated OF node assignment in spmi_controller_probe() spmi: Set fwnode for spmi devices ntsync: fix a file reference leak in drivers/misc/ntsync.c scripts/tags.sh: Don't tag usages of DECLARE_BITMAP dt-bindings: interconnect: qcom,msm8998-bwmon: Add SM8750 CPU BWMONs dt-bindings: interconnect: OSM L3: Document sm8650 OSM L3 compatible dt-bindings: interconnect: qcom-bwmon: Document QCS615 bwmon compatibles interconnect: sm8750: Add missing const to static qcom_icc_desc memstick: core: fix kernel-doc notation intel_th: core: fix kernel-doc warnings binder: log transaction code on failure iio: dac: ad3552r-hs: clear reset status flag iio: dac: ad3552r-common: fix ad3541/2r ranges iio: chemical: bme680: Fix uninitialized variable in __bme680_read_raw() misc: fastrpc: Fix copy buffer page size misc: fastrpc: Fix registered buffer page address misc: fastrpc: Deregister device nodes properly in error scenarios nvmem: core: improve range check for nvmem_cell_write() nvmem: qcom-spmi-sdam: Set size in struct nvmem_config ...
2025-01-26Merge tag 'mm-stable-2025-01-26-14-59' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "The various patchsets are summarized below. Plus of course many indivudual patches which are described in their changelogs. - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the page allocator so we end up with the ability to allocate and free zero-refcount pages. So that callers (ie, slab) can avoid a refcount inc & dec - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use large folios other than PMD-sized ones - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and fixes for this small built-in kernel selftest - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of the mapletree code - "mm: fix format issues and param types" from Keren Sun implements a few minor code cleanups - "simplify split calculation" from Wei Yang provides a few fixes and a test for the mapletree code - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes continues the work of moving vma-related code into the (relatively) new mm/vma.c - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David Hildenbrand cleans up and rationalizes handling of gfp flags in the page allocator - "readahead: Reintroduce fix for improper RA window sizing" from Jan Kara is a second attempt at fixing a readahead window sizing issue. It should reduce the amount of unnecessary reading - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng addresses an issue where "huge" amounts of pte pagetables are accumulated: https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/ Qi's series addresses this windup by synchronously freeing PTE memory within the context of madvise(MADV_DONTNEED) - "selftest/mm: Remove warnings found by adding compiler flags" from Muhammad Usama Anjum fixes some build warnings in the selftests code when optional compiler warnings are enabled - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David Hildenbrand tightens the allocator's observance of __GFP_HARDWALL - "pkeys kselftests improvements" from Kevin Brodsky implements various fixes and cleanups in the MM selftests code, mainly pertaining to the pkeys tests - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to estimate application working set size - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn provides some cleanups to memcg's hugetlb charging logic - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based kernel build was demonstrated - "zram: split page type read/write handling" from Sergey Senozhatsky has several fixes and cleaups for zram in the area of zram_write_page(). A watchdog softlockup warning was eliminated - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky cleans up the pagetable destructor implementations. A rare use-after-free race is fixed - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes simplifies and cleans up the debugging code in the VMA merging logic - "Account page tables at all levels" from Kevin Brodsky cleans up and regularizes the pagetable ctor/dtor handling. This results in improvements in accounting accuracy - "mm/damon: replace most damon_callback usages in sysfs with new core functions" from SeongJae Park cleans up and generalizes DAMON's sysfs file interface logic - "mm/damon: enable page level properties based monitoring" from SeongJae Park increases the amount of information which is presented in response to DAMOS actions - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs is completed - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter Xu cleans up and generalizes the hugetlb reservation accounting - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino removes a never-used feature of the alloc_pages_bulk() interface - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park extends DAMOS filters to support not only exclusion (rejecting), but also inclusion (allowing) behavior - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi introduces a new memory descriptor for zswap.zpool that currently overlaps with struct page for now. This is part of the effort to reduce the size of struct page and to enable dynamic allocation of memory descriptors - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and simplifies the swap allocator locking. A speedup of 400% was demonstrated for one workload. As was a 35% reduction for kernel build time with swap-on-zram - "mm: update mips to use do_mmap(), make mmap_region() internal" from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that mmap_region() can be made MM-internal - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU regressions and otherwise improves MGLRU performance - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park updates DAMON documentation - "Cleanup for memfd_create()" from Isaac Manjarres does that thing - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand provides various cleanups in the areas of hugetlb folios, THP folios and migration - "Uncached buffered IO" from Jens Axboe implements the new RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache reading and writing. To permite userspace to address issues with massive buildup of useless pagecache when reading/writing fast devices - "selftests/mm: virtual_address_range: Reduce memory" from Thomas Weißschuh fixes and optimizes some of the MM selftests" * tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm/compaction: fix UBSAN shift-out-of-bounds warning s390/mm: add missing ctor/dtor on page table upgrade kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags() tools: add VM_WARN_ON_VMG definition mm/damon/core: use str_high_low() helper in damos_wmark_wait_us() seqlock: add missing parameter documentation for raw_seqcount_try_begin() mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh mm/page_alloc: remove the incorrect and misleading comment zram: remove zcomp_stream_put() from write_incompressible_page() mm: separate move/undo parts from migrate_pages_batch() mm/kfence: use str_write_read() helper in get_access_type() selftests/mm/mkdirty: fix memory leak in test_uffdio_copy() kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags() selftests/mm: virtual_address_range: avoid reading from VM_IO mappings selftests/mm: vm_util: split up /proc/self/smaps parsing selftests/mm: virtual_address_range: unmap chunks after validation selftests/mm: virtual_address_range: mmap() without PROT_WRITE selftests/memfd/memfd_test: fix possible NULL pointer dereference mm: add FGP_DONTCACHE folio creation flag mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue ...
2025-01-26Merge tag 'mm-nonmm-stable-2025-01-24-23-16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Mainly individually changelogged singleton patches. The patch series in this pull are: - "lib min_heap: Improve min_heap safety, testing, and documentation" from Kuan-Wei Chiu provides various tightenings to the min_heap library code - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms some cleanup and Rust preparation in the xarray library code - "Update reference to include/asm-<arch>" from Geert Uytterhoeven fixes pathnames in some code comments - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses the new secs_to_jiffies() in various places where that is appropriate - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen switches two filesystems to the new mount API - "Convert ocfs2 to use folios" from Matthew Wilcox does that - "Remove get_task_comm() and print task comm directly" from Yafang Shao removes now-unneeded calls to get_task_comm() in various places - "squashfs: reduce memory usage and update docs" from Phillip Lougher implements some memory savings in squashfs and performs some maintainability work - "lib: clarify comparison function requirements" from Kuan-Wei Chiu tightens the sort code's behaviour and adds some maintenance work - "nilfs2: protect busy buffer heads from being force-cleared" from Ryusuke Konishi fixes an issues in nlifs when the fs is presented with a corrupted image - "nilfs2: fix kernel-doc comments for function return values" from Ryusuke Konishi fixes some nilfs kerneldoc - "nilfs2: fix issues with rename operations" from Ryusuke Konishi addresses some nilfs BUG_ONs which syzbot was able to trigger - "minmax.h: Cleanups and minor optimisations" from David Laight does some maintenance work on the min/max library code - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance work on the xarray library code" * tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (131 commits) ocfs2: use str_yes_no() and str_no_yes() helper functions include/linux/lz4.h: add some missing macros Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent Xarray: remove repeat check in xas_squash_marks() Xarray: distinguish large entries correctly in xas_split_alloc() Xarray: move forward index correctly in xas_pause() Xarray: do not return sibling entries from xas_find_marked() ipc/util.c: complete the kernel-doc function descriptions gcov: clang: use correct function param names latencytop: use correct kernel-doc format for func params minmax.h: remove some #defines that are only expanded once minmax.h: simplify the variants of clamp() minmax.h: move all the clamp() definitions after the min/max() ones minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp() minmax.h: reduce the #define expansion of min(), max() and clamp() minmax.h: update some comments minmax.h: add whitespace around operators and after commas nilfs2: do not update mtime of renamed directory that is not moved nilfs2: handle errors that nilfs_prepare_chunk() may return CREDITS: fix spelling mistake ...
2025-01-25mm/memblock: add memblock_alloc_or_panic interfaceGuo Weikang
Before SLUB initialization, various subsystems used memblock_alloc to allocate memory. In most cases, when memory allocation fails, an immediate panic is required. To simplify this behavior and reduce repetitive checks, introduce `memblock_alloc_or_panic`. This function ensures that memory allocation failures result in a panic automatically, improving code readability and consistency across subsystems that require this behavior. [guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic] Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/ Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390] Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25alloc_tag: avoid current->alloc_tag manipulations when profiling is disabledSuren Baghdasaryan
When memory allocation profiling is disabled there is no need to update current->alloc_tag and these manipulations add unnecessary overhead. Fix the overhead by skipping these extra updates. I ran comprehensive testing on Pixel 6 on Big, Medium and Little cores: Overhead before fixes Overhead after fixes slab alloc page alloc slab alloc page alloc Big 6.21% 5.32% 3.31% 4.93% Medium 4.51% 5.05% 3.79% 4.39% Little 7.62% 1.82% 6.68% 1.02% This is an allocation microbenchmark doing allocations in a tight loop. Not a really realistic scenario and useful only to make performance comparisons. Link: https://lkml.kernel.org/r/20241226211639.1357704-1-surenb@google.com Fixes: b951aaff5035 ("mm: enable page allocation tagging") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25mm: alloc_pages_bulk: rename APILuiz Capitulino
The previous commit removed the page_list argument from alloc_pages_bulk_noprof() along with the alloc_pages_bulk_list() function. Now that only the *_array() flavour of the API remains, we can do the following renaming (along with the _noprof() ones): alloc_pages_bulk_array -> alloc_pages_bulk alloc_pages_bulk_array_mempolicy -> alloc_pages_bulk_mempolicy alloc_pages_bulk_array_node -> alloc_pages_bulk_node Link: https://lkml.kernel.org/r/275a3bbc0be20fbe9002297d60045e67ab3d4ada.1734991165.git.luizcap@redhat.com Signed-off-by: Luiz Capitulino <luizcap@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25lib/list_debug.c: add object information in case of invalid objectManinder Singh
As of now during link list corruption it prints about cluprit address and its wrong value, but sometime it is not enough to catch the actual issue point. If it prints allocation and free path of that corrupted node, it will be a lot easier to find and fix the issues. Adding the same information when data mismatch is found in link list debug data: [ 14.243055] slab kmalloc-32 start ffff0000cda19320 data offset 32 pointer offset 8 size 32 allocated at add_to_list+0x28/0xb0 [ 14.245259] __kmalloc_cache_noprof+0x1c4/0x358 [ 14.245572] add_to_list+0x28/0xb0 ... [ 14.248632] do_el0_svc_compat+0x1c/0x34 [ 14.249018] el0_svc_compat+0x2c/0x80 [ 14.249244] Free path: [ 14.249410] kfree+0x24c/0x2f0 [ 14.249724] do_force_corruption+0xbc/0x100 ... [ 14.252266] el0_svc_common.constprop.0+0x40/0xe0 [ 14.252540] do_el0_svc_compat+0x1c/0x34 [ 14.252763] el0_svc_compat+0x2c/0x80 [ 14.253071] ------------[ cut here ]------------ [ 14.253303] list_del corruption. next->prev should be ffff0000cda192a8, but was 6b6b6b6b6b6b6b6b. (next=ffff0000cda19348) [ 14.254255] WARNING: CPU: 3 PID: 84 at lib/list_debug.c:65 __list_del_entry_valid_or_report+0x158/0x164 Moved prototype of mem_dump_obj() to bug.h, as mm.h can not be included in bug.h. Link: https://lkml.kernel.org/r/20241230101043.53773-1-maninder1.s@samsung.com Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Acked-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Rohit Thapliyal <r.thapliyal@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25test_maple_tree: test exhausted upper limit of mtree_alloc_cyclic()Liam R. Howlett
When the upper bound of the search is exhausted, the maple state may be returned in an error state of -EBUSY. This means maple state needs to be reset before the second search in mas_alloc_cylic() to ensure the search happens. This test ensures the issue is not recreated. Link: https://lkml.kernel.org/r/20241216190113.1226145-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Yang Erkun <yangerkun@huawei.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> says: Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24include/linux/lz4.h: add some missing macrosGao Xiang
Currently, LZ4_DISTANCE_MAX and LZ4_DECOMPRESS_INPLACE_MARGIN are defined in the erofs subsystem for LZ4 in-place decompression, which is somewhat unsuitable since they should belong to the LZ4 itself and may change with future LZ4 codebase updates. Move them to include/linux/lz4.h to match the upstream LZ4 library [1]. No logic changes. [1] https://github.com/lz4/lz4/blob/v1.10.0/lib/lz4.h#L670 Link: https://lkml.kernel.org/r/20250114130454.1191150-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Chao Yu <chao@kernel.org> Cc: Yue Hu <zbestahu@gmail.com> Cc; Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Sandeep Dhavale <dhavale@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Xarray: use xa_mark_t in xas_squash_marks() to keep code consistentKemeng Shi
Besides xas_squash_marks(), all functions use xa_mark_t type to iterate all possible marks. Use xa_mark_t in xas_squash_marks() to keep code consistent. Link: https://lkml.kernel.org/r/20241213122523.12764-6-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mattew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Xarray: remove repeat check in xas_squash_marks()Kemeng Shi
Caller of xas_squash_marks() has ensured xas->xa_sibs is non-zero. Just remove repeat check of xas->xa_sibs in xas_squash_marks(). Link: https://lkml.kernel.org/r/20241213122523.12764-5-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mattew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Xarray: distinguish large entries correctly in xas_split_alloc()Kemeng Shi
We don't support large entries which expand two more level xa_node in split. For case "xas->xa_shift + 2 * XA_CHUNK_SHIFT == order", we also need two level of xa_node to expand. Distinguish entry as large entry in case "xas->xa_shift + 2 * XA_CHUNK_SHIFT == order". As max order of folio in pagecache (MAX_PAGECACHE_ORDER) is <= (XA_CHUNK_SHIFT * 2 - 1), this change is more likely a cleanup... Link: https://lkml.kernel.org/r/20241213122523.12764-4-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mattew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Xarray: move forward index correctly in xas_pause()Kemeng Shi
After xas_load(), xas->index could point to mid of found multi-index entry and xas->index's bits under node->shift maybe non-zero. The afterward xas_pause() will move forward xas->index with xa->node->shift with bits under node->shift un-masked and thus skip some index unexpectedly. Consider following case: Assume XA_CHUNK_SHIFT is 4. xa_store_range(xa, 16, 31, ...) xa_store(xa, 32, ...) XA_STATE(xas, xa, 17); xas_for_each(&xas,...) xas_load(&xas) /* xas->index = 17, xas->xa_offset = 1, xas->xa_node->xa_shift = 4 */ xas_pause() /* xas->index = 33, xas->xa_offset = 2, xas->xa_node->xa_shift = 4 */ As we can see, index of 32 is skipped unexpectedly. Fix this by mask bit under node->xa_shift when move forward index in xas_pause(). For now, this will not cause serious problems. Only minor problem like cachestat return less number of page status could happen. Link: https://lkml.kernel.org/r/20241213122523.12764-3-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mattew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Xarray: do not return sibling entries from xas_find_marked()Kemeng Shi
Patch series "Fixes and cleanups to xarray", v5. This series contains some random fixes and cleanups to xarray. Patch 1-2 are fixes and patch 3-6 are cleanups. More details can be found in respective patches. This patch (of 5): Similar to issue fixed in commit cbc02854331ed ("XArray: Do not return sibling entries from xa_load()"), we may return sibling entries from xas_find_marked as following: Thread A: Thread B: xa_store_range(xa, entry, 6, 7, gfp); xa_set_mark(xa, 6, mark) XA_STATE(xas, xa, 6); xas_find_marked(&xas, 7, mark); offset = xas_find_chunk(xas, advance, mark); [offset is 6 which points to a valid entry] xa_store_range(xa, entry, 4, 7, gfp); entry = xa_entry(xa, node, 6); [entry is a sibling of 4] if (!xa_is_node(entry)) return entry; Skip sibling entry like xas_find() does to protect caller from seeing sibling entry from xas_find_marked() or caller may use sibling entry as a valid entry and crash the kernel. Besides, load_race() test is modified to catch mentioned issue and modified load_race() only passes after this fix is merged. Here is an example how this bug could be triggerred in tmpfs which enables large folio in mapping: Let's take a look at involved racer: 1. How pages could be created and dirtied in shmem file. write ksys_write vfs_write new_sync_write shmem_file_write_iter generic_perform_write shmem_write_begin shmem_get_folio shmem_allowable_huge_orders shmem_alloc_and_add_folios shmem_alloc_folio __folio_set_locked shmem_add_to_page_cache XA_STATE_ORDER(..., index, order) xax_store() shmem_write_end folio_mark_dirty() 2. How dirty pages could be deleted in shmem file. ioctl do_vfs_ioctl file_ioctl ioctl_preallocate vfs_fallocate shmem_fallocate shmem_truncate_range shmem_undo_range truncate_inode_folio filemap_remove_folio page_cache_delete xas_store(&xas, NULL); 3. How dirty pages could be lockless searched sync_file_range ksys_sync_file_range __filemap_fdatawrite_range filemap_fdatawrite_wbc do_writepages writeback_use_writepage writeback_iter writeback_get_folio filemap_get_folios_tag find_get_entry folio = xas_find_marked() folio_try_get(folio) Kernel will crash as following: 1.Create 2.Search 3.Delete /* write page 2,3 */ write ... shmem_write_begin XA_STATE_ORDER(xas, i_pages, index = 2, order = 1) xa_store(&xas, folio) shmem_write_end folio_mark_dirty() /* sync page 2 and page 3 */ sync_file_range ... find_get_entry folio = xas_find_marked() /* offset will be 2 */ offset = xas_find_chunk() /* delete page 2 and page 3 */ ioctl ... xas_store(&xas, NULL); /* write page 0-3 */ write ... shmem_write_begin XA_STATE_ORDER(xas, i_pages, index = 0, order = 2) xa_store(&xas, folio) shmem_write_end folio_mark_dirty(folio) /* get sibling entry from offset 2 */ entry = xa_entry(.., 2) /* use sibling entry as folio and crash kernel */ folio_try_get(folio) Link: https://lkml.kernel.org/r/20241213122523.12764-1-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20241213122523.12764-2-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mattew Wilcox <willy@infradead.org> [English fixes] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24lib/list_sort: clarify comparison function requirements in list_sort()Kuan-Wei Chiu
Add a detailed explanation in the list_sort() kernel doc comment specifying that the comparison function must satisfy antisymmetry and transitivity. These properties are essential for the sorting algorithm to produce correct results. Issues have arisen in the past [1][2][3][4] where comparison functions violated the transitivity property, causing sorting algorithms to fail to correctly order elements. While these requirements may seem straightforward, they are commonly misunderstood or overlooked, leading to bugs. Highlighting these properties in the documentation will help prevent such mistakes in the future. Link: https://lore.kernel.org/lkml/20240701205639.117194-1-visitorckw@gmail.com [1] Link: https://lore.kernel.org/lkml/20241203202228.1274403-1-visitorckw@gmail.com [2] Link: https://lore.kernel.org/lkml/20241209134226.1939163-1-visitorckw@gmail.com [3] Link: https://lore.kernel.org/lkml/20241209145728.1975311-1-visitorckw@gmail.com [4] Link: https://lkml.kernel.org/r/20250106170104.3137845-3-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> Cc: <chuang@cs.nycu.edu.tw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24lib/sort: clarify comparison function requirements in sort_r()Kuan-Wei Chiu
Patch series "lib: clarify comparison function requirements", v2. Add a detailed explanation in the sort_r/list_sort kernel doc comment specifying that the comparison function must satisfy antisymmetry and transitivity. These properties are essential for the sorting algorithm to produce correct results. Issues have arisen in the past [1][2][3][4] where comparison functions violated the transitivity property, causing sorting algorithms to fail to correctly order elements. While these requirements may seem straightforward, they are commonly misunderstood or overlooked, leading to bugs. Highlighting these properties in the documentation will help prevent such mistakes in the future. Link: https://lore.kernel.org/lkml/20240701205639.117194-1-visitorckw@gmail.com [1] Link: https://lore.kernel.org/lkml/20241203202228.1274403-1-visitorckw@gmail.com [2] Link: https://lore.kernel.org/lkml/20241209134226.1939163-1-visitorckw@gmail.com [3] Link: https://lore.kernel.org/lkml/20241209145728.1975311-1-visitorckw@gmail.com [4] This patch (of 2): Add a detailed explanation in the sort_r() kernel doc comment specifying that the comparison function must satisfy antisymmetry and transitivity. These properties are essential for the sorting algorithm to produce correct results. Issues have arisen in the past [1][2][3][4] where comparison functions violated the transitivity property, causing sorting algorithms to fail to correctly order elements. While these requirements may seem straightforward, they are commonly misunderstood or overlooked, leading to bugs. Highlighting these properties in the documentation will help prevent such mistakes in the future. Link: https://lkml.kernel.org/r/20250106170104.3137845-1-visitorckw@gmail.com Link: https://lore.kernel.org/lkml/20240701205639.117194-1-visitorckw@gmail.com [1] Link: https://lore.kernel.org/lkml/20241203202228.1274403-1-visitorckw@gmail.com [2] Link: https://lore.kernel.org/lkml/20241209134226.1939163-1-visitorckw@gmail.com [3] Link: https://lore.kernel.org/lkml/20241209145728.1975311-1-visitorckw@gmail.com [4] Link: https://lkml.kernel.org/r/20250106170104.3137845-2-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> Cc: <chuang@cs.nycu.edu.tw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Merge tag 'v6.14-p1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Remove physical address skcipher walking - Fix boot-up self-test race Algorithms: - Optimisations for x86/aes-gcm - Optimisations for x86/aes-xts - Remove VMAC - Remove keywrap Drivers: - Remove n2 Others: - Fixes for padata UAF - Fix potential rhashtable deadlock by moving schedule_work outside lock" * tag 'v6.14-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (75 commits) rhashtable: Fix rhashtable_try_insert test dt-bindings: crypto: qcom,inline-crypto-engine: Document the SM8750 ICE dt-bindings: crypto: qcom,prng: Document SM8750 RNG dt-bindings: crypto: qcom-qce: Document the SM8750 crypto engine crypto: asymmetric_keys - Remove unused key_being_used_for[] padata: avoid UAF for reorder_work padata: fix UAF in padata_reorder padata: add pd get/put refcnt helper crypto: skcipher - call cond_resched() directly crypto: skcipher - optimize initializing skcipher_walk fields crypto: skcipher - clean up initialization of skcipher_walk::flags crypto: skcipher - fold skcipher_walk_skcipher() into skcipher_walk_virt() crypto: skcipher - remove redundant check for SKCIPHER_WALK_SLOW crypto: skcipher - remove redundant clamping to page size crypto: skcipher - remove unnecessary page alignment of bounce buffer crypto: skcipher - document skcipher_walk_done() and rename some vars crypto: omap - switch from scatter_walk to plain offset crypto: powerpc/p10-aes-gcm - simplify handling of linear associated data crypto: bcm - Drop unused setting of local 'ptr' variable crypto: hisilicon/qm - support new function communication ...
2025-01-23Merge tag 'trace-ringbuffer-v6.14-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull trace fing buffer fix from Steven Rostedt: "Fix atomic64 operations on some architectures for the tracing ring buffer: - Have emulating atomic64 use arch_spin_locks instead of raw_spin_locks The tracing ring buffer events have a small timestamp that holds the delta between itself and the event before it. But this can be tricky to update when interrupts come in. It originally just set the deltas to zero for events that interrupted the adding of another event which made all the events in the interrupt have the same timestamp as the event it interrupted. This was not suitable for many tools, so it was eventually fixed. But that fix required adding an atomic64 cmpxchg on the timestamp in cases where an event was added while another event was in the process of being added. Originally, for 32 bit architectures, the manipulation of the 64 bit timestamp was done by a structure that held multiple 32bit words to hold parts of the timestamp and a counter. But as updates to the ring buffer were done, maintaining this became too complex and was replaced by the atomic64 generic operations which are now used by both 64bit and 32bit architectures. Shortly after that, it was reported that riscv32 and other 32 bit architectures that just used the generic atomic64 were locking up. This was because the generic atomic64 operations defined in lib/atomic64.c uses a raw_spin_lock() to emulate an atomic64 operation. The problem here was that raw_spin_lock() can also be traced by the function tracer (which is commonly used for debugging raw spin locks). Since the function tracer uses the tracing ring buffer, which now is being traced internally, this was triggering a recursion and setting off a warning that the spin locks were recusing. There's no reason for the code that emulates atomic64 operations to be using raw_spin_locks which have a lot of debugging infrastructure attached to them (depending on the config options). Instead it should be using the arch_spin_lock() which does not have any infrastructure attached to them and is used by low level infrastructure like RCU locks, lockdep and of course tracing. Using arch_spin_lock()s fixes this issue. - Do not trace in NMI if the architecture uses emulated atomic64 operations Another issue with using the emulated atomic64 operations that uses spin locks to emulate the atomic64 operations is that they cannot be used in NMI context. As an NMI can trigger while holding the atomic64 spin locks it can try to take the same lock and cause a deadlock. Have the ring buffer fail recording events if in NMI context and the architecture uses the emulated atomic64 operations" * tag 'trace-ringbuffer-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: atomic64: Use arch_spin_locks instead of raw_spin_locks ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg()
2025-01-23Merge tag 'bpf-next-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "A smaller than usual release cycle. The main changes are: - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai) In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile only mode. Half of the tests are failing, since support for btf_decl_tag is still WIP, but this is a great milestone. - Convert various samples/bpf to selftests/bpf/test_progs format (Alexis Lothoré and Bastien Curutchet) - Teach verifier to recognize that array lookup with constant in-range index will always succeed (Daniel Xu) - Cleanup migrate disable scope in BPF maps (Hou Tao) - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao) - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin KaFai Lau) - Refactor verifier lock support (Kumar Kartikeya Dwivedi) This is a prerequisite for upcoming resilient spin lock. - Remove excessive 'may_goto +0' instructions in the verifier that LLVM leaves when unrolls the loops (Yonghong Song) - Remove unhelpful bpf_probe_write_user() warning message (Marco Elver) - Add fd_array_cnt attribute for prog_load command (Anton Protopopov) This is a prerequisite for upcoming support for static_branch" * tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits) selftests/bpf: Add some tests related to 'may_goto 0' insns bpf: Remove 'may_goto 0' instruction in opt_remove_nops() bpf: Allow 'may_goto 0' instruction in verifier selftests/bpf: Add test case for the freeing of bpf_timer bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT bpf: Free element after unlock in __htab_map_lookup_and_delete_elem() bpf: Bail out early in __htab_map_lookup_and_delete_elem() bpf: Free special fields after unlock in htab_lru_map_delete_node() tools: Sync if_xdp.h uapi tooling header libbpf: Work around kernel inconsistently stripping '.llvm.' suffix bpf: selftests: verifier: Add nullness elision tests bpf: verifier: Support eliding map lookup nullness bpf: verifier: Refactor helper access type tracking bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write bpf: verifier: Add missing newline on verbose() call selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED libbpf: Fix return zero when elf_begin failed selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test veristat: Load struct_ops programs only once ...
2025-01-22Merge tag 'crc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be directly accessible via the library API, instead of requiring the crypto API. This is much simpler and more efficient. - Convert some users such as ext4 to use the CRC32 library API instead of the crypto API. More conversions like this will come later. - Add a KUnit test that tests and benchmarks multiple CRC variants. Remove older, less-comprehensive tests that are made redundant by this. - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm volunteering to maintain it. I have additional cleanups and optimizations planned for future cycles. * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits) MAINTAINERS: add entry for CRC library powerpc/crc: delete obsolete crc-vpmsum_test.c lib/crc32test: delete obsolete crc32test.c lib/crc16_kunit: delete obsolete crc16_kunit.c lib/crc_kunit.c: add KUnit test suite for CRC library functions powerpc/crc-t10dif: expose CRC-T10DIF function through lib arm64/crc-t10dif: expose CRC-T10DIF function through lib arm/crc-t10dif: expose CRC-T10DIF function through lib x86/crc-t10dif: expose CRC-T10DIF function through lib crypto: crct10dif - expose arch-optimized lib function lib/crc-t10dif: add support for arch overrides lib/crc-t10dif: stop wrapping the crypto API scsi: target: iscsi: switch to using the crc32c library f2fs: switch to using the crc32 library jbd2: switch to using the crc32c library ext4: switch to using the crc32c library lib/crc32: make crc32c() go directly to lib bcachefs: Explicitly select CRYPTO from BCACHEFS_FS x86/crc32: expose CRC32 functions through lib x86/crc32: update prototype for crc32_pclmul_le_16() ...
2025-01-22Merge tag 'linux_kselftest-kunit-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - fix struct completion warning - introduce autorun option - add fallback for os.sched_getaffinity - enable hardware acceleration when available * tag 'linux_kselftest-kunit-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Introduce autorun option kunit: enable hardware acceleration when available kunit: add fallback for os.sched_getaffinity kunit: platform: Resolve 'struct completion' warning
2025-01-22atomic64: Use arch_spin_locks instead of raw_spin_locksSteven Rostedt
raw_spin_locks can be traced by lockdep or tracing itself. Atomic64 operations can be used in the tracing infrastructure. When an architecture does not have true atomic64 operations it can use the generic version that disables interrupts and uses spin_locks. The tracing ring buffer code uses atomic64 operations for the time keeping. But because some architectures use the default operations, the locking inside the atomic operations can cause an infinite recursion. As atomic64 implementation is architecture specific, it should not be using raw_spin_locks() but instead arch_spin_locks as that is the purpose of arch_spin_locks. To be used in architecture specific implementations of generic infrastructure like atomic64 operations. Note, by switching from raw_spin_locks to arch_spin_locks, the locks taken to emulate the atomic64 operations will not have lockdep, mmio, or any kind of checks done on them. They will not even disable preemption, although the code will disable interrupts preventing the tasks that hold the locks from being preempted. As the locks held are done so for very short periods of time, and the logic is only done to emulate atomic64, not having them be instrumented should not be an issue. Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/20250122144311.64392baf@gandalf.local.home Fixes: c84897c0ff592 ("ring-buffer: Remove 32bit timestamp logic") Closes: https://lore.kernel.org/all/86fb4f86-a0e4-45a2-a2df-3154acc4f086@gaisler.com/ Reported-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>