path: root/lib
Age  Commit message  Author
2024-05-14  Merge tag 'timers-core-2024-05-12' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers and timekeeping updates from Thomas Gleixner: "Core code: - Make timekeeping and VDSO time readouts resilient against math overflow: In guest context the kernel is prone to math overflow when the host defers the timer interrupt due to overload, malfunction or malice. This can be mitigated by checking the clocksource delta for the maximum deferment which is readily available. If that value is exceeded then the code uses a slowpath function which can handle the multiplication overflow. This functionality is enabled unconditionally in the kernel, but made conditional in the VDSO code. The latter is conditional because it allows architectures to optimize the check so it does not cause performance regressions. On X86 this is achieved by reworking the existing check for negative TSC deltas as a negative delta obviously exceeds the maximum deferment when it is evaluated as an unsigned value. That avoids two conditionals in the hotpath and allows hiding both the negative delta and the large delta handling in the same slow path.
- Add an initial minimal ktime_t abstraction for Rust - The usual boring cleanups and enhancements Drivers: - Boring updates to device trees and trivial enhancements in various drivers" * tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) clocksource/drivers/arm_arch_timer: Mark hisi_161010101_oem_info const clocksource/drivers/timer-ti-dm: Remove an unused field in struct dmtimer clocksource/drivers/renesas-ostm: Avoid reprobe after successful early probe clocksource/drivers/renesas-ostm: Allow OSTM driver to reprobe for RZ/V2H(P) SoC dt-bindings: timer: renesas: ostm: Document Renesas RZ/V2H(P) SoC rust: time: doc: Add missing C header links clocksource: Make the int help prompt unit readable in ncurses hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active() timerqueue: Remove never used function timerqueue_node_expires() rust: time: Add Ktime vdso: Fix powerpc build U64_MAX undeclared error clockevents: Convert s[n]printf() to sysfs_emit() clocksource: Convert s[n]printf() to sysfs_emit() clocksource: Make watchdog and suspend-timing multiplication overflow safe timekeeping: Let timekeeping_cycles_to_ns() handle both under and overflow timekeeping: Make delta calculation overflow safe timekeeping: Prepare timekeeping_cycles_to_ns() for overflow safety timekeeping: Fold in timekeeping_delta_to_ns() timekeeping: Consolidate timekeeping helpers timekeeping: Refactor timekeeping helpers ...
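The overflow guard described in the timekeeping item above can be modelled in a few lines. This is a minimal sketch, not the kernel code: the function names, the way `max_safe_delta()` derives the limit, and the sample `mult`/`shift` values are all assumptions made for illustration. It shows a fast path that wraps at 64 bits, a slow path that keeps the full product, and why a negative delta viewed as unsigned lands in the slow path too.

```python
U64_MASK = (1 << 64) - 1  # models the width of a C u64


def fast_cycles_to_ns(delta, mult, shift):
    """Fast path: the product is truncated to 64 bits, as plain u64 math would be."""
    return ((delta * mult) & U64_MASK) >> shift


def slow_cycles_to_ns(delta, mult, shift):
    """Slow path: keep the full-width product before shifting."""
    return (delta * mult) >> shift


def max_safe_delta(mult):
    """Hypothetical limit: the largest delta whose product still fits in 64 bits."""
    return U64_MASK // mult


def cycles_to_ns(delta, mult, shift, max_delta):
    """Use the fast path only while delta cannot overflow the multiplication."""
    if delta <= max_delta:
        return fast_cycles_to_ns(delta, mult, shift)
    return slow_cycles_to_ns(delta, mult, shift)


def as_unsigned(value):
    """View a (possibly negative) delta as an unsigned 64-bit value."""
    return value & U64_MASK
```

With `mult = 4_000_000` and `shift = 22`, a delta just above `max_safe_delta(mult)` gives a wrong result on the fast path but the exact one through `cycles_to_ns()`. Because `as_unsigned(-5)` is far above any such limit, a single unsigned comparison covers both the negative and the oversized case.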
2024-05-13  Merge tag 'sched-core-2024-05-13' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Add cpufreq pressure feedback for the scheduler - Rework misfit load-balancing wrt affinity restrictions - Clean up and simplify the code around ::overutilized and ::overload access. - Simplify sched_balance_newidle() - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES handling that changed the output. - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch() - Reorganize, clean up and unify most of the higher level scheduler balancing function names around the sched_balance_*() prefix - Simplify the balancing flag code (sched_balance_running) - Miscellaneous cleanups & fixes * tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) sched/pelt: Remove shift of thermal clock sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() thermal/cpufreq: Remove arch_update_thermal_pressure() sched/cpufreq: Take cpufreq feedback into account cpufreq: Add a cpufreq pressure feedback for the scheduler sched/fair: Fix update of rd->sg_overutilized sched/vtime: Do not include <asm/vtime.h> header s390/irq,nmi: Include <asm/vtime.h> header directly s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover sched/vtime: Get rid of generic vtime_task_switch() implementation sched/vtime: Remove confusing arch_vtime_task_switch() declaration sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized() sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded() sched/fair: Rename root_domain::overload to ::overloaded sched/fair: Use helper functions to access root_domain::overload sched/fair: Check root_domain::overload value before update sched/fair: Combine EAS check with root_domain::overutilized access 
sched/fair: Simplify the continue_balancing logic in sched_balance_newidle() ...
2024-05-13  Merge tag 'hardening-6.10-rc1' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "The bulk of the changes here are related to refactoring and expanding the KUnit tests for string helper and fortify behavior. Some trivial strncpy replacements in fs/ were carried in my tree. Also some fixes to SCSI string handling were carried in my tree since the helper for those was introduced here. Beyond that, just little fixes all around: objtool getting confused about LKDTM+KCFI, preparing for future refactors (constification of sysctl tables, additional __counted_by annotations), a Clang UBSAN+i386 crash fix, and adding more options in the hardening.config Kconfig fragment. Summary: - selftests: Add str*cmp tests (Ivan Orlov) - __counted_by: provide UAPI for _le/_be variants (Erick Archer) - Various strncpy deprecation refactors (Justin Stitt) - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas Weißschuh) - UBSAN: Work around i386 -regparm=3 bug with Clang prior to version 19 - Provide helper to deal with non-NUL-terminated string copying - SCSI: Fix older string copying bugs (with new helper) - selftests: Consolidate string helper behavioral tests - selftests: add memcpy() fortify tests - string: Add additional __realloc_size() annotations for "dup" helpers - LKDTM: Fix KCFI+rodata+objtool confusion - hardening.config: Enable KCFI" * tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (29 commits) uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be} stackleak: Use a copy of the ctl_table argument string: Add additional __realloc_size() annotations for "dup" helpers kunit/fortify: Fix replaced failure path to unbreak __alloc_size hardening: Enable KCFI and some other options lkdtm: Disable CFI checking for perms functions kunit/fortify: Add memcpy() tests kunit/fortify: Do not spam logs with fortify WARNs kunit/fortify: Rename tests to use recommended conventions init: replace deprecated strncpy with
strscpy_pad kunit/fortify: Fix mismatched kvalloc()/vfree() usage scsi: qla2xxx: Avoid possible run-time warning with long model_num scsi: mpi3mr: Avoid possible run-time warning with long manufacturer strings scsi: mptfusion: Avoid possible run-time warning with long manufacturer strings fs: ecryptfs: replace deprecated strncpy with strscpy hfsplus: refactor copy_name to not use strncpy reiserfs: replace deprecated strncpy with scnprintf virt: acrn: replace deprecated strncpy with strscpy ubsan: Avoid i386 UBSAN handler crashes with Clang ubsan: Remove 1-element array usage in debug reporting ...
2024-05-13  Merge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux  Linus Torvalds
Pull block updates from Jens Axboe: - Add a partscan attribute in sysfs, fixing an issue with systemd relying on an internal interface that went away. - Attempt #2 at making long running discards interruptible. The previous attempt went into 6.9, but we ended up mostly reverting it as it had issues. - Remove old ida_simple API in bcache - Support for zoned write plugging, greatly improving the performance on zoned devices. - Remove the old throttle low interface, which has been experimental since 2017 and never made it beyond that and isn't being used. - Remove page->index debugging checks in brd, as it hasn't caught anything and prepares us for removing in struct page. - MD pull request from Song - Don't schedule block workers on isolated CPUs * tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits) blk-throttle: delay initialization until configuration blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW block: fix that util can be greater than 100% block: support to account io_ticks precisely block: add plug while submitting IO bcache: fix variable length array abuse in btree_iter bcache: Remove usage of the deprecated ida_simple_xx() API md: Revert "md: Fix overflow in is_mddev_idle" blk-lib: check for kill signal in ioctl BLKDISCARD block: add a bio_await_chain helper block: add a blk_alloc_discard_bio helper block: add a bio_chain_and_submit helper block: move discard checks into the ioctl handler block: remove the discard_granularity check in __blkdev_issue_discard block/ioctl: prefer different overflow check null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION() block: fix and simplify blkdevparts= cmdline parsing block: refine the EOF check in blkdev_iomap_begin block: add a partscan sysfs attribute for disks block: add a disk_has_partscan helper ...
2024-05-13  Merge tag 'tpmdd-next-6.10-rc1' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull TPM updates from Jarkko Sakkinen: "These are the changes for the TPM driver with a single major new feature: TPM bus encryption and integrity protection. The key pair on TPM side is generated from so called null random seed per power on of the machine [1]. This supports the TPM encryption of the hard drive by adding layer of protection against bus interposer attacks. Other than that, a few minor fixes and documentation for tpm_tis to clarify basics of TPM localities for future patch review discussions (will be extended and refined over times, just a seed)" Link: https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ [1] * tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (28 commits) Documentation: tpm: Add TPM security docs toctree entry tpm: disable the TPM if NULL name changes Documentation: add tpm-security.rst tpm: add the null key name as a sysfs export KEYS: trusted: Add session encryption protection to the seal/unseal path tpm: add session encryption protection to tpm2_get_random() tpm: add hmac checks to tpm2_pcr_extend() tpm: Add the rest of the session HMAC API tpm: Add HMAC session name/handle append tpm: Add HMAC session start and end functions tpm: Add TCG mandated Key Derivation Functions (KDFs) tpm: Add NULL primary creation tpm: export the context save and load commands tpm: add buffer function to point to returned parameters crypto: lib - implement library version of AES in CFB mode KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers tpm: Add tpm_buf_read_{u8,u16,u32} tpm: TPM2B formatted buffers tpm: Store the length of the tpm_buf data separately. tpm: Update struct tpm_buf documentation comments ...
2024-05-13  Merge tag 'slab-for-6.10' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: "This time it's mostly random cleanups and fixes, with two performance fixes that might have significant impact, but limited to systems experiencing particular bad corner case scenarios rather than general performance improvements. The memcg hook changes are going through the mm tree due to dependencies. - Prevent stalls when reading /proc/slabinfo (Jianfeng Wang) This fixes the long-standing problem that can happen with workloads that have alloc/free patterns resulting in many partially used slabs (in e.g. dentry cache). Reading /proc/slabinfo will traverse the long partial slab list under spinlock with disabled irqs and thus can stall other processes or even trigger the lockup detection. The traversal is only done to count free objects so that <active_objs> column can be reported along with <num_objs>. To avoid affecting fast paths with another shared counter (attempted in the past) or complex partial list traversal schemes that allow rescheduling, the chosen solution resorts to approximation - when the partial list is over 10000 slabs long, we will only traverse the first 5000 slabs from head and tail each and use the average of those to estimate the whole list. Both head and tail are used because the slabs near the head tend to have more free objects than the slabs towards the tail. It is expected the approximation should not break existing /proc/slabinfo consumers. The <num_objs> field is still accurate and reflects the overall kmem_cache footprint. The <active_objs> was already imprecise due to cpu and percpu-partial slabs, so can't be relied upon to determine exact cache usage. The difference between <active_objs> and <num_objs> is mainly useful to determine the slab fragmentation, and that will be possible even with the approximation in place.
- Prevent allocating many slabs when a NUMA node is full (Chen Jun) Currently, on NUMA systems with a node under significantly bigger pressure than other nodes, the fallback strategy may cause each kmalloc_node() that can't be satisfied from the preferred node to allocate a new slab on a fallback node instead of reusing the slabs already on that node's partial list. This is now fixed and partial lists of fallback nodes are checked even for kmalloc_node() allocations. It's still preferred to allocate a new slab on the requested node before a fallback, but only with a GFP_NOWAIT attempt, which will fail quickly when the node is under a significant memory pressure. - More SLAB removal related cleanups (Xiu Jianfeng, Hyunmin Lee) - Fix slub_kunit self-test with hardened freelists (Guenter Roeck) - Mark racy accesses for KCSAN (linke li) - Misc cleanups (Xiongwei Song, Haifeng Xu, Sangyun Kim)" * tag 'slab-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slub: remove the check for NULL kmalloc_caches mm/slub: create kmalloc 96 and 192 caches regardless cache size order mm/slub: mark racy access on slab->freelist slub: use count_partial_free_approx() in slab_out_of_memory() slub: introduce count_partial_free_approx() slub: Set __GFP_COMP in kmem_cache by default mm/slub: remove duplicate initialization for early_kmem_cache_node_alloc() mm/slub: correct comment in do_slab_free() mm/slub, kunit: Use inverted data to corrupt kmem cache mm/slub: simplify get_partial_node() mm/slub: add slub_get_cpu_partial() helper mm/slub: remove the check of !kmem_cache_has_cpu_partial() mm/slub: Reduce memory consumption in extreme scenarios mm/slub: mark racy accesses on slab->slabs mm/slub: remove dummy slabinfo functions
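The /proc/slabinfo approximation described in the slab pull request can be sketched as follows. This is only a model of the idea, assuming the partial list is available as a Python list of per-slab free-object counts. The function and constant names are made up for illustration and the kernel's actual helper may differ in detail.

```python
MAX_PARTIAL_TO_SCAN = 10000  # threshold quoted in the pull request text


def count_free_approx(free_per_slab, limit=MAX_PARTIAL_TO_SCAN):
    """Estimate the total number of free objects on a partial list.

    Short lists are counted exactly. For long lists only limit/2 slabs
    from the head and limit/2 from the tail are visited, and the sample
    is scaled up to the length of the whole list.
    """
    total_slabs = len(free_per_slab)
    if total_slabs <= limit:
        return sum(free_per_slab)

    half = limit // 2
    scanned = sum(free_per_slab[:half]) + sum(free_per_slab[-half:])
    # Scale the sampled count by (whole list length / slabs visited).
    return scanned * total_slabs // limit
```

Sampling both ends matters because, as the message notes, slabs near the head tend to have more free objects than slabs near the tail. Scaling a head-only sample would overestimate the total.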
2024-05-13  Merge tag 'cmpxchg.2024.05.11a' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull cmpxchg updates from Paul McKenney: "Provide one-byte and two-byte cmpxchg() support on sparc32, parisc, and csky This provides native one-byte and two-byte cmpxchg() support for sparc32 and parisc, courtesy of Al Viro. This support is provided by the same hashed-array-of-locks technique used for the other atomic operations provided for these two platforms. There is also emulated one-byte cmpxchg() support for csky using a new cmpxchg_emu_u8() function that uses a four-byte cmpxchg() to emulate the one-byte variant. Similar patches for emulation of one-byte cmpxchg() for arc, sh, and xtensa have not yet received maintainer acks, so they are slated for the v6.11 merge window" * tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: csky: Emulate one-byte cmpxchg lib: Add one-byte emulation function parisc: add u16 support to cmpxchg() parisc: add missing export of __cmpxchg_u8() parisc: unify implementations of __cmpxchg_u{8,32,64} parisc: __cmpxchg_u32(): lift conversion into the callers sparc32: add __cmpxchg_u{8,16}() and teach __cmpxchg() to handle those sizes sparc32: unify __cmpxchg_u{32,64} sparc32: make the first argument of __cmpxchg_u64() volatile u64 * sparc32: make __cmpxchg_u32() return u32
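The emulation technique mentioned for csky, a one-byte cmpxchg() built on a four-byte one, can be illustrated with a small model. The sketch below is not the kernel's cmpxchg_emu_u8() source. It assumes a little-endian `bytearray` as memory and uses a lock to stand in for the atomicity of the native four-byte operation.

```python
import struct
import threading

_atomic = threading.Lock()  # stands in for the hardware's atomicity guarantee


def cmpxchg_u32(mem, addr, old, new):
    """Model of a native 4-byte compare-and-exchange on an aligned address."""
    with _atomic:
        (cur,) = struct.unpack_from("<I", mem, addr)
        if cur == old:
            struct.pack_into("<I", mem, addr, new)
        return cur


def cmpxchg_emu_u8(mem, addr, old, new):
    """Emulate a 1-byte cmpxchg using the 4-byte primitive on the containing word."""
    word_addr = addr & ~3          # aligned word that contains the byte
    shift = (addr & 3) * 8         # little-endian position of the byte
    mask = 0xFF << shift
    while True:
        (word,) = struct.unpack_from("<I", mem, word_addr)
        cur = (word & mask) >> shift
        if cur != old:
            return cur             # comparison failed, nothing is written
        new_word = (word & ~mask) | (new << shift)
        # Retry if a neighbouring byte changed between the read and the swap.
        if cmpxchg_u32(mem, word_addr, word, new_word) == word:
            return old
```

The retry loop is the important part: the four-byte exchange can fail because another byte in the same word changed, even though the target byte still matches.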
2024-05-10  Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM fixes from Andrew Morton: "18 hotfixes, 7 of which are cc:stable. More fixups for this cycle's page_owner updates. And a few userfaultfd fixes. Otherwise, random singletons - see the individual changelogs for details" * tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add entry for Barry Song selftests/mm: fix powerpc ARCH check mailmap: add entry for John Garry XArray: set the marks correctly when splitting an entry selftests/vDSO: fix runtime errors on LoongArch selftests/vDSO: fix building errors on LoongArch mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry() fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan mm/vmalloc: fix return value of vb_alloc if size is 0 mm: use memalloc_nofs_save() in page_cache_ra_order() kmsan: compiler_types: declare __no_sanitize_or_inline lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add() tools: fix userspace compilation with new test_xarray changes MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER mm: page_owner: fix wrong information in dump_page_owner maple_tree: fix mas_empty_area_rev() null pointer dereference mm/userfaultfd: reset ptes when close() for wr-protected ones
2024-05-10  kbuild: use $(src) instead of $(srctree)/$(src) for source directory  Masahiro Yamada
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj)
When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings:
  $(obj) - directory in the object tree
  $(src) - directory in the source tree (changed by this commit)
  $(objtree) - the top of the kernel object tree
  $(srctree) - the top of the kernel source tree
Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-09  crypto: lib - implement library version of AES in CFB mode  Ard Biesheuvel
Implement AES in CFB mode using the existing, mostly constant-time generic AES library implementation. This will be used by the TPM code to encrypt communications with TPM hardware, which is often a discrete component connected using sniffable wires or traces. While a CFB template does exist, using a skcipher is a major pain for non-performance critical synchronous crypto where the algorithm is known at compile time and the data is in contiguous buffers with valid kernel virtual addresses. Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lore.kernel.org/all/20230216201410.15010-1-James.Bottomley@HansenPartnership.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
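For readers unfamiliar with CFB, the chaining itself is short enough to sketch. The code below is not the kernel library and does not use AES: `block_encrypt()` is a stand-in keyed function built on BLAKE2b, chosen only because the Python standard library has no AES. It works here because CFB only ever runs the block function in the forward direction, for both encryption and decryption. All names are illustrative.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def block_encrypt(key, block):
    """Stand-in for AES block encryption (NOT AES, illustration only)."""
    return hashlib.blake2b(block, key=key, digest_size=BLOCK).digest()


def _xor(data, keystream):
    # zip() stops at the shorter input, which handles a short final block.
    return bytes(a ^ b for a, b in zip(data, keystream))


def cfb_encrypt(key, iv, plaintext):
    """C_i = P_i XOR E_K(C_{i-1}), with C_0 = IV."""
    out = bytearray()
    feedback = iv
    for i in range(0, len(plaintext), BLOCK):
        chunk = _xor(plaintext[i:i + BLOCK], block_encrypt(key, feedback))
        out += chunk
        feedback = chunk  # the ciphertext block feeds the next step
    return bytes(out)


def cfb_decrypt(key, iv, ciphertext):
    """P_i = C_i XOR E_K(C_{i-1}); note the block function is still 'encrypt'."""
    out = bytearray()
    feedback = iv
    for i in range(0, len(ciphertext), BLOCK):
        chunk = ciphertext[i:i + BLOCK]
        out += _xor(chunk, block_encrypt(key, feedback))
        feedback = chunk
    return bytes(out)
```

Because the data is XORed with a keystream, the ciphertext has the same length as the plaintext and no padding is needed, which suits contiguous in-memory buffers such as the ones the commit message describes.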
2024-05-09  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 35d92abfbad8 ("net: hns3: fix kernel crash when devlink reload during initialization") 2a1a1a7b5fd7 ("net: hns3: add command queue trace for hns3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-09  bitmap: relax find_nth_bit() limitation on return value  Yury Norov
The function claims to return the bitmap size, if Nth bit doesn't exist. This rule is violated in inline case because the fns() that is used there doesn't know anything about size of the bitmap. So, relax this requirement to '>= size', and make the outline implementation a bit cheaper. All in-tree kernel users of find_nth_bit() are safe against that. Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Closes: https://lore.kernel.org/all/Zi50cAgR8nZvgLa3@yury-ThinkPad/T/#m6da806a0525e74dcc91f35e5f20766ed4e853e8a Signed-off-by: Yury Norov <yury.norov@gmail.com>
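What the relaxed contract means for callers can be shown with a single-word model. The sketch below is an assumption-laden illustration, not the kernel implementation: `fns()` here is a Python model of "find the N-th set bit in a word", and `find_nth_bit_small()` is a hypothetical name for the inline single-word case.

```python
BITS_PER_LONG = 64


def fns(word, n):
    """Return the index of the n-th (0-based) set bit, or BITS_PER_LONG if absent."""
    while word:
        bit = (word & -word).bit_length() - 1  # index of the lowest set bit
        if n == 0:
            return bit
        word &= word - 1                       # clear the lowest set bit
        n -= 1
    return BITS_PER_LONG


def find_nth_bit_small(bitmap, size, n):
    """Single-word lookup: the 'not found' result is some value >= size."""
    masked = bitmap & ((1 << size) - 1)
    return fns(masked, n)  # fns() knows nothing about 'size'


def has_nth_bit(bitmap, size, n):
    """Callers should test '>= size', never '== size'."""
    return find_nth_bit_small(bitmap, size, n) < size
```

For an 8-bit bitmap `0b10110`, asking for the fourth set bit yields `BITS_PER_LONG` rather than 8, so a caller comparing against `== size` would wrongly treat the result as a valid bit index.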
2024-05-09  lib: make test_bitops compilable into the kernel image  Yury Norov
The test can currently only be built as a module, and there is no technical reason for that restriction. Now that the test includes some performance benchmarks, it is reasonable to run it at kernel load time, before userspace starts, to reduce possible jitter. Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09  lib/test_bitops: Add benchmark test for fns()  Kuan-Wei Chiu
Introduce a benchmark test for fns(). It measures the total time taken by fns() to process 10,000 test values generated using get_random_bytes() for each n in the range [0, BITS_PER_LONG). Example: test_bitops: fns: 7637268 ns CC: Andrew Morton <akpm@linux-foundation.org> CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> CC: David Laight <David.Laight@aculab.com> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Suggested-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-08  closures: closure_sync_timeout()  Kent Overstreet
Add closure_sync_timeout(), a new variant of closure_sync() that takes a timeout. Note that when this returns -ETIME the closure will still be waiting on something, i.e. it's not safe to return if you've got a stack allocated closure. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  kfifo: don't use "proxy" headers  Andy Shevchenko
Update header inclusions to follow IWYU (Include What You Use) principle. Link: https://lkml.kernel.org/r/20240423192529.3249134-4-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Alain Volmat <alain.volmat@foss.st.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Patrice Chotard <patrice.chotard@foss.st.com> Cc: Rob Herring <robh@kernel.org> Cc: Samuel Holland <samuel@sholland.org> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Sean Young <sean@mess.org> Cc: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-07  lib: Allow for the DIM library to be modular  Florian Fainelli
Allow the Dynamic Interrupt Moderation (DIM) library to be built as a module. This is particularly useful in an Android GKI (Google Kernel Image) configuration where everything is built as a module, including Ethernet controller drivers. Having to build DIMLIB into the kernel image with potentially no user is wasteful. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20240506175040.410446-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-06  kunit: bail out early in __kunit_test_suites_init() if there are no suites ↵  Scott Mayhew
to test Commit c72a870926c2 added a mutex to prevent kunit tests from running concurrently. Unfortunately that mutex gets locked during module load regardless of whether the module actually has any kunit tests. This causes a problem for kunit tests that might need to load other kernel modules (e.g. gss_krb5_test loading the camellia module). So check to see if there are actually any tests to run before locking the kunit_run_lock mutex. Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs") Reported-by: Nico Pache <npache@redhat.com> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
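The ordering problem described above can be reproduced in miniature. This is a userspace model under stated assumptions: `kunit_run_lock` is borrowed from the message, while `run_suites()` and `load_dependency_module()` are hypothetical names. A non-reentrant lock plays the role of the mutex, so taking it again from inside a running test would deadlock.

```python
import threading

kunit_run_lock = threading.Lock()  # non-reentrant, like a kernel mutex


def run_suites(suites, run_one):
    """Run test suites under the lock, but bail out first if there are none."""
    if not suites:
        return 0  # nothing to run, so the lock is never touched
    with kunit_run_lock:
        for suite in suites:
            run_one(suite)
    return len(suites)


def load_dependency_module():
    """Model of loading a module that carries no KUnit suites at all."""
    return run_suites([], lambda suite: None)
```

A test that loads such a module while it is running, for example `run_suites(["gss_krb5_test"], lambda s: load_dependency_module())`, completes normally. Without the early return, the nested call would block forever on the lock already held by the outer run.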
2024-05-06  kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER  Ivan Orlov
Use KUNIT_DEFINE_ACTION_WRAPPER macro to define the 'kfree' and 'string_stream_destroy' wrappers for kunit_add_action. Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Reviewed-by: Rae Moar <rmoar@google.com> Acked-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option  David Gow
The NULL dereference tests in kunit_fault deliberately trigger a kernel BUG(), and therefore print the associated stack trace, even when the test passes. This is both annoying (as it bloats the test output), and can confuse some test harnesses, which assume any BUG() is a failure. Allow these tests to be specifically disabled (without disabling all of KUnit's other tests), by placing them behind the CONFIG_KUNIT_FAULT_TEST Kconfig option. This is enabled by default, but can be set to 'n' to disable the test. An empty 'kunit_fault' suite is left behind, which will automatically be marked 'skipped'. As the fault tests already were disabled under UML (as they weren't compatible with its fault handling), we can simply adapt those conditions, and add a dependency on !UML for our new option. Suggested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/928249cc-e027-4f7f-b43f-502f99a1ea63@roeck-us.net/ Fixes: 82b0beff3497 ("kunit: Add tests for fault") Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Mickaël Salaün <mic@digikod.net> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: unregister the device on error  Wander Lairson Costa
kunit_init_device() should unregister the device on bus register error, but mistakenly it tries to unregister the bus. Unregister the device instead of the bus. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Fixes: d03c720e03bd ("kunit: Add APIs for managing devices") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: Fix race condition in try-catch completion  David Gow
KUnit's try-catch infrastructure now uses vfork_done, which is always set to a valid completion when a kthread is created, but which is set to NULL once the thread terminates. This creates a race condition, where the kthread exits before we can wait on it. Keep a copy of vfork_done, which is taken before we wake_up_process() and so valid, and wait on that instead. Fixes: 93533996100c ("kunit: Handle test faults") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/lkml/20240410102710.35911-1-naresh.kamboju@linaro.org/ Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Acked-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Tested-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: Add tests for fault  Mickaël Salaün
Add a test case to check NULL pointer dereference and make sure it results in a failed test. The full kunit_fault test suite is marked as skipped when run on UML because it would result in a kernel panic. Tested with: ./tools/testing/kunit/kunit.py run --arch x86_64 kunit_fault ./tools/testing/kunit/kunit.py run --arch arm64 \ --cross_compile=aarch64-linux-gnu- kunit_fault Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-8-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: Print last test location on fault  Mickaël Salaün
This helps identify the location of test faults with opportunistic calls to _KUNIT_SAVE_LOC(). This can be useful while writing tests or debugging them. It is possible to call KUNIT_SUCCESS() to explicitly save the last location. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-7-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests  Mickaël Salaün
Fix KUNIT_SUCCESS() calls to pass a test argument. This is a no-op for now because this macro does nothing, but it will be required for the next commit. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Rae Moar <rmoar@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-6-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06  kunit: Handle test faults  Mickaël Salaün
Previously, when a kernel test thread crashed (e.g. NULL pointer dereference, general protection fault), the KUnit test hung for 30 seconds and exited with a timeout error. Fix this issue by waiting on task_struct->vfork_done instead of the custom kunit_try_catch.try_completion, and track the execution state by initially setting try_result with -EINTR and only setting it to 0 if the test passed. Fix kunit_generic_run_threadfn_adapter() signature by returning 0 instead of calling kthread_complete_and_exit(). Because the thread's exit code is never checked, always set it to 0 to make it clear. To make this explicit, export kthread_exit() for KUnit tests built as a module. Fix the -EINTR error message, which couldn't be reached until now. This is tested by a following patch. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Tested-by: Rae Moar <rmoar@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-5-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
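The state tracking described in this commit follows a general pattern: assume failure up front, and only record success once the test body has returned. A userspace sketch of that pattern, with an exception standing in for a kernel fault and `threading.Thread.join()` standing in for waiting on thread exit, might look like this. The names `run_try()` and `crash()` are hypothetical and nothing here is KUnit API.

```python
import errno
import threading


def run_try(test_fn, timeout=1.0):
    """Run test_fn in a thread and report 0, -EINTR or -ETIMEDOUT."""
    state = {"try_result": -errno.EINTR}  # pessimistic default

    def worker():
        try:
            test_fn()
        except BaseException:
            return  # "crash": the thread ends, try_result stays -EINTR
        state["try_result"] = 0  # reached only if the test body returned

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # Wait for the thread itself to end rather than for a signal the test
    # must send, so a crashed test is noticed immediately.
    thread.join(timeout)
    if thread.is_alive():
        return -errno.ETIMEDOUT
    return state["try_result"]


def crash():
    """Simulated fault in the test body."""
    raise RuntimeError("simulated NULL pointer dereference")
```

With this shape, a crashing test is reported as interrupted as soon as its thread ends, instead of being misreported as a timeout after the full waiting period.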
2024-05-06  kunit: Fix timeout message  Mickaël Salaün
The exit code is always checked, so let's properly handle the -ETIMEDOUT error code. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-4-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06kunit: Fix kthread referenceMickaël Salaün
There is a race condition when a kthread finishes after the deadline and before the call to kthread_stop(), which may lead to use after free. Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Fixes: adf505457032 ("kunit: fix UAF when run kfence test case test_gfpzero") Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-3-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06kunit: Handle thread creation errorMickaël Salaün
Previously, if a thread creation failed (e.g. -ENOMEM), the function was called (kunit_catch_run_case or kunit_catch_run_case_cleanup) without marking the test as failed. Instead, fill try_result with the error code returned by kthread_run(), which will mark the test as failed and print "internal error occurred...". Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Link: https://lore.kernel.org/r/20240408074625.65017-2-mic@digikod.net Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-05xarray: inline xas_descend to improve performanceLong Li
The commit 63b1898fffcd ("XArray: Disallow sibling entries of nodes") increased the size of xas_descend(), and as a result the compiler no longer inlines it. This had a negative impact on performance: xas_descend() is called frequently to traverse downwards in the xarray tree, making it a hot function. Inlining xas_descend() improves performance by approximately 4.95% in the iozone write test.

Machine: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
#iozone i 0 -i 1 -s 64g -r 16m -f /test/tmptest

Before this patch:
        kB  reclen    write  rewrite     read   reread
  67108864   16384  2230080  3637689  6315197  5496027

After this patch:
        kB  reclen    write  rewrite     read   reread
  67108864   16384  2340360  3666175  6272401  5460782

Percentage change:    4.95%    0.78%   -0.68%   -0.64%

This patch introduces inlining to the xas_descend function. While this change increases the size of lib/xarray.o, the performance gains in critical workloads make this an acceptable trade-off.

Size comparison before and after patch:
  .text   .data  .bss  file
  0x3502  0      0     lib/xarray.o.before
  0x3602  0      0     lib/xarray.o.after

Link: https://lkml.kernel.org/r/20240416061628.3768901-1-leo.lilong@huawei.com Signed-off-by: Long Li <leo.lilong@huawei.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: yangerkun <yangerkun@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
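The percentage figures quoted in the commit above follow from the before/after throughput numbers. A small sketch of that arithmetic, where percent_change() is a hypothetical helper introduced only for illustration:

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double percent_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

For the write column, percent_change(2230080, 2340360) is about 4.945, which rounds to the reported 4.95%. The .text growth is 0x3602 - 0x3502 = 0x100, i.e. 256 bytes.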
2024-05-05XArray: set the marks correctly when splitting an entryMatthew Wilcox (Oracle)
If we created a new node to replace an entry which had search marks set, we were setting the search mark on every entry in that node. That works fine when we're splitting to order 0, but when splitting to a larger order, we must not set the search marks on the sibling entries. Link: https://lkml.kernel.org/r/20240501153120.4094530-1-willy@infradead.org Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/ZjFGCOYk3FK_zVy3@bombadil.infradead.org Tested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()Luis Chamberlain
While testing lib/test_xarray in userspace I've noticed we can fail with: make -C tools/testing/radix-tree ./tools/testing/radix-tree/xarray BUG at check_xa_multi_store_adv_add:749 xarray: 0x55905fb21a00x head 0x55905fa1d8e0x flags 0 marks 0 0 0 0: 0x55905fa1d8e0x xarray: ../../../lib/test_xarray.c:749: check_xa_multi_store_adv_add: Assertion `0' failed. Aborted We get a failure with a BUG_ON(), and that is because we actually can fail due to -ENOMEM, the check in xas_nomem() will fix this for us so it makes no sense to expect no failure inside the loop. So modify the check and since this is also useful for instructional purposes clarify the situation. The check for XA_BUG_ON(xa, xa_load(xa, index) != p) is already done at the end of the loop so just remove the bogus one inside the loop. With this we now pass the test in both kernel and userspace: In userspace: ./tools/testing/radix-tree/xarray XArray: 149092856 of 149092856 tests passed In kernel space: XArray: 148257077 of 148257077 tests passed Link: https://lkml.kernel.org/r/20240423192221.301095-3-mcgrof@kernel.org Fixes: a60cc288a1a2 ("test_xarray: add tests for advanced multi-index use") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05maple_tree: fix mas_empty_area_rev() null pointer dereferenceLiam R. Howlett
Currently the code calls mas_start() followed by mas_data_end() if the maple state is MA_START, but mas_start() may return with the maple state node == NULL. This will lead to a null pointer dereference when checking information in the NULL node, which is done in mas_data_end(). Avoid setting the offset if there is no node by waiting until after the maple state is checked for an empty or single entry state. A user could trigger the events to cause a kernel oops by unmapping all vmas to produce an empty maple tree, then mapping a vma that would cause the scenario described above. Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Marius Fleischer <fleischermarius@gmail.com> Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/ Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/ Tested-by: Marius Fleischer <fleischermarius@gmail.com> Tested-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05Merge tag 'char-misc-6.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc/other driver fixes and new device ids for 6.9-rc7 that resolve some reported problems. Included in here are: - iio driver fixes - mei driver fix and new device ids - dyndbg bugfix - pvpanic-pci driver bugfix - slimbus driver bugfix - fpga new device id All have been in linux-next with no reported problems" * tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: slimbus: qcom-ngd-ctrl: Add timeout for wait operation dyndbg: fix old BUG_ON in >control parser misc/pvpanic-pci: register attributes via pci_driver fpga: dfl-pci: add PCI subdevice ID for Intel D5005 card mei: me: add lunar lake point M DID mei: pxp: match against PCI_CLASS_DISPLAY_OTHER iio:imu: adis16475: Fix sync mode setting iio: accel: mxc4005: Reset chip on probe() and resume() iio: accel: mxc4005: Interrupt handling fixes dt-bindings: iio: health: maxim,max30102: fix compatible check iio: pressure: Fixes SPI support for BMP3xx devices iio: pressure: Fixes BME280 SPI driver data
2024-05-02wrapper for access to ->bd_partnoAl Viro
On the next step it's going to get folded into a field where flags will go. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-05-02Use bdev_is_partition() instead of open-coding itAl Viro
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: include/linux/filter.h kernel/bpf/core.c 66e13b615a0c ("bpf: verifier: prevent userspace memory access") d503a04f8bc0 ("bpf: Add support for certain atomics in bpf_arena to x86 JIT") https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02Merge tag 'net-6.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf. Relatively calm week, likely due to public holiday in most places. No known outstanding regressions. Current release - regressions: - rxrpc: fix wrong alignmask in __page_frag_alloc_align() - eth: e1000e: change usleep_range to udelay in PHY mdic access Previous releases - regressions: - gro: fix udp bad offset in socket lookup - bpf: fix incorrect runtime stat for arm64 - tipc: fix UAF in error path - netfs: fix a potential infinite loop in extract_user_to_sg() - eth: ice: ensure the copied buf is NUL terminated - eth: qeth: fix kernel panic after setting hsuid Previous releases - always broken: - bpf: - verifier: prevent userspace memory access - xdp: use flags field to disambiguate broadcast redirect - bridge: fix multicast-to-unicast with fraglist GSO - mptcp: ensure snd_nxt is properly initialized on connect - nsh: fix outer header access in nsh_gso_segment(). - eth: bcmgenet: fix racing registers access - eth: vxlan: fix stats counters. Misc: - a bunch of MAINTAINERS file updates" * tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) MAINTAINERS: mark MYRICOM MYRI-10G as Orphan MAINTAINERS: remove Ariel Elior net: gro: add flush check in udp_gro_receive_segment net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb ipv4: Fix uninit-value access in __ip_make_skb() s390/qeth: Fix kernel panic after setting hsuid vxlan: Pull inner IP header in vxlan_rcv(). 
tipc: fix a possible memleak in tipc_buf_append tipc: fix UAF in error path rxrpc: Clients must accept conn from any address net: core: reject skb_copy(_expand) for fraglist GSO skbs net: bridge: fix multicast-to-unicast with fraglist GSO mptcp: ensure snd_nxt is properly initialized on connect e1000e: change usleep_range to udelay in PHY mdic access net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341 cxgb4: Properly lock TX queue for the selftest. rxrpc: Fix using alignmask being zero for __page_frag_alloc_align() vxlan: Add missing VNI filter counter update in arp_reduce(). vxlan: Fix racy device stats updates. net: qede: use return from qede_parse_actions() ...
2024-05-02string: Add additional __realloc_size() annotations for "dup" helpersKees Cook
Several other "dup"-style interfaces could use the __realloc_size() attribute. (As a reminder to myself and others: "realloc" is used here instead of "alloc" because the "alloc_size" attribute implies that the memory contents are uninitialized. Since we're copying contents into the resulting allocation, it must use "realloc_size" to avoid confusing the compiler's optimization passes.) Add KUnit test coverage where possible. (KUnit still does not have the ability to manipulate userspace memory.) Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240502145218.it.729-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
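The annotation discussed above can be approximated in userspace with the compiler's alloc_size attribute. This is a sketch under stated assumptions: my_realloc_size() and dup_bytes() are hypothetical names, the macro only mimics the idea of the kernel's __realloc_size() (size hint without the "fresh, uninitialized memory" implication that a malloc-style attribute would add), and it relies on a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace approximation: tell the compiler that the
 * returned buffer is as large as argument `n`, without also declaring
 * it to be freshly allocated, uninitialized memory.
 */
#define my_realloc_size(n) __attribute__((alloc_size(n)))

/* "dup"-style helper: returns a heap copy of `len` bytes from `src`. */
my_realloc_size(2)
static void *dup_bytes(const void *src, size_t len)
{
	void *p;

	assert(src != NULL);

	/* Never request zero bytes so that a non-NULL result is usable. */
	p = malloc(len ? len : 1);
	if (p)
		memcpy(p, src, len);
	return p;
}
```

A caller would use it like dup_bytes("hello there", 12) and free() the result. Whether the compiler can then report the size through __builtin_dynamic_object_size() depends on the compiler version and optimization level, so no specific value is claimed here.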
2024-05-02vmlinux: Avoid weak reference to notes sectionArd Biesheuvel
Weak references are references that are permitted to remain unsatisfied in the final link. This means they cannot be implemented using place relative relocations, resulting in GOT entries when using position independent code generation. The notes section should always exist, so the weak annotations can be omitted. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-05-02Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Backmerging to get DRM fixes from v6.9-rc6. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-05-02lib/fonts: Allow to select fonts for drm_panicJocelyn Falempe
drm_panic has been introduced recently, and uses the same fonts as FRAMEBUFFER_CONSOLE. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240419132243.154466-1-jfalempe@redhat.com
2024-05-01kunit/fortify: Fix replaced failure path to unbreak __alloc_sizeKees Cook
The __alloc_size annotation for kmemdup() was getting disabled under KUnit testing because the replaced fortify_panic macro implementation was using "return NULL" as a way to survive the sanity checking. But having the chance to return NULL invalidated __alloc_size, so kmemdup was not passing the __builtin_dynamic_object_size() tests any more:

[23:26:18] [PASSED] fortify_test_alloc_size_kmalloc_const
[23:26:19] # fortify_test_alloc_size_kmalloc_dynamic: EXPECTATION FAILED at lib/fortify_kunit.c:265
[23:26:19] Expected __builtin_dynamic_object_size(p, 1) == expected, but
[23:26:19] __builtin_dynamic_object_size(p, 1) == -1 (0xffffffffffffffff)
[23:26:19] expected == 11 (0xb)
[23:26:19] __alloc_size() not working with __bdos on kmemdup("hello there", len, gfp)
[23:26:19] [FAILED] fortify_test_alloc_size_kmalloc_dynamic

Normal builds were not affected: __alloc_size continued to work there. Use a zero-sized allocation instead, which allows __alloc_size to behave. Fixes: 4ce615e798a7 ("fortify: Provide KUnit counters for failure testing") Fixes: fa4a3f86d498 ("fortify: Add KUnit tests for runtime overflows") Link: https://lore.kernel.org/r/20240501232937.work.532-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-01objpool: cache nr_possible_cpus() and avoid caching nr_cpu_idsAndrii Nakryiko
Profiling shows that calling nr_possible_cpus() in objpool_pop() takes a noticeable amount of CPU (when profiled on an 80-core machine), as we need to recalculate the number of set bits in a CPU bit mask. This number can't change, so there is no point in paying the price for recalculating it. As such, cache this value in struct objpool_head and use it in objpool_pop(). On the other hand, cached pool->nr_cpus isn't necessary, as it's not used in the hot path and is also a pretty trivial value to retrieve. So drop pool->nr_cpus in favor of using nr_cpu_ids everywhere. This way the size of struct objpool_head remains the same, which is a nice bonus. The same BPF selftests benchmarks were used to evaluate the effect. Using changes in the previous patch (inlining of objpool_pop/objpool_push) as baseline, here are the differences:

BASELINE
========
kretprobe      :  9.937 ± 0.174M/s
kretprobe-multi: 10.440 ± 0.108M/s

AFTER
=====
kretprobe      : 10.106 ± 0.120M/s (+1.7%)
kretprobe-multi: 10.515 ± 0.180M/s (+0.7%)

Link: https://lore.kernel.org/all/20240424215214.3956041-3-andrii@kernel.org/ Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
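The caching idea above, paying for the bit count once at init time instead of on every pop, might look roughly like this in userspace C. All names here (struct pool_head, count_possible(), pool_init(), pool_scan_limit()) are hypothetical, and a 64-bit integer stands in for the kernel's CPU mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool header; a uint64_t stands in for a CPU bit mask. */
struct pool_head {
	uint64_t possible_mask;
	unsigned int nr_possible_cpus;  /* cached once at init time */
};

/* Count set bits; this is the work we do not want in the hot path. */
static unsigned int count_possible(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;  /* clear the lowest set bit */
		n++;
	}
	return n;
}

static void pool_init(struct pool_head *pool, uint64_t possible_mask)
{
	assert(pool != NULL);
	pool->possible_mask = possible_mask;
	/* The mask cannot change afterwards, so compute the count once. */
	pool->nr_possible_cpus = count_possible(possible_mask);
}

/* Hot path: read the cached value instead of recounting the mask. */
static unsigned int pool_scan_limit(const struct pool_head *pool)
{
	return pool->nr_possible_cpus;
}
```

The trade-off is one extra field in the structure in exchange for removing a per-call bit count, which is why the commit drops another cached field to keep the structure size unchanged.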
2024-05-01objpool: enable inlining objpool_push() and objpool_pop() operationsAndrii Nakryiko
objpool_push() and objpool_pop() are very performance-critical functions and can be called very frequently in the kretprobe triggering path. As such, it makes sense to allow the compiler to inline them completely to eliminate function call overhead. Luckily, their logic is quite well isolated and doesn't have any sprawling dependencies. This patch moves both objpool_push() and objpool_pop() into include/linux/objpool.h and marks them as static inline functions, enabling inlining. To avoid anyone using internal helpers (objpool_try_get_slot, objpool_try_add_slot), rename them to use leading underscores. We used the kretprobe microbenchmark from BPF selftests (bench trig-kprobe and trig-kprobe-multi benchmarks) running no-op BPF kretprobe/kretprobe.multi programs in a tight loop to evaluate the effect. BPF's own overhead in this case is minimal and it mostly stresses the rest of the in-kernel kretprobe infrastructure overhead. Results are in millions of calls per second. This is not super scientific, but shows the trend nevertheless.

BEFORE
======
kretprobe      :  9.794 ± 0.086M/s
kretprobe-multi: 10.219 ± 0.032M/s

AFTER
=====
kretprobe      :  9.937 ± 0.174M/s (+1.5%)
kretprobe-multi: 10.440 ± 0.108M/s (+2.2%)

Link: https://lore.kernel.org/all/20240424215214.3956041-2-andrii@kernel.org/ Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-30kunit/fortify: Add memcpy() testsKees Cook
Add fortify tests for memcpy() and memmove(). This can use a similar method to the fortify_panic() replacement, only we can do it for what was the WARN_ONCE(), which can be redefined. Since this is primarily testing the fortify behaviors of the memcpy() and memmove() defenses, the tests for memcpy() and memmove() are identical. Link: https://lore.kernel.org/r/20240429194342.2421639-3-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30kunit/fortify: Do not spam logs with fortify WARNsKees Cook
When running KUnit fortify tests, we're already doing precise tracking of which warnings are getting hit. Don't fill the logs with WARNs unless we've been explicitly built with DEBUG enabled. Link: https://lore.kernel.org/r/20240429194342.2421639-2-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30kunit/fortify: Rename tests to use recommended conventionsKees Cook
The recommended convention for KUnit tests is ${module}_test_${what}. Adjust the fortify tests to match. Link: https://lore.kernel.org/r/20240429194342.2421639-1-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30dyndbg: fix old BUG_ON in >control parserJim Cromie
Fix a BUG_ON from 2009. Even if it looks "unreachable" (I didn't really look), let's make sure by removing it and doing pr_err and return -EINVAL instead. Cc: stable <stable@kernel.org> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20240429193145.66543-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-29Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-04-29 We've added 147 non-merge commits during the last 32 day(s) which contain a total of 158 files changed, 9400 insertions(+), 2213 deletions(-). The main changes are: 1) Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86 BPF JIT. This allows inlining per-CPU array and hashmap lookups and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko. 2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song. 3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction, from Alexei Starovoitov. 4) Add support for passing mark with bpf_fib_lookup helper, from Anton Protopopov. 5) Add a new bpf_wq API for deferring events and refactor sleepable bpf_timer code to keep common code where possible, from Benjamin Tissoires. 6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs to check when NULL is passed for non-NULLable parameters, from Eduard Zingerman. 7) Harden the BPF verifier's and/or/xor value tracking, from Harishankar Vishwanathan. 8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel crypto subsystem, from Vadim Fedorenko. 9) Various improvements to the BPF instruction set standardization doc, from Dave Thaler. 10) Extend libbpf APIs to partially consume items from the BPF ringbuffer, from Andrea Righi. 11) Bigger batch of BPF selftests refactoring to use common network helpers and to drop duplicate code, from Geliang Tang. 12) Support bpf_tail_call_static() helper for BPF programs with GCC 13, from Jose E. Marchesi. 13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled, from Kumar Kartikeya Dwivedi. 
14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs, from David Vernet. 15) Extend the BPF verifier to allow different input maps for a given bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu. 16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan. 17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison the stack memory, from Martin KaFai Lau. 18) Improve xsk selftest coverage with new tests on maximum and minimum hardware ring size configurations, from Tushar Vyavahare. 19) Various ReST man pages fixes as well as documentation and bash completion improvements for bpftool, from Rameez Rehman & Quentin Monnet. 20) Fix libbpf with regards to dumping subsequent char arrays, from Quentin Deslandes. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits) bpf, docs: Clarify PC use in instruction-set.rst bpf_helpers.h: Define bpf_tail_call_static when building with GCC bpf, docs: Add introduction for use in the ISA Internet Draft selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args selftests/bpf: dummy_st_ops should reject 0 for non-nullable params bpf: check bpf_dummy_struct_ops program params for test runs selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops selftests/bpf: adjust dummy_st_ops_success to detect additional error bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable selftests/bpf: Add ring_buffer__consume_n test. bpf: Add bpf_guard_preempt() convenience macro selftests: bpf: crypto: add benchmark for crypto functions selftests: bpf: crypto skcipher algo selftests bpf: crypto: add skcipher to bpf crypto bpf: make common crypto API for TC/XDP programs bpf: update the comment for BTF_FIELDS_MAX selftests/bpf: Fix wq test. 
selftests/bpf: Use make_sockaddr in test_sock_addr selftests/bpf: Use connect_to_addr in test_sock_addr ... ==================== Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>