summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-28Merge tag 'landlock-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This brings two main changes to Landlock: - A signal scoping fix with a new interface for user space to know if it is compatible with the running kernel. - Audit support to give visibility on why access requests are denied, including the origin of the security policy, missing access rights, and description of object(s). This was designed to limit log spam as much as possible while still alerting about unexpected blocked access. With these changes come new and improved documentation, and a lot of new tests" * tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits) landlock: Add audit documentation selftests/landlock: Add audit tests for network selftests/landlock: Add audit tests for filesystem selftests/landlock: Add audit tests for abstract UNIX socket scoping selftests/landlock: Add audit tests for ptrace selftests/landlock: Test audit with restrict flags selftests/landlock: Add tests for audit flags and domain IDs selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags selftests/landlock: Add test for invalid ruleset file descriptor samples/landlock: Enable users to log sandbox denials landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags landlock: Log scoped denials landlock: Log TCP bind and connect denials landlock: Log truncate and IOCTL denials landlock: Factor out IOCTL hooks landlock: Log file-related denials landlock: Log mount-related denials landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials ...
2025-03-28arm64: Add support for HIP09 Spectre-BHB mitigationJinqian Yang
The HIP09 processor is vulnerable to the Spectre-BHB (Branch History Buffer) attack, which can be exploited to leak information through branch prediction side channels. This commit adds the MIDR of HIP09 to the list for software mitigation. Signed-off-by: Jinqian Yang <yangjinqian1@huawei.com> Link: https://lore.kernel.org/r/20250325141900.2057314-1-yangjinqian1@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-03-28arm64: mm: Drop dead code for pud special bit handlingPeter Xu
Keith Busch observed some incorrect macros defined in arm64 code [1]. It turns out the two lines should never be needed and won't be exposed to anyone, because aarch64 doesn't select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD, hence ARCH_SUPPORTS_PUD_PFNMAP is always N. The only archs that support THP PUDs so far are x86 and powerpc. Instead of fixing the lines (with no way to test it..), remove the two lines that are in reality dead code, to avoid confusing readers. Fixes tag is attached to reflect where the wrong macros were introduced, but explicitly not copying stable, because there's no real issue to be fixed. So it's only about removing the dead code so far. [1] https://lore.kernel.org/all/Z9tDjOk-JdV_fCY4@kbusch-mbp.dhcp.thefacebook.com/#t Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Donald Dutile <ddutile@redhat.com> Cc: Will Deacon <will@kernel.org> Fixes: 3e509c9b03f9 ("mm/arm64: support large pfn mappings") Reported-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Donald Dutile <ddutile@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250320183405.12659-1-peterx@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-03-28arm64: mops: Do not dereference src reg for a set operationKeir Fraser
The source register is not used for SET* and reading it can result in a UBSAN out-of-bounds array access error, specifically when the MOPS exception is taken from a SET* sequence with XZR (reg 31) as the source. Architecturally this is the only case where a src/dst/size field in the ESR can be reported as 31. Prior to 2de451a329cf662b the code in do_el0_mops() was benign as the use of pt_regs_read_reg() prevented the out-of-bounds access. Fixes: 2de451a329cf ("KVM: arm64: Add handler for MOPS exceptions") Cc: <stable@vger.kernel.org> # 6.12.x Cc: Kristina Martsenko <kristina.martsenko@arm.com> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Keir Fraser <keirf@google.com> Reviewed-by: Kristina Martšenko <kristina.martsenko@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250326110448.3792396-1-keirf@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-03-28arm64: mm: Correct the update of max_pfnZhenhua Huang
Hotplugged memory can be smaller than the original memory. For example, on my target: root@genericarmv8:~# cat /sys/kernel/debug/memblock/memory 0: 0x0000000064005000..0x0000000064023fff 0 NOMAP 1: 0x0000000064400000..0x00000000647fffff 0 NOMAP 2: 0x0000000068000000..0x000000006fffffff 0 DRV_MNG 3: 0x0000000088800000..0x0000000094ffefff 0 NONE 4: 0x0000000094fff000..0x0000000094ffffff 0 NOMAP max_pfn will affect read_page_owner. Therefore, it should first compare and then select the larger value for max_pfn. Fixes: 8fac67ca236b ("arm64: mm: update max_pfn after memory hotplug") Cc: <stable@vger.kernel.org> # 6.1.x Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250321070019.1271859-1-quic_zhenhuah@quicinc.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-03-28Merge tag 'caps-pr-20250327' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux Pull capabilities update from Serge Hallyn: "This contains just one patch that removes a helper function whose last user (smack) stopped using it in 2018" * tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux: capability: Remove unused has_capability
2025-03-28Merge tag 'integrity-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull ima updates from Mimi Zohar: "Two performance improvements, which minimize the number of integrity violations" * tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: limit the number of ToMToU integrity violations ima: limit the number of open-writers integrity violations
2025-03-28Merge tag 'ipe-pr-20250324' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe Pull ipe update from Fan Wu: "This contains just one commit from Randy Dunlap, which fixes kernel-doc warnings in the IPE subsystem" * tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe: ipe: policy_fs: fix kernel-doc warnings
2025-03-28Revert "Merge tag 'irq-msi-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip" This reverts commit 36f5f026df6c1cd8a20373adc4388d2b3401ce91, reversing changes made to 43a7eec035a5b64546c8adefdc9cf96a116da14b. Thomas says: "I just noticed that for some incomprehensible reason, probably sheer incompetemce when trying to utilize b4, I managed to merge an outdated _and_ buggy version of that series. Can you please revert that merge completely?" Done. Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-28bcachefs: Fix bch2_fs_get_tree() error pathFlorian Albrechtskirchinger
When a filesystem is mounted read-only, subsequent attempts to mount it as read-write fail with EBUSY. Previously, the error path in bch2_fs_get_tree() would unconditionally call __bch2_fs_stop(), improperly freeing resources for a filesystem that was still actively mounted. This change modifies the error path to only call __bch2_fs_stop() if the superblock has no valid root dentry, ensuring resources are not cleaned up prematurely when the filesystem is in use. Signed-off-by: Florian Albrechtskirchinger <falbrechtskirchinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28dm-ebs: fix prefetch-vs-suspend raceMikulas Patocka
There's a possible race condition in dm-ebs - dm bufio prefetch may be in progress while the device is suspended. Fix this by calling dm_bufio_client_reset in the postsuspend hook. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-03-28dm-verity: fix prefetch-vs-suspend raceMikulas Patocka
There's a possible race condition in dm-verity - the prefetch work item may race with suspend and it is possible that prefetch continues to run while the device is suspended. Fix this by calling flush_workqueue and dm_bufio_client_reset in the postsuspend hook. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-03-28dm-integrity: fix non-constant-time tag verificationJo Van Bulck
When using dm-integrity in standalone mode with a keyed hmac algorithm, integrity tags are calculated and verified internally. Using plain memcmp to compare the stored and computed tags may leak the position of the first byte mismatch through side-channel analysis, allowing to brute-force expected tags in linear time (e.g., by counting single-stepping interrupts in confidential virtual machine environments). Co-developed-by: Luca Wilke <work@luca-wilke.com> Signed-off-by: Luca Wilke <work@luca-wilke.com> Signed-off-by: Jo Van Bulck <jo.vanbulck@cs.kuleuven.be> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org
2025-03-28bcachefs: fix logging in journal_entry_err_msg()Kent Overstreet
We want to log errors all at once, not spread across multiple printks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: add missing newline in bch2_trans_updates_to_text()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: print_string_as_lines: fix extra newlineKent Overstreet
Don't print a newline on empty string; this was causing us to also print an extra newline when we got to the end of th string. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Fix WARN() in bch2_bkey_pick_read_device()Kent Overstreet
syzbot discovered that this one is possible: we have pointers, but none of them are to valid devices. Reported-by: syzbot+336a6e6a2dbb7d4dba9a@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Don't return 0 size holes from bch2_seek_hole()Kent Overstreet
The hole we find in the btree might be fully dirty in the page cache. If so, keep searching. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Fix bch2_seek_hole() lockingKent Overstreet
We can't call bch2_seek_pagecache_hole(), and block on page locks, with btree locks held. This is easily fixed because we're at the end of the transaction - we can just unlock, we don't need a drop_locks_do(). Reported-by: https://github.com/nagalun Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Recovery no longer holds state_lockKent Overstreet
state_lock guards against devices coming or leaving, changing state, or the filesystem changing between ro <-> rw. But it's not necessary for running recovery passes, and holding it blocks asynchronous events that would cause us to go RO or kick out devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Fix permissions on version modparamKent Overstreet
There's no reason for this not to be world readable - it provides the currently supported on disk format version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28iommufd: Test attach before detaching pasidYi Liu
Check if the pasid has been attached before going further in the detach path. This fixes a crash found by syzkaller. Add a selftest as well. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 UID: 0 PID: 668 Comm: repro Not tainted 6.14.0-next-20250325-eb4bc4b07f66 #1 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org4 RIP: 0010:iommufd_hw_pagetable_detach+0x8a/0x4d0 Code: 00 00 00 44 89 ee 48 89 c7 48 89 75 c8 48 89 45 c0 e8 ca 55 17 02 48 89 c2 49 89 c4 48 b8 00 00 00b RSP: 0018:ffff888021b17b78 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff888014b5a000 RCX: ffff888021b17a64 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88801dad07fc RBP: ffff888021b17bc8 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: ffff88801dad0e58 R12: 0000000000000000 R13: 0000000000000001 R14: ffff888021b17e18 R15: ffff8880132d3008 FS: 00007fca52013600(0000) GS:ffff8880e3684000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200006c0 CR3: 00000000112d0005 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> iommufd_device_detach+0x2a/0x2e0 iommufd_test+0x2f99/0x5cd0 iommufd_fops_ioctl+0x38e/0x520 __x64_sys_ioctl+0x1ba/0x220 x64_sys_call+0x122e/0x2150 do_syscall_64+0x6d/0x150 entry_SYSCALL_64_after_hwframe+0x76/0x7e Link: https://patch.msgid.link/r/20250328133448.22052-1-yi.l.liu@intel.com Reported-by: Lai Yi <yi1.lai@linux.intel.com> Closes: https://lore.kernel.org/linux-iommu/Z+X0tzxhiaupJT7b@ly-workstation Fixes: c0e301b2978d ("iommufd/device: Add pasid_attach array to track per-PASID attach") Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28MAINTAINERS: Update the MODULE SUPPORT sectionPetr Pavlu
Change my role for MODULE SUPPORT from a reviewer to a maintainer. We started to rotate its maintainership and I currently look after the modules tree. This not being reflected in MAINTAINERS proved to confuse folks. Add lib/tests/module/ and tools/testing/selftests/module/ to maintained files. They were introduced previously by commit 84b4a51fce4c ("selftests: add new kallsyms selftests"). Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20250306162117.18876-1-petr.pavlu@suse.com Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-28objtool, drm/vmwgfx: Don't ignore vmw_send_msg() for ORCJosh Poimboeuf
The following commit: 0b0d81e3b733 ("objtool, drm/vmwgfx: Fix "duplicate frame pointer save" warning") ... marked vmw_send_msg() STACK_FRAME_NON_STANDARD because it uses RBP in a non-standard way which violates frame pointer convention. That issue only affects the frame pointer unwinder. Remove the annotation for ORC. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/eff3102a7eeb77b4420fcb5e9d9cd9dd81d4514a.1743136205.git.jpoimboe@kernel.org
2025-03-28objtool: Fix STACK_FRAME_NON_STANDARD for cold subfunctionsJosh Poimboeuf
The recent STACK_FRAME_NON_STANDARD refactoring forgot about .cold subfunctions. They must also be ignored. Fixes the following warning: drivers/gpu/drm/vmwgfx/vmwgfx_msg.o: warning: objtool: vmw_recv_msg.cold+0x0: unreachable instruction Fixes: c84301d706c5 ("objtool: Ignore entire functions rather than instructions") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/70a09ec0b0704398b2bbfb3153ce3d7cb8a381be.1743136205.git.jpoimboe@kernel.org
2025-03-28objtool: Fix segfault in ignore_unreachable_insn()Josh Poimboeuf
Check 'prev_insn' before dereferencing it. Fixes: bd841d6154f5 ("objtool: Fix CONFIG_UBSAN_TRAP unreachable warnings") Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/5df4ff89c9e4b9e788b77b0531234ffa7ba03e9e.1743136205.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/d86b4cc6-0b97-4095-8793-a7384410b8ab@app.fastmail.com Closes: https://lore.kernel.org/Z-V_rruKY0-36pqA@gmail.com
2025-03-28objtool: Fix NULL printf() '%s' argument in builtin-check.c:save_argv()Josh Poimboeuf
It's probably not the best idea to pass a string pointer to printf() right after confirming said pointer is NULL. Fix the typo and use argv[i] instead. Fixes: c5995abe1547 ("objtool: Improve error handling") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/a814ed8b08fb410be29498a20a5fbbb26e907ecf.1742952512.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/20250326103854.309e3c60@canb.auug.org.au
2025-03-28objtool, lkdtm: Obfuscate the do_nothing() pointerJosh Poimboeuf
If execute_location()'s memcpy of do_nothing() gets inlined and unrolled by the compiler, it copies one word at a time: mov 0x0(%rip),%rax R_X86_64_PC32 .text+0x1374 mov %rax,0x38(%rbx) mov 0x0(%rip),%rax R_X86_64_PC32 .text+0x136c mov %rax,0x30(%rbx) ... Those .text references point to the middle of the function, causing objtool to complain about their lack of ENDBR. Prevent that by resolving the function pointer at runtime rather than build time. This fixes the following warning: drivers/misc/lkdtm/lkdtm.o: warning: objtool: execute_location+0x23: relocation to !ENDBR: .text+0x1378 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/30b9abffbddeb43c4f6320b1270fa9b4d74c54ed.1742852847.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/oe-kbuild-all/202503191453.uFfxQy5R-lkp@intel.com/
2025-03-28iommufd: Fix iommu_vevent_header tables markupBagas Sanjaya
Stephen Rothwell reports htmldocs warnings on iommufd_vevent_header tables: Documentation/userspace-api/iommufd:323: ./include/uapi/linux/iommufd.h:1048: CRITICAL: Unexpected section title or transition. ------------------------------------------------------------------------- [docutils] WARNING: kernel-doc './scripts/kernel-doc -rst -enable-lineno -sphinx-version 8.1.3 ./include/uapi/linux/iommufd.h' processing failed with: Documentation/userspace-api/iommufd:323: ./include/uapi/linux/iommufd.h:1048: (SEVERE/4) Unexpected section title or transition. ------------------------------------------------------------------------- These are because Sphinx confuses the tables for section headings. Fix the table markup to squash away above warnings. Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC") Link: https://patch.msgid.link/r/20250328114654.55840-1-bagasdotme@gmail.com Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250318213359.5dc56fd1@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28iommu: Convert unreachable() to BUG()Josh Poimboeuf
Bare unreachable() should be avoided as it generates undefined behavior, e.g. falling through to the next function. Use BUG() instead so the error is defined. Fixes the following warnings: drivers/iommu/dma-iommu.o: warning: objtool: iommu_dma_sw_msi+0x92: can't find jump dest instruction at .text+0x54d5 vmlinux.o: warning: objtool: iommu_dma_get_msi_page() falls through to next function __iommu_dma_unmap() Link: https://patch.msgid.link/r/0c801ae017ec078cacd39f8f0898fc7780535f85.1743053325.git.jpoimboe@kernel.org Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/314f8809-cd59-479b-97d7-49356bf1c8d1@infradead.org Reported-by: Paul E. McKenney <paulmck@kernel.org> Closes: https://lore.kernel.org/5dd1f35e-8ece-43b7-ad6d-86d02d2718f6@paulmck-laptop Fixes: 6aa63a4ec947 ("iommu: Sort out domain user data") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28iommufd: Balance veventq->num_events inc/decYi Liu
iommufd_veventq_fops_read() decrements veventq->num_events when a vevent is read out. However, the report path ony increments veventq->num_events for normal events. To be balanced, make the read path decrement num_events only for normal vevents. Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC") Link: https://patch.msgid.link/r/20250324120034.5940-3-yi.l.liu@intel.com Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28iommufd: Initialize the flags of vevent in iommufd_viommu_report_event()Yi Liu
The vevent->header.flags is not initialized per allocation, hence the vevent read path may treat the vevent as lost_events_header wrongly. Use kzalloc() to alloc memory for new vevent. Fixes: e8e1ef9b77a7 ("iommufd/viommu: Add iommufd_viommu_report_event helper") Link: https://patch.msgid.link/r/20250324120034.5940-2-yi.l.liu@intel.com Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFOYi Liu
IOMMU_HW_INFO is extended to report max_pasid_log2, hence add coverage for it. Link: https://patch.msgid.link/r/20250321180143.8468-6-yi.l.liu@intel.com Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28iommufd: Extend IOMMU_GET_HW_INFO to report PASID capabilityYi Liu
PASID usage requires PASID support in both device and IOMMU. Since the iommu drivers always enable the PASID capability for the device if it is supported, this extends the IOMMU_GET_HW_INFO to report the PASID capability to userspace. Also, enhances the selftest accordingly. Link: https://patch.msgid.link/r/20250321180143.8468-5-yi.l.liu@intel.com Cc: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> #aarch64 platform Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28tracing: Use _text and the kernel offset in last_boot_infoSteven Rostedt
Instead of using kaslr_offset() just record the location of "_text". This makes it possible for user space to use either the System.map or /proc/kallsyms as what to map all addresses to functions with. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250326220304.38dbedcd@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Show last module text symbols in the stacktraceMasami Hiramatsu (Google)
Since the previous boot trace buffer can include module text address in the stacktrace. As same as the kernel text address, convert the module text address using the module address information. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/174282689201.356346.17647540360450727687.stgit@mhiramat.tok.corp.google.com Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28ring-buffer: Remove the unused variable bmetaJiapeng Chong
Variable bmeta is not effectively used, so delete it. kernel/trace/ring_buffer.c:1952:27: warning: variable ‘bmeta’ set but not used. Link: https://lore.kernel.org/20250317015524.3902-1-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=19524 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Skip update_last_data() if cleared and remove active check for ↵Masami Hiramatsu (Google)
save_mod() If the last boot data is already cleared, there is no reason to update it again. Skip if the TRACE_ARRAY_FL_LAST_BOOT is cleared. Also, for calling save_mod() when module loading, we don't need to check the trace is active or not because any module address can be on the stacktrace. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/174165660328.1173316.15529357882704817499.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Initialize scratch_size to zero to prevent UBSteven Rostedt
In allocate_trace_buffer() the following code: buf->buffer = ring_buffer_alloc_range(size, rb_flags, 0, tr->range_addr_start, tr->range_addr_size, struct_size(tscratch, entries, 128)); tscratch = ring_buffer_meta_scratch(buf->buffer, &scratch_size); setup_trace_scratch(tr, tscratch, scratch_size); Has undefined behavior if ring_buffer_alloc_range() fails because "scratch_size" is not initialize. If the allocation fails, then buf->buffer will be NULL. The ring_buffer_meta_scratch() will return NULL immediately if it is passed a NULL buffer and it will not update scratch_size. Then setup_trace_scratch() will return immediately if tscratch is NULL. Although there's no real issue here, but it is considered undefined behavior to pass an uninitialized variable to a function as input, and UBSan may complain about it. Just initialize scratch_size to zero to make the code defined behavior and a little more robust. Link: https://lore.kernel.org/all/44c5deaa-b094-4852-90f9-52f3fb10e67a@stanley.mountain/ Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Fix a compilation error without CONFIG_MODULESMasami Hiramatsu (Google)
There are some code which depends on CONFIG_MODULES. #ifdef to enclose it. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/174230515367.2909896.8132122175220657625.stgit@mhiramat.tok.corp.google.com Fixes: dca91c1c5468 ("tracing: Have persistent trace instances save module addresses") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Freeable reserved ring bufferMasami Hiramatsu (Google)
Make the ring buffer on reserved memory to be freeable. This allows us to free the trace instance on the reserved memory without changing cmdline and rebooting. Even if we can not change the kernel cmdline for security reason, we can release the reserved memory for the ring buffer as free (available) memory. For example, boot kernel with reserved memory; "reserve_mem=20M:2M:trace trace_instance=boot_mapped^traceoff@trace" ~ # free total used free shared buff/cache available Mem: 1995548 50544 1927568 14964 17436 1911480 Swap: 0 0 0 ~ # rmdir /sys/kernel/tracing/instances/boot_mapped/ [ 23.704023] Freeing reserve_mem:trace memory: 20476K ~ # free total used free shared buff/cache available Mem: 2016024 41844 1956740 14968 17440 1940572 Swap: 0 0 0 Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/173989134814.230693.18199312930337815629.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28mm/memblock: Add reserved memory release functionMasami Hiramatsu (Google)
Add reserve_mem_release_by_name() to release a reserved memory region with a given name. This allows us to release reserved memory which is defined by kernel cmdline, after boot. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: linux-mm@kvack.org Link: https://lore.kernel.org/173989133862.230693.14094993331347437600.stgit@devnote2 Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Update modules to persistent instances when loadedSteven Rostedt
When a module is loaded and a persistent buffer is actively tracing, add it to the list of modules in the persistent memory. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164609.469844721@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Show module names and addresses of last bootSteven Rostedt
Add the last boot module's names and addresses to the last_boot_info file. This only shows the module information from a previous boot. If the buffer is started and is recording the current boot, this file still will only show "current". ~# cat instances/boot_mapped/last_boot_info 10c00000 [kernel] ffffffffc00ca000 usb_serial_simple ffffffffc00ae000 usbserial ffffffffc008b000 bfq ~# echo function > instances/boot_mapped/current_tracer ~# cat instances/boot_mapped/last_boot_info # Current Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164609.299186021@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Have persistent trace instances save module addressesSteven Rostedt
For trace instances that are mapped to persistent memory, have them use the scratch area to save the currently loaded modules. This will allow where the modules have been loaded on the next boot so that their addresses can be deciphered by using where they were loaded previously. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164609.129741650@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28module: Add module_for_each_mod() functionSteven Rostedt
The tracing system needs a way to save all the currently loaded modules and their addresses into persistent memory so that it can evaluate the addresses on a reboot from a crash. When the persistent memory trace starts, it will load the module addresses and names into the persistent memory. To do so, it will call the module_for_each_mod() function and pass it a function and data structure to get called on each loaded module. Then it can record the memory. This only implements that function. Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: linux-modules@vger.kernel.org Link: https://lore.kernel.org/20250305164608.962615966@goodmis.org Acked-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Have persistent trace instances save KASLR offsetSteven Rostedt
There's no reason to save the KASLR offset for the ring buffer itself. That is used by the tracer. Now that the tracer has a way to save data in the persistent memory of the ring buffer, have the tracing infrastructure take care of the saving of the KASLR offset. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.792722274@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28ring-buffer: Add ring_buffer_meta_scratch()Steven Rostedt
Now that there's one meta data at the start of the persistent memory used by the ring buffer, allow the caller to request some memory right after that data that it can use as its own persistent memory. Also fix some white space issues with ring_buffer_alloc(). Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.619631731@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28ring-buffer: Add buffer meta data for persistent ring bufferSteven Rostedt
Instead of just having a meta data at the first page of each sub buffer that has duplicate data, add a new meta page to the entire block of memory that holds the duplicate data and remove it from the sub buffer meta data. This will open up the extra memory in this first page to be used by the tracer for its own persistent data. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.446351513@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28ring-buffer: Use kaslr address instead of text deltaSteven Rostedt
Instead of saving off the text and data pointers and using them to compare with the current boot's text and data pointers, just save off the KASLR offset. Then that can be used to figure out how to read the previous boots buffer. The last_boot_info will now show this offset, but only if it is for a previous boot: ~# cat instances/boot_mapped/last_boot_info 39000000 [kernel] ~# echo function > instances/boot_mapped/current_tracer ~# cat instances/boot_mapped/last_boot_info # Current If the KASLR offset saved is for the current boot, the last_boot_info will show the value of "current". Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.274956504@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>