summaryrefslogtreecommitdiff
path: root/kernel/trace/trace.c
AgeCommit message (Collapse)Author
2025-04-07tracing: Hide get_vm_area() from MMUless buildsSteven Rostedt
The function get_vm_area() is not defined for non-MMU builds and causes a build error if it is used. Hide the map_pages() function around a: #ifdef CONFIG_MMU to keep it from being compiled when CONFIG_MMU is not set. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250407120111.2ccc9319@gandalf.local.home Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/4f8ece8b-8862-4f7c-8ede-febd28f8a9fe@roeck-us.net/ Fixes: 394f3f02de531 ("tracing: Use vmap_page_range() to map memmap ring buffer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-03Merge tag 'trace-ringbuffer-v6.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer updates from Steven Rostedt: "Persistent buffer cleanups and simplifications. It was mistaken that the physical memory returned from "reserve_mem" had to be vmap()'d to get to it from a virtual address. But reserve_mem already maps the memory to the virtual address of the kernel so a simple phys_to_virt() can be used to get to the virtual address from the physical memory returned by "reserve_mem". With this new found knowledge, the code can be cleaned up and simplified. - Enforce that the persistent memory is page aligned As the buffers using the persistent memory are all going to be mapped via pages, make sure that the memory given to the tracing infrastructure is page aligned. If it is not, it will print a warning and fail to map the buffer. - Use phys_to_virt() to get the virtual address from reserve_mem Instead of calling vmap() on the physical memory returned from "reserve_mem", use phys_to_virt() instead. As the memory returned by "memmap" or any other means where a physical address is given to the tracing infrastructure, it still needs to be vmap(). Since this memory can never be returned back to the buddy allocator nor should it ever be memmory mapped to user space, flag this buffer and up the ref count. The ref count will keep it from ever being freed, and the flag will prevent it from ever being memory mapped to user space. - Use vmap_page_range() for memmap virtual address mapping For the memmap buffer, instead of allocating an array of struct pages, assigning them to the contiguous phsycial memory and then passing that to vmap(), use vmap_page_range() instead - Replace flush_dcache_folio() with flush_kernel_vmap_range() Instead of calling virt_to_folio() and passing that to flush_dcache_folio(), just call flush_kernel_vmap_range() directly. This also fixes a bug where if a subbuffer was bigger than PAGE_SIZE only the PAGE_SIZE portion would be flushed" * tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio() tracing: Use vmap_page_range() to map memmap ring buffer tracing: Have reserve_mem use phys_to_virt() and separate from memmap buffer tracing: Enforce the persistent ring buffer to be page aligned
2025-04-03Merge tag 'trace-v6.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix build error when CONFIG_PROBE_EVENTS_BTF_ARGS is not enabled The tracing of arguments in the function tracer depends on some functions that are only defined when PROBE_EVENTS_BTF_ARGS is enabled. In fact, PROBE_EVENTS_BTF_ARGS also depends on all the same configs as the function argument tracing requires. Just have the function argument tracing depend on PROBE_EVENTS_BTF_ARGS. - Free module_delta for persistent ring buffer instance When an instance holds the persistent ring buffer, it allocates a helper array to hold the deltas between where modules are loaded on the last boot and the current boot. This array needs to be freed when the instance is freed. - Add cond_resched() to loop in ftrace_graph_set_hash() The hash functions in ftrace loop over every function that can be enabled by ftrace. This can be 50,000 functions or more. This loop is known to trigger soft lockup warnings and requires a cond_resched(). The loop in ftrace_graph_set_hash() was missing it. - Fix the event format verifier to include "%*p.." arguments To prevent events from dereferencing stale pointers that can happen if a trace event uses a dereferece pointer to something that was not copied into the ring buffer and can be freed by the time the trace is read, a verifier is called. At boot or module load, the verifier scans the print format string for pointers that can be dereferenced and it checks the arguments to make sure they do not contain something that can be freed. The "%*p" was not handled, which would add another argument and cause the verifier to not only not verify this pointer, but it will look at the wrong argument for every pointer after that. - Fix mcount sorttable building for different endian type target When modifying the ELF file to sort the mcount_loc table in the sorttable.c code, the endianess of the file and the host is used to determine if the bytes need to be swapped when calculations are done. A change was made to the sorting of the mcount_loc that read the values from the ELF file into an array and the swap happened on the filling of the array. But one of the calculations of the array still did the swap when it did not need to. This caused building on a little endian machine for a big endian target to not find the mcount function in the 'nm' table and it zeroed it out, causing there to be no functions available to trace. - Add goto out_unlock jump to rv_register_monitor() on error path One of the error paths in rv_register_monitor() just returned the error when it should have jumped to the out_unlock label to release the mutex. * tag 'trace-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Fix missing unlock on double nested monitors return path scripts/sorttable: Fix endianness handling in build-time mcount sort tracing: Verify event formats that have "%*p.." ftrace: Add cond_resched() to ftrace_graph_set_hash() tracing: Free module_delta on freeing of persistent ring buffer ftrace: Have tracing function args depend on PROBE_EVENTS_BTF_ARGS
2025-04-02Merge tag 'printk-for-6.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull more printk updates from Petr Mladek: - Silence warnings about candidates for ‘gnu_print’ format attribute * tag 'printk-for-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: vsnprintf: Silence false positive GCC warning for va_format() vsnprintf: Drop unused const char fmt * in va_format() vsnprintf: Mark binary printing functions with __printf() attribute tracing: Mark binary printing functions with __printf() attribute seq_file: Mark binary printing functions with __printf() attribute seq_buf: Mark binary printing functions with __printf() attribute
2025-04-02tracing: Use vmap_page_range() to map memmap ring bufferSteven Rostedt
The code to map the physical memory retrieved by memmap currently allocates an array of pages to cover the physical memory and then calls vmap() to map it to a virtual address. Instead of using this temporary array of struct page descriptors, simply use vmap_page_range() that can directly map the contiguous physical memory to a virtual address. Link: https://lore.kernel.org/all/CAHk-=whUOfVucfJRt7E0AH+GV41ELmS4wJqxHDnui6Giddfkzw@mail.gmail.com/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/20250402144953.754618481@goodmis.org Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-02tracing: Have reserve_mem use phys_to_virt() and separate from memmap bufferSteven Rostedt
The reserve_mem kernel command line option may pass back a physical address, but the memory is still part of the normal memory just like using memblock_alloc() would be. This means that the physical memory returned by the reserve_mem command line option can be converted directly to virtual memory by simply using phys_to_virt(). When freeing the buffer there's no need to call vunmap() anymore as the memory allocated by reserve_mem is freed by the call to reserve_mem_release_by_name(). Because the persistent ring buffer can also be allocated via the memmap option, which *is* different than normal memory as it cannot be added back to the buddy system, it must be treated differently. It still needs to be virtually mapped to have access to it. It also can not be freed nor can it ever be memory mapped to user space. Create a new trace_array flag called TRACE_ARRAY_FL_MEMMAP which gets set if the buffer is created by the memmap option, and this will prevent the buffer from being memory mapped by user space. Also increment the ref count for memmap'ed buffers so that they can never be freed. Link: https://lore.kernel.org/all/Z-wFszhJ_9o4dc8O@kernel.org/ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/20250402144953.583750106@goodmis.org Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-02tracing: Enforce the persistent ring buffer to be page alignedSteven Rostedt
Enforce that the address and the size of the memory used by the persistent ring buffer is page aligned. Also update the documentation to reflect this requirement. Link: https://lore.kernel.org/all/CAHk-=whUOfVucfJRt7E0AH+GV41ELmS4wJqxHDnui6Giddfkzw@mail.gmail.com/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/20250402144953.412882844@goodmis.org Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-02tracing: Free module_delta on freeing of persistent ring bufferSteven Rostedt
If a persistent ring buffer is created, a "module_delta" array is also allocated to hold the module deltas of loaded modules that match modules in the scratch area. If this buffer gets freed, the module_delta array is not freed and causes a memory leak. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250401124525.1f9ac02a@gandalf.local.home Fixes: 35a380ddbc65 ("tracing: Show last module text symbols in the stacktrace") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-31Merge tag 'trace-ringbuffer-v6.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer updates from Steven Rostedt: - Restructure the persistent memory to have a "scratch" area Instead of hard coding the KASLR offset in the persistent memory by the ring buffer, push that work up to the callers of the persistent memory as they are the ones that need this information. The offsets and such is not important to the ring buffer logic and it should not be part of that. A scratch pad is now created when the caller allocates a ring buffer from persistent memory by stating how much memory it needs to save. - Allow where modules are loaded to be saved in the new scratch pad Save the addresses of modules when they are loaded into the persistent memory scratch pad. - A new module_for_each_mod() helper function was created With the acknowledgement of the module maintainers a new module helper function was created to iterate over all the currently loaded modules. This has a callback to be called for each module. This is needed for when tracing is started in the persistent buffer and the currently loaded modules need to be saved in the scratch area. - Expose the last boot information where the kernel and modules were loaded The last_boot_info file is updated to print out the addresses of where the kernel "_text" location was loaded from a previous boot, as well as where the modules are loaded. If the buffer is recording the current boot, it only prints "# Current" so that it does not expose the KASLR offset of the currently running kernel. - Allow the persistent ring buffer to be released (freed) To have this in production environments, where the kernel command line can not be changed easily, the ring buffer needs to be freed when it is not going to be used. The memory for the buffer will always be allocated at boot up, but if the system isn't going to enable tracing, the memory needs to be freed. Allow it to be freed and added back to the kernel memory pool. - Allow stack traces to print the function names in the persistent buffer Now that the modules are saved in the persistent ring buffer, if the same modules are loaded, the printing of the function names will examine the saved modules. If the module is found in the scratch area and is also loaded, then it will do the offset shift and use kallsyms to display the function name. If the address is not found, it simply displays the address from the previous boot in hex. * tag 'trace-ringbuffer-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Use _text and the kernel offset in last_boot_info tracing: Show last module text symbols in the stacktrace ring-buffer: Remove the unused variable bmeta tracing: Skip update_last_data() if cleared and remove active check for save_mod() tracing: Initialize scratch_size to zero to prevent UB tracing: Fix a compilation error without CONFIG_MODULES tracing: Freeable reserved ring buffer mm/memblock: Add reserved memory release function tracing: Update modules to persistent instances when loaded tracing: Show module names and addresses of last boot tracing: Have persistent trace instances save module addresses module: Add module_for_each_mod() function tracing: Have persistent trace instances save KASLR offset ring-buffer: Add ring_buffer_meta_scratch() ring-buffer: Add buffer meta data for persistent ring buffer ring-buffer: Use kaslr address instead of text delta ring-buffer: Fix bytes_dropped calculation issue
2025-03-28tracing: Use _text and the kernel offset in last_boot_infoSteven Rostedt
Instead of using kaslr_offset() just record the location of "_text". This makes it possible for user space to use either the System.map or /proc/kallsyms as what to map all addresses to functions with. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250326220304.38dbedcd@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Show last module text symbols in the stacktraceMasami Hiramatsu (Google)
Since the previous boot trace buffer can include module text address in the stacktrace. As same as the kernel text address, convert the module text address using the module address information. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/174282689201.356346.17647540360450727687.stgit@mhiramat.tok.corp.google.com Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Skip update_last_data() if cleared and remove active check for ↵Masami Hiramatsu (Google)
save_mod() If the last boot data is already cleared, there is no reason to update it again. Skip if the TRACE_ARRAY_FL_LAST_BOOT is cleared. Also, for calling save_mod() when module loading, we don't need to check the trace is active or not because any module address can be on the stacktrace. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/174165660328.1173316.15529357882704817499.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Initialize scratch_size to zero to prevent UBSteven Rostedt
In allocate_trace_buffer() the following code: buf->buffer = ring_buffer_alloc_range(size, rb_flags, 0, tr->range_addr_start, tr->range_addr_size, struct_size(tscratch, entries, 128)); tscratch = ring_buffer_meta_scratch(buf->buffer, &scratch_size); setup_trace_scratch(tr, tscratch, scratch_size); Has undefined behavior if ring_buffer_alloc_range() fails because "scratch_size" is not initialize. If the allocation fails, then buf->buffer will be NULL. The ring_buffer_meta_scratch() will return NULL immediately if it is passed a NULL buffer and it will not update scratch_size. Then setup_trace_scratch() will return immediately if tscratch is NULL. Although there's no real issue here, but it is considered undefined behavior to pass an uninitialized variable to a function as input, and UBSan may complain about it. Just initialize scratch_size to zero to make the code defined behavior and a little more robust. Link: https://lore.kernel.org/all/44c5deaa-b094-4852-90f9-52f3fb10e67a@stanley.mountain/ Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Fix a compilation error without CONFIG_MODULESMasami Hiramatsu (Google)
There are some code which depends on CONFIG_MODULES. #ifdef to enclose it. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/174230515367.2909896.8132122175220657625.stgit@mhiramat.tok.corp.google.com Fixes: dca91c1c5468 ("tracing: Have persistent trace instances save module addresses") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Freeable reserved ring bufferMasami Hiramatsu (Google)
Make the ring buffer on reserved memory to be freeable. This allows us to free the trace instance on the reserved memory without changing cmdline and rebooting. Even if we can not change the kernel cmdline for security reason, we can release the reserved memory for the ring buffer as free (available) memory. For example, boot kernel with reserved memory; "reserve_mem=20M:2M:trace trace_instance=boot_mapped^traceoff@trace" ~ # free total used free shared buff/cache available Mem: 1995548 50544 1927568 14964 17436 1911480 Swap: 0 0 0 ~ # rmdir /sys/kernel/tracing/instances/boot_mapped/ [ 23.704023] Freeing reserve_mem:trace memory: 20476K ~ # free total used free shared buff/cache available Mem: 2016024 41844 1956740 14968 17440 1940572 Swap: 0 0 0 Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/173989134814.230693.18199312930337815629.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Update modules to persistent instances when loadedSteven Rostedt
When a module is loaded and a persistent buffer is actively tracing, add it to the list of modules in the persistent memory. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164609.469844721@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Show module names and addresses of last bootSteven Rostedt
Add the last boot module's names and addresses to the last_boot_info file. This only shows the module information from a previous boot. If the buffer is started and is recording the current boot, this file still will only show "current". ~# cat instances/boot_mapped/last_boot_info 10c00000 [kernel] ffffffffc00ca000 usb_serial_simple ffffffffc00ae000 usbserial ffffffffc008b000 bfq ~# echo function > instances/boot_mapped/current_tracer ~# cat instances/boot_mapped/last_boot_info # Current Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164609.299186021@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Have persistent trace instances save module addressesSteven Rostedt
For trace instances that are mapped to persistent memory, have them use the scratch area to save the currently loaded modules. This will allow where the modules have been loaded on the next boot so that their addresses can be deciphered by using where they were loaded previously. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164609.129741650@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Have persistent trace instances save KASLR offsetSteven Rostedt
There's no reason to save the KASLR offset for the ring buffer itself. That is used by the tracer. Now that the tracer has a way to save data in the persistent memory of the ring buffer, have the tracing infrastructure take care of the saving of the KASLR offset. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.792722274@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28ring-buffer: Add ring_buffer_meta_scratch()Steven Rostedt
Now that there's one meta data at the start of the persistent memory used by the ring buffer, allow the caller to request some memory right after that data that it can use as its own persistent memory. Also fix some white space issues with ring_buffer_alloc(). Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.619631731@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28ring-buffer: Use kaslr address instead of text deltaSteven Rostedt
Instead of saving off the text and data pointers and using them to compare with the current boot's text and data pointers, just save off the KASLR offset. Then that can be used to figure out how to read the previous boots buffer. The last_boot_info will now show this offset, but only if it is for a previous boot: ~# cat instances/boot_mapped/last_boot_info 39000000 [kernel] ~# echo function > instances/boot_mapped/current_tracer ~# cat instances/boot_mapped/last_boot_info # Current If the KASLR offset saved is for the current boot, the last_boot_info will show the value of "current". Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250305164608.274956504@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-28tracing: Mark binary printing functions with __printf() attributeAndy Shevchenko
Binary printing functions are using printf() type of format, and compiler is not happy about them as is: kernel/trace/trace.c:3292:9: error: function ‘trace_vbprintk’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format] kernel/trace/trace_seq.c:182:9: error: function ‘trace_seq_bprintf’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format] Fix the compilation errors by adding __printf() attribute. While at it, move existing __printf() attributes from the implementations to the declarations. IT also fixes incorrect attribute parameters that are used for trace_array_printk(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250321144822.324050-4-andriy.shevchenko@linux.intel.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-03-27Merge tag 'trace-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Add option traceoff_after_boot In order to debug kernel boot, it sometimes is helpful to enable tracing via the kernel command line. Unfortunately, by the time the login prompt appears, the trace is overwritten by the init process and other user space start up applications. Adding a "traceoff_after_boot" will disable tracing when the kernel passes control to init which will allow developers to be able to see the traces that occurred during boot. - Clean up the mmflags macros that display the GFP flags in trace events The macros to print the GFP flags for trace events had a bit of duplication. The code was restructured to remove duplication and in the process it also adds some flags that were missed before. - Removed some dead code and scripts/draw_functrace.py draw_functrace.py hasn't worked in years and as nobody complained about it, remove it. - Constify struct event_trigger_ops The event_trigger_ops is just a structure that has function pointers that are assigned when the variables are created. These variables should all be constants. - Other minor clean ups and fixes * tag 'trace-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Replace strncpy with memcpy for fixed-length substring copy tracing: Fix synth event printk format for str fields tracing: Do not use PERF enums when perf is not defined tracing: Ensure module defining synth event cannot be unloaded while tracing tracing: fix return value in __ftrace_event_enable_disable for TRACE_REG_UNREGISTER tracing/osnoise: Fix possible recursive locking for cpus_read_lock() tracing: Align synth event print fmt tracing: gfp: vsprintf: Do not print "none" when using %pGg printf format tracepoint: Print the function symbol when tracepoint_debug is set tracing: Constify struct event_trigger_ops scripts/tracing: Remove scripts/tracing/draw_functrace.py tracing: Update MAINTAINERS file to include tracepoint.c tracing/user_events: Slightly simplify user_seq_show() tracing/user_events: Don't use %pK through printk tracing: gfp: Remove duplication of recording GFP flags tracing: Remove orphaned event_trace_printk ring-buffer: Fix typo in comment about header page pointer tracing: Add traceoff_after_boot option
2025-03-27Merge tag 'ftrace-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: - Record function parameters for function and function graph tracers An option has been added to function tracer (func-args) and the function graph tracer (funcgraph-args) that when set, the tracers will record the registers that hold the arguments into each function event. On reading of the trace, it will use BTF to print those arguments. Most archs support up to 6 arguments (depending on the complexity of the arguments) and those are printed. If a function has more arguments then what was recorded, the output will end with " ... )". Example of function graph tracer: 6) | dummy_xmit [dummy](skb = 0x8887c100, dev = 0x872ca000) { 6) | consume_skb(skb = 0x8887c100) { 6) | skb_release_head_state(skb = 0x8887c100) { 6) 0.178 us | sock_wfree(skb = 0x8887c100) 6) 0.627 us | } - The rest of the changes are minor clean ups and fixes * tag 'ftrace-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Use hashtable.h for event_hash tracing: Fix use-after-free in print_graph_function_flags during tracer switching function_graph: Remove the unused variable func ftrace: Add arguments to function tracer ftrace: Have funcgraph-args take affect during tracing ftrace: Add support for function argument to graph tracer ftrace: Add print_function_args() ftrace: Have ftrace_free_filter() WARN and exit if ops is active fgraph: Correct typo in ftrace_return_to_handler comment
2025-03-17tracing: Use preempt_model_str()Sebastian Andrzej Siewior
Use preempt_model_str() instead of manually conducting the preemption model. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: "Steven Rostedt (Google)" <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20250314160810.2373416-10-bigeasy@linutronix.de
2025-03-04ftrace: Add arguments to function tracerSven Schnelle
Wire up the code to print function arguments in the function tracer. This functionality can be enabled/disabled during runtime with options/func-args. ping-689 [004] b.... 77.170220: dummy_xmit(skb = 0x82904800, dev = 0x882d0000) <-dev_hard_start_xmit Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Guo Ren <guoren@kernel.org> Cc: Donglin Peng <dolinux.peng@gmail.com> Cc: Zheng Yejian <zhengyejian@huaweicloud.com> Link: https://lore.kernel.org/20250227185823.154996172@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Co-developed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-25tracing: Add traceoff_after_boot optionSteven Rostedt
Sometimes tracing is used to debug issues during the boot process. Since the trace buffer has a limited amount of storage, it may be prudent to disable tracing after the boot is finished, otherwise the critical information may be overwritten. With this option, the main tracing buffer will be turned off at the end of the boot process. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Borislav Petkov <bp@alien8.de> Link: https://lore.kernel.org/20250208103017.48a7ec83@batman.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-15tracing: Do not allow mmap() of persistent ring bufferSteven Rostedt
When trying to mmap a trace instance buffer that is attached to reserve_mem, it would crash: BUG: unable to handle page fault for address: ffffe97bd00025c8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 2862f3067 P4D 2862f3067 PUD 0 Oops: Oops: 0000 [#1] PREEMPT_RT SMP PTI CPU: 4 UID: 0 PID: 981 Comm: mmap-rb Not tainted 6.14.0-rc2-test-00003-g7f1a5e3fbf9e-dirty #233 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:validate_page_before_insert+0x5/0xb0 Code: e2 01 89 d0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <48> 8b 46 08 a8 01 75 67 66 90 48 89 f0 8b 50 34 85 d2 74 76 48 89 RSP: 0018:ffffb148c2f3f968 EFLAGS: 00010246 RAX: ffff9fa5d3322000 RBX: ffff9fa5ccff9c08 RCX: 00000000b879ed29 RDX: ffffe97bd00025c0 RSI: ffffe97bd00025c0 RDI: ffff9fa5ccff9c08 RBP: ffffb148c2f3f9f0 R08: 0000000000000004 R09: 0000000000000004 R10: 0000000000000000 R11: 0000000000000200 R12: 0000000000000000 R13: 00007f16a18d5000 R14: ffff9fa5c48db6a8 R15: 0000000000000000 FS: 00007f16a1b54740(0000) GS:ffff9fa73df00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffe97bd00025c8 CR3: 00000001048c6006 CR4: 0000000000172ef0 Call Trace: <TASK> ? __die_body.cold+0x19/0x1f ? __die+0x2e/0x40 ? page_fault_oops+0x157/0x2b0 ? search_module_extables+0x53/0x80 ? validate_page_before_insert+0x5/0xb0 ? kernelmode_fixup_or_oops.isra.0+0x5f/0x70 ? __bad_area_nosemaphore+0x16e/0x1b0 ? bad_area_nosemaphore+0x16/0x20 ? do_kern_addr_fault+0x77/0x90 ? exc_page_fault+0x22b/0x230 ? asm_exc_page_fault+0x2b/0x30 ? validate_page_before_insert+0x5/0xb0 ? vm_insert_pages+0x151/0x400 __rb_map_vma+0x21f/0x3f0 ring_buffer_map+0x21b/0x2f0 tracing_buffers_mmap+0x70/0xd0 __mmap_region+0x6f0/0xbd0 mmap_region+0x7f/0x130 do_mmap+0x475/0x610 vm_mmap_pgoff+0xf2/0x1d0 ksys_mmap_pgoff+0x166/0x200 __x64_sys_mmap+0x37/0x50 x64_sys_call+0x1670/0x1d70 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f The reason was that the code that maps the ring buffer pages to user space has: page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]); And uses that in: vm_insert_pages(vma, vma->vm_start, pages, &nr_pages); But virt_to_page() does not work with vmap()'d memory which is what the persistent ring buffer has. It is rather trivial to allow this, but for now just disable mmap() of instances that have their ring buffer from the reserve_mem option. If an mmap() is performed on a persistent buffer it will return -ENODEV just like it would if the .mmap field wasn't defined in the file_operations structure. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Vincent Donnefort <vdonnefort@google.com> Link: https://lore.kernel.org/20250214115547.0d7287d3@gandalf.local.home Fixes: 9b7bdf6f6ece6 ("tracing: Have trace_printk not use binary prints if boot buffer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-14tracing: Have the error of __tracing_resize_ring_buffer() passed to userSteven Rostedt
Currently if __tracing_resize_ring_buffer() returns an error, the tracing_resize_ringbuffer() returns -ENOMEM. But it may not be a memory issue that caused the function to fail. If the ring buffer is memory mapped, then the resizing of the ring buffer will be disabled. But if the user tries to resize the buffer, it will get an -ENOMEM returned, which is confusing because there is plenty of memory. The actual error returned was -EBUSY, which would make much more sense to the user. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Vincent Donnefort <vdonnefort@google.com> Link: https://lore.kernel.org/20250213134132.7e4505d7@gandalf.local.home Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-01-23Merge tag 'trace-v6.14-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Update the Rust tracepoint code to use the C code too There was some duplication of the tracepoint code for Rust that did the same logic as the C code. Add a helper that makes it possible for both algorithms to use the same logic in one place. - Add poll to trace event hist files It is useful to know when an event is triggered, or even with some filtering. Since hist files of events get updated when active and the event is triggered, allow applications to poll the hist file and wake up when an event is triggered. This will let the application know that the event it is waiting for happened. - Add :mod: command to enable events for current or future modules The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Add the command where if ':mod:<module>' is written into set_event, then either all the modules events are enabled if it is loaded, or cache it so that the module's events are enabled when it is loaded. This also works from the kernel command line, where "trace_event=:mod:<module>", when the module is loaded at boot up, its events will be enabled then. * tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits) tracing: Fix output of set_event for some cached module events tracing: Fix allocation of printing set_event file content tracing: Rename update_cache() to update_mod_cache() tracing: Fix #if CONFIG_MODULES to #ifdef CONFIG_MODULES selftests/ftrace: Add test that tests event :mod: commands tracing: Cache ":mod:" events for modules not loaded yet tracing: Add :mod: command to enabled module events selftests/tracing: Add hist poll() support test tracing/hist: Support POLLPRI event for poll on histogram tracing/hist: Add poll(POLLIN) support on hist file tracing: Fix using ret variable in tracing_set_tracer() tracepoint: Reduce duplication of __DO_TRACE_CALL tracing/string: Create and use __free(argv_free) in trace_dynevent.c tracing: Switch trace_stat.c code over to use guard() tracing: Switch trace_stack.c code over to use guard() tracing: Switch trace_osnoise.c code over to use guard() and __free() tracing: Switch trace_events_synth.c code over to use guard() tracing: Switch trace_events_filter.c code over to use guard() tracing: Switch trace_events_trigger.c code over to use guard() tracing: Switch trace_events_hist.c code over to use guard() ...
2025-01-21Merge tag 'ftrace-v6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: - Have fprobes built on top of function graph infrastructure The fprobe logic is an optimized kprobe that uses ftrace to attach to functions when a probe is needed at the start or end of the function. The fprobe and kretprobe logic implements a similar method as the function graph tracer to trace the end of the function. That is to hijack the return address and jump to a trampoline to do the trace when the function exits. To do this, a shadow stack needs to be created to store the original return address. Fprobes and function graph do this slightly differently. Fprobes (and kretprobes) has slots per callsite that are reserved to save the return address. This is fine when just a few points are traced. But users of fprobes, such as BPF programs, are starting to add many more locations, and this method does not scale. The function graph tracer was created to trace all functions in the kernel. In order to do this, when function graph tracing is started, every task gets its own shadow stack to hold the return address that is going to be traced. The function graph tracer has been updated to allow multiple users to use its infrastructure. Now have fprobes be one of those users. This will also allow for the fprobe and kretprobe methods to trace the return address to become obsolete. With new technologies like CFI that need to know about these methods of hijacking the return address, going toward a solution that has only one method of doing this will make the kernel less complex. - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Remove disabling of interrupts in the function graph tracer When function graph tracer was first introduced, it could race with interrupts and NMIs. To prevent that race, it would disable interrupts and not trace NMIs. But the code has changed to allow NMIs and also interrupts. This change was done a long time ago, but the disabling of interrupts was never removed. Remove the disabling of interrupts in the function graph tracer is it is not needed. This greatly improves its performance. - Allow the :mod: command to enable tracing module functions on the kernel command line. The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Because enabling function tracing can be done very early at boot up (before scheduling is enabled), the commands that can be done when function tracing is started is limited. Having the ":mod:" command to trace module functions as they are loaded is very useful. Update the kernel command line function filtering to allow it. * tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits) ftrace: Implement :mod: cache filtering on kernel command line tracing: Adopt __free() and guard() for trace_fprobe.c bpf: Use ftrace_get_symaddr() for kprobe_multi probes ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr Documentation: probes: Update fprobe on function-graph tracer selftests/ftrace: Add a test case for repeating register/unregister fprobe selftests: ftrace: Remove obsolate maxactive syntax check tracing/fprobe: Remove nr_maxactive from fprobe fprobe: Add fprobe_header encoding feature fprobe: Rewrite fprobe on function-graph tracer s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS tracing: Add ftrace_fill_perf_regs() for perf event tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs fprobe: Use ftrace_regs in fprobe exit handler fprobe: Use ftrace_regs in fprobe entry handler fgraph: Pass ftrace_regs to retfunc fgraph: Replace fgraph_ret_regs with ftrace_regs ...
2025-01-16ftrace: Implement :mod: cache filtering on kernel command lineSteven Rostedt
Module functions can be set to set_ftrace_filter before the module is loaded. # echo :mod:snd_hda_intel > set_ftrace_filter This will enable all the functions for the module snd_hda_intel. If that module is not loaded, it is "cached" in the trace array for when the module is loaded, its functions will be traced. But this is not implemented in the kernel command line. That's because the kernel command line filtering is added very early in boot up as it is needed to be done before boot time function tracing can start, which is also available very early in boot up. The code used by the "set_ftrace_filter" file can not be used that early as it depends on some other initialization to occur first. But some of the functions can. Implement the ":mod:" feature of "set_ftrace_filter" in the kernel command line parsing. Now function tracing on just a single module that is loaded at boot up can be done. Adding: ftrace=function ftrace_filter=:mod:sna_hda_intel To the kernel command line will only enable the sna_hda_intel module functions when the module is loaded, and it will start tracing. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250116175832.34e39779@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-01-16tracing: Cache ":mod:" events for modules not loaded yetSteven Rostedt
When the :mod: command is written into /sys/kernel/tracing/set_event (or that file within an instance), if the module specified after the ":mod:" is not yet loaded, it will store that string internally. When the module is loaded, it will enable the events as if the module was loaded when the string was written into the set_event file. This can also be useful to enable events that are in the init section of the module, as the events are enabled before the init section is executed. This also works on the kernel command line: trace_event=:mod:<module> Will enable the events for <module> when it is loaded. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250116143533.514730995@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-01-16tracing: Add :mod: command to enabled module eventsSteven Rostedt
Add a :mod: command to enable only events from a given module from the set_events file. echo '*:mod:<module>' > set_events Or echo ':mod:<module>' > set_events Will enable all events for that module. Specific events can also be enabled via: echo '<event>:mod:<module>' > set_events Or echo '<system>:<event>:mod:<module>' > set_events Or echo '*:<event>:mod:<module>' > set_events The ":mod:" keyword is consistent with the function tracing filter to enable functions from a given module. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250116143533.214496360@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-01-14tracing: Print lazy preemption modelShrikanth Hegde
Print lazy preemption model in ftrace header when latency-format=1. # cat /sys/kernel/debug/sched/preempt none voluntary full (lazy) Without patch: latency: 0 us, #232946/232946, CPU#40 | (M:unknown VP:0, KP:0, SP:0 HP:0 #P:80) ^^^^^^^ With Patch: latency: 0 us, #1897938/25566788, CPU#16 | (M:lazy VP:0, KP:0, SP:0 HP:0 #P:80) ^^^^ Now that lazy preemption is part of the kernel, make sure the tracing infrastructure reflects that. Link: https://lore.kernel.org/20250103093647.575919-1-sshegde@linux.ibm.com Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-01-07tracing: Fix using ret variable in tracing_set_tracer()Steven Rostedt
When the function tracing_set_tracer() switched over to using the guard() infrastructure, it did not need to save the 'ret' variable and would just return the value when an error arised, instead of setting ret and jumping to an out label. When CONFIG_TRACER_SNAPSHOT is enabled, it had code that expected the "ret" variable to be initialized to zero and had set 'ret' while holding an arch_spin_lock() (not used by guard), and then upon releasing the lock it would check 'ret' and exit if set. But because ret was only set when an error occurred while holding the locks, 'ret' would be used uninitialized if there was no error. The code in the CONFIG_TRACER_SNAPSHOT block should be self contain. Make sure 'ret' is also set when no error occurred. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250106111143.2f90ff65@gandalf.local.home Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202412271654.nJVBuwmF-lkp@intel.com/ Fixes: d33b10c0c73ad ("tracing: Switch trace.c code over to use guard()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-12-26tracing: Return -EINVAL if a boot tracer tries to enable the mmiotracer at bootSteven Rostedt
The mmiotracer is not set to be enabled at boot up from the kernel command line. If the boot command line tries to enable that tracer, it will fail to be enabled. The return code is currently zero when that happens so the caller just thinks it was enabled. Return -EINVAL in this case. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/20241219201344.854254394@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26tracing: Switch trace.c code over to use guard()Steven Rostedt
There are several functions in trace.c that have "goto out;" or equivalent on error in order to release locks or free values that were allocated. This can be error prone or just simply make the code more complex. Switch every location that ends with unlocking a mutex or freeing on error over to using the guard(mutex)() and __free() infrastructure to let the compiler worry about releasing locks. This makes the code easier to read and understand. There's one place that should probably return an error but instead return 0. This does not change the return as the only changes are to do the conversion without changing the logic. Fixing that location will have to come later. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/20241224221413.7b8c68c3@batman.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-23tracing: Prevent bad count for tracing_cpumask_writeLizhi Xu
If a large count is provided, it will trigger a warning in bitmap_parse_user. Also check zero for it. Cc: stable@vger.kernel.org Fixes: 9e01c1b74c953 ("cpumask: convert kernel trace functions") Link: https://lore.kernel.org/20241216073238.2573704-1-lizhi.xu@windriver.com Reported-by: syzbot+0aecfd34fb878546f3fd@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0aecfd34fb878546f3fd Tested-by: syzbot+0aecfd34fb878546f3fd@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-20Merge tag 'trace-ringbuffer-v6.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer fixes from Steven Rostedt: - Fix possible overflow of mmapped ring buffer with bad offset If the mmap() to the ring buffer passes in a start address that is passed the end of the mmapped file, it is not caught and a slab-out-of-bounds is triggered. Add a check to make sure the start address is within the bounds - Do not use TP_printk() to boot mapped ring buffers As a boot mapped ring buffer's data may have pointers that map to the previous boot's memory map, it is unsafe to allow the TP_printk() to be used to read the boot mapped buffer's events. If a TP_printk() points to a static string from within the kernel it will not match the current kernel mapping if KASLR is active, and it can fault. Have it simply print out the raw fields. * tag 'trace-ringbuffer-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers ring-buffer: Fix overflow in __rb_map_vma
2024-12-18trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffersSteven Rostedt
The TP_printk() of a TRACE_EVENT() is a generic printf format that any developer can create for their event. It may include pointers to strings and such. A boot mapped buffer may contain data from a previous kernel where the strings addresses are different. One solution is to copy the event content and update the pointers by the recorded delta, but a simpler solution (for now) is to just use the print_fields() function to print these events. The print_fields() function just iterates the fields and prints them according to what type they are, and ignores the TP_printk() format from the event itself. To understand the difference, when printing via TP_printk() the output looks like this: 4582.696626: kmem_cache_alloc: call_site=getname_flags+0x47/0x1f0 ptr=00000000e70e10e0 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL node=-1 accounted=false 4582.696629: kmem_cache_alloc: call_site=alloc_empty_file+0x6b/0x110 ptr=0000000095808002 bytes_req=360 bytes_alloc=384 gfp_flags=GFP_KERNEL node=-1 accounted=false 4582.696630: kmem_cache_alloc: call_site=security_file_alloc+0x24/0x100 ptr=00000000576339c3 bytes_req=16 bytes_alloc=16 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false 4582.696653: kmem_cache_free: call_site=do_sys_openat2+0xa7/0xd0 ptr=00000000e70e10e0 name=names_cache But when printing via print_fields() (echo 1 > /sys/kernel/tracing/options/fields) the same event output looks like this: 4582.696626: kmem_cache_alloc: call_site=0xffffffff92d10d97 (-1831793257) ptr=0xffff9e0e8571e000 (-107689771147264) bytes_req=0x1000 (4096) bytes_alloc=0x1000 (4096) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0) 4582.696629: kmem_cache_alloc: call_site=0xffffffff92d0250b (-1831852789) ptr=0xffff9e0e8577f800 (-107689770747904) bytes_req=0x168 (360) bytes_alloc=0x180 (384) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0) 4582.696630: kmem_cache_alloc: call_site=0xffffffff92efca74 (-1829778828) ptr=0xffff9e0e8d35d3b0 (-107689640864848) bytes_req=0x10 (16) bytes_alloc=0x10 (16) gfp_flags=0xdc0 (3520) node=0xffffffff (-1) accounted=(0) 4582.696653: kmem_cache_free: call_site=0xffffffff92cfbea7 (-1831879001) ptr=0xffff9e0e8571e000 (-107689771147264) name=names_cache Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241218141507.28389a1d@gandalf.local.home Fixes: 07714b4bb3f98 ("tracing: Handle old buffer mappings for event strings and functions") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-17tracing: Check "%s" dereference via the field and not the TP_printk formatSteven Rostedt
The TP_printk() portion of a trace event is executed at the time a event is read from the trace. This can happen seconds, minutes, hours, days, months, years possibly later since the event was recorded. If the print format contains a dereference to a string via "%s", and that string was allocated, there's a chance that string could be freed before it is read by the trace file. To protect against such bugs, there are two functions that verify the event. The first one is test_event_printk(), which is called when the event is created. It reads the TP_printk() format as well as its arguments to make sure nothing may be dereferencing a pointer that was not copied into the ring buffer along with the event. If it is, it will trigger a WARN_ON(). For strings that use "%s", it is not so easy. The string may not reside in the ring buffer but may still be valid. Strings that are static and part of the kernel proper which will not be freed for the life of the running system, are safe to dereference. But to know if it is a pointer to a static string or to something on the heap can not be determined until the event is triggered. This brings us to the second function that tests for the bad dereferencing of strings, trace_check_vprintf(). It would walk through the printf format looking for "%s", and when it finds it, it would validate that the pointer is safe to read. If not, it would produces a WARN_ON() as well and write into the ring buffer "[UNSAFE-MEMORY]". The problem with this is how it used va_list to have vsnprintf() handle all the cases that it didn't need to check. Instead of re-implementing vsnprintf(), it would make a copy of the format up to the %s part, and call vsnprintf() with the current va_list ap variable, where the ap would then be ready to point at the string in question. For architectures that passed va_list by reference this was possible. For architectures that passed it by copy it was not. A test_can_verify() function was used to differentiate between the two, and if it wasn't possible, it would disable it. Even for architectures where this was feasible, it was a stretch to rely on such a method that is undocumented, and could cause issues later on with new optimizations of the compiler. Instead, the first function test_event_printk() was updated to look at "%s" as well. If the "%s" argument is a pointer outside the event in the ring buffer, it would find the field type of the event that is the problem and mark the structure with a new flag called "needs_test". The event itself will be marked by TRACE_EVENT_FL_TEST_STR to let it be known that this event has a field that needs to be verified before the event can be printed using the printf format. When the event fields are created from the field type structure, the fields would copy the field type's "needs_test" value. Finally, before being printed, a new function ignore_event() is called which will check if the event has the TEST_STR flag set (if not, it returns false). If the flag is set, it then iterates through the events fields looking for the ones that have the "needs_test" flag set. Then it uses the offset field from the field structure to find the pointer in the ring buffer event. It runs the tests to make sure that pointer is safe to print and if not, it triggers the WARN_ON() and also adds to the trace output that the event in question has an unsafe memory access. The ignore_event() makes the trace_check_vprintf() obsolete so it is removed. Link: https://lore.kernel.org/all/CAHk-=wh3uOnqnZPpR0PeLZZtyWbZLboZ7cHLCKRWsocvs9Y7hQ@mail.gmail.com/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241217024720.848621576@goodmis.org Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-22tracing: Record task flag NEED_RESCHED_LAZY.Thomas Gleixner
The scheduler added NEED_RESCHED_LAZY scheduling. Record this state as part of trace flags and expose it in the need_resched field. Record and expose NEED_RESCHED_LAZY. [bigeasy: Commit description, documentation bits.] Cc: Peter Zijlstra <peterz@infradead.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241122202849.7DfYpJR0@linutronix.de Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-22Merge tag 'trace-v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Addition of faultable tracepoints There's a tracepoint attached to both a system call entry and exit. This location is known to allow page faults. The tracepoints are called under an rcu_read_lock() which does not allow faults that can sleep. This limits the ability of tracepoint handlers to page fault in user space system call parameters. Now these tracepoints have been made "faultable", allowing the callbacks to fault in user space parameters and record them. Note, only the infrastructure has been implemented. The consumers (perf, ftrace, BPF) now need to have their code modified to allow faults. - Fix up of BPF code for the tracepoint faultable logic - Update tracepoints to use the new static branch API - Remove trace_*_rcuidle() variants and the SRCU protection they used - Remove unused TRACE_EVENT_FL_FILTERED logic - Replace strncpy() with strscpy() and memcpy() - Use replace per_cpu_ptr(smp_processor_id()) with this_cpu_ptr() - Fix perf events to not duplicate samples when tracing is enabled - Replace atomic64_add_return(1, counter) with atomic64_inc_return(counter) - Make stack trace buffer 4K instead of PAGE_SIZE - Remove TRACE_FLAG_IRQS_NOSUPPORT flag as it was never used - Get the true return address for function tracer when function graph tracer is also running. When function_graph trace is running along with function tracer, the parent function of the function tracer sometimes is "return_to_handler", which is the function graph trampoline to record the exit of the function. Use existing logic that calls into the fgraph infrastructure to find the real return address. - Remove (un)regfunc pointers out of tracepoint structure - Added last minute bug fix for setting pending modules in stack function filter. echo "write*:mod:ext3" > /sys/kernel/tracing/stack_trace_filter Would cause a kernel NULL dereference. - Minor clean ups * tag 'trace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (31 commits) ftrace: Fix regression with module command in stack_trace_filter tracing: Fix function name for trampoline ftrace: Get the true parent ip for function tracer tracing: Remove redundant check on field->field in histograms bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links bpf: decouple BPF link/attach hook and BPF program sleepable semantics bpf: put bpf_link's program when link is safe to be deallocated tracing: Replace strncpy() with strscpy() when copying comm tracing: Add might_fault() check in __DECLARE_TRACE_SYSCALL tracing: Fix syscall tracepoint use-after-free tracing: Introduce tracepoint_is_faultable() tracing: Introduce tracepoint extended structure tracing: Remove TRACE_FLAG_IRQS_NOSUPPORT tracing: Replace multiple deprecated strncpy with memcpy tracing: Make percpu stack trace buffer invariant to PAGE_SIZE tracing: Use atomic64_inc_return() in trace_clock_counter() trace/trace_event_perf: remove duplicate samples on the first tracepoint event tracing/bpf: Add might_fault check to syscall probes tracing/perf: Add might_fault check to syscall probes tracing/ftrace: Add might_fault check to syscall probes ...
2024-11-20Merge tag 'ftrace-v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: - Restructure the function graph shadow stack to prepare it for use with kretprobes With the goal of merging the shadow stack logic of function graph and kretprobes, some more restructuring of the function shadow stack is required. Move out function graph specific fields from the fgraph infrastructure and store it on the new stack variables that can pass data from the entry callback to the exit callback. Hopefully, with this change, the merge of kretprobes to use fgraph shadow stacks will be ready by the next merge window. - Make shadow stack 4k instead of using PAGE_SIZE. Some architectures have very large PAGE_SIZE values which make its use for shadow stacks waste a lot of memory. - Give shadow stacks its own kmem cache. When function graph is started, every task on the system gets a shadow stack. In the future, shadow stacks may not be 4K in size. Have it have its own kmem cache so that whatever size it becomes will still be efficient in allocations. - Initialize profiler graph ops as it will be needed for new updates to fgraph - Convert to use guard(mutex) for several ftrace and fgraph functions - Add more comments and documentation - Show function return address in function graph tracer Add an option to show the caller of a function at each entry of the function graph tracer, similar to what the function tracer does. - Abstract out ftrace_regs from being used directly like pt_regs ftrace_regs was created to store a partial pt_regs. It holds only the registers and stack information to get to the function arguments and return values. On several archs, it is simply a wrapper around pt_regs. But some users would access ftrace_regs directly to get the pt_regs which will not work on all archs. Make ftrace_regs an abstract structure that requires all access to its fields be through accessor functions. - Show how long it takes to do function code modifications When code modification for function hooks happen, it always had the time recorded in how long it took to do the conversion. But this value was never exported. Recently the code was touched due to new ROX modification handling that caused a large slow down in doing the modifications and had a significant impact on boot times. Expose the timings in the dyn_ftrace_total_info file. This file was created a while ago to show information about memory usage and such to implement dynamic function tracing. It's also an appropriate file to store the timings of this modification as well. This will make it easier to see the impact of changes to code modification on boot up timings. - Other clean ups and small fixes * tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits) ftrace: Show timings of how long nop patching took ftrace: Use guard to take ftrace_lock in ftrace_graph_set_hash() ftrace: Use guard to take the ftrace_lock in release_probe() ftrace: Use guard to lock ftrace_lock in cache_mod() ftrace: Use guard for match_records() fgraph: Use guard(mutex)(&ftrace_lock) for unregister_ftrace_graph() fgraph: Give ret_stack its own kmem cache fgraph: Separate size of ret_stack from PAGE_SIZE ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value selftests/ftrace: Fix check of return value in fgraph-retval.tc test ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs ftrace: Make ftrace_regs abstract from direct use fgragh: No need to invoke the function call_filter_check_discard() fgraph: Simplify return address printing in function graph tracer function_graph: Remove unnecessary initialization in ftrace_graph_ret_addr() function_graph: Support recording and printing the function return address ftrace: Have calltime be saved in the fgraph storage ftrace: Use a running sleeptime instead of saving on shadow stack fgraph: Use fgraph data to store subtime for profiler ...
2024-11-18tracing: Fix function name for trampolineTatsuya S
The issue that unrelated function name is shown on stack trace like following even though it should be trampoline code address is caused by the creation of trampoline code in the area where .init.text section of module was freed after module is loaded. bash-1344 [002] ..... 43.644608: <stack trace> => (MODULE INIT FUNCTION) => vfs_write => ksys_write => do_syscall_64 => entry_SYSCALL_64_after_hwframe To resolve this, when function address of stack trace entry is in trampoline, output without looking up symbol name. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241021071454.34610-2-tatsuya.s2862@gmail.com Signed-off-by: Tatsuya S <tatsuya.s2862@gmail.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-16Merge tag 'trace-ringbuffer-v6.12-rc7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring buffer fixes from Steven Rostedt: - Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug" A crash that happened on cpu hotplug was actually caused by the incorrect ref counting that was fixed by commit 2cf9733891a4 ("ring-buffer: Fix refcount setting of boot mapped buffers"). The removal of calling cpu hotplug callbacks on memory mapped buffers was not an issue even though the tests at the time pointed toward it. But in fact, there's a check in that code that tests to see if the buffers are already allocated or not, and will not allocate them again if they are. Not calling the cpu hotplug callbacks ended up not initializing the non boot CPU buffers. Simply remove that change. - Clear all CPU buffers when starting tracing in a boot mapped buffer To properly process events from a previous boot, the address space needs to be accounted for due to KASLR and the events in the buffer are updated accordingly when read. This also requires that when the buffer has tracing enabled again in the current boot that the buffers are reset so that events from the previous boot do not interact with the events of the current boot and cause confusing due to not having the proper meta data. It was found that if a CPU is taken offline, that its per CPU buffer is not reset when tracing starts. This allows for events to be from both the previous boot and the current boot to be in the buffer at the same time. Clear all CPU buffers when tracing is started in a boot mapped buffer. * tag 'trace-ringbuffer-v6.12-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/ring-buffer: Clear all memory mapped CPU ring buffers on first recording Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug"
2024-11-14tracing/ring-buffer: Clear all memory mapped CPU ring buffers on first recordingSteven Rostedt
The events of a memory mapped ring buffer from the previous boot should not be mixed in with events from the current boot. There's meta data that is used to handle KASLR so that function names can be shown properly. Also, since the timestamps of the previous boot have no meaning to the timestamps of the current boot, having them intermingled in a buffer can also cause confusion because there could possibly be events in the future. When a trace is activated the meta data is reset so that the pointers of are now processed for the new address space. The trace buffers are reset when tracing starts for the first time. The problem here is that the reset only happens on online CPUs. If a CPU is offline, it does not get reset. To demonstrate the issue, a previous boot had tracing enabled in the boot mapped ring buffer on reboot. On the following boot, tracing has not been started yet so the function trace from the previous boot is still visible. # trace-cmd show -B boot_mapped -c 3 | tail <idle>-0 [003] d.h2. 156.462395: __rcu_read_lock <-cpu_emergency_disable_virtualization <idle>-0 [003] d.h2. 156.462396: vmx_emergency_disable_virtualization_cpu <-cpu_emergency_disable_virtualization <idle>-0 [003] d.h2. 156.462396: __rcu_read_unlock <-__sysvec_reboot <idle>-0 [003] d.h2. 156.462397: stop_this_cpu <-__sysvec_reboot <idle>-0 [003] d.h2. 156.462397: set_cpu_online <-stop_this_cpu <idle>-0 [003] d.h2. 156.462397: disable_local_APIC <-stop_this_cpu <idle>-0 [003] d.h2. 156.462398: clear_local_APIC <-disable_local_APIC <idle>-0 [003] d.h2. 156.462574: mcheck_cpu_clear <-stop_this_cpu <idle>-0 [003] d.h2. 156.462575: mce_intel_feature_clear <-stop_this_cpu <idle>-0 [003] d.h2. 156.462575: lmce_supported <-mce_intel_feature_clear Now, if CPU 3 is taken offline, and tracing is started on the memory mapped ring buffer, the events from the previous boot in the CPU 3 ring buffer is not reset. Now those events are using the meta data from the current boot and produces just hex values. # echo 0 > /sys/devices/system/cpu/cpu3/online # trace-cmd start -B boot_mapped -p function # trace-cmd show -B boot_mapped -c 3 | tail <idle>-0 [003] d.h2. 156.462395: 0xffffffff9a1e3194 <-0xffffffff9a0f655e <idle>-0 [003] d.h2. 156.462396: 0xffffffff9a0a1d24 <-0xffffffff9a0f656f <idle>-0 [003] d.h2. 156.462396: 0xffffffff9a1e6bc4 <-0xffffffff9a0f7323 <idle>-0 [003] d.h2. 156.462397: 0xffffffff9a0d12b4 <-0xffffffff9a0f732a <idle>-0 [003] d.h2. 156.462397: 0xffffffff9a1458d4 <-0xffffffff9a0d12e2 <idle>-0 [003] d.h2. 156.462397: 0xffffffff9a0faed4 <-0xffffffff9a0d12e7 <idle>-0 [003] d.h2. 156.462398: 0xffffffff9a0faaf4 <-0xffffffff9a0faef2 <idle>-0 [003] d.h2. 156.462574: 0xffffffff9a0e3444 <-0xffffffff9a0d12ef <idle>-0 [003] d.h2. 156.462575: 0xffffffff9a0e4964 <-0xffffffff9a0d12ef <idle>-0 [003] d.h2. 156.462575: 0xffffffff9a0e3fb0 <-0xffffffff9a0e496f Reset all CPUs when starting a boot mapped ring buffer for the first time, and not just the online CPUs. Fixes: 7a1d1e4b9639f ("tracing/ring-buffer: Add last_boot_info file to boot instance") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-01tracing: Replace strncpy() with strscpy() when copying commJinjie Ruan
Replace the depreciated[1] strncpy() calls with strscpy() when copying comm. Link: https://github.com/KSPP/linux/issues/90 [1] Cc: <mhiramat@kernel.org> Cc: <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241031120139.1343025-1-ruanjinjie@huawei.com Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-01tracing: Document tracefs gid mount optionKalesh Singh
Commit ee7f3666995d ("tracefs: Have new files inherit the ownership of their parent") and commit 48b27b6b5191 ("tracefs: Set all files to the same group ownership as the mount option") introduced a new gid mount option that allows specifying a group to apply to all entries in tracefs. Document this in the tracing readme. Cc: Eric Sandeen <sandeen@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Ali Zahraee <ahzahraee@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/20241030171928.4168869-3-kaleshsingh@google.com Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>