summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2025-11-06perf record: Make sure to update build-ID cacheNamhyung Kim
Recent change on enabling --buildid-mmap by default brought an issue with build-id handling. With build-ID in MMAP2 records, we don't need to save the build-ID table in the header of a perf data file. But the actual file contents still need to be cached in the debug directory for annotation etc. Split the build-ID header processing and caching and make sure perf record to save hit DSOs in the build-ID cache by moving perf_session__cache_build_ids() to the end of the record__ finish_output(). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-04net: Convert struct sockaddr to fixed-size "sa_data[14]"Kees Cook
Revert struct sockaddr from flexible array to fixed 14-byte "sa_data", to solve over 36,000 -Wflex-array-member-not-at-end warnings, since struct sockaddr is embedded within many network structs. With socket/proto sockaddr-based internal APIs switched to use struct sockaddr_unsized, there should be no more uses of struct sockaddr that depend on reading beyond the end of struct sockaddr::sa_data that might trigger bounds checking. Comparing an x86_64 "allyesconfig" vmlinux build before and after this patch showed no new "ud1" instructions from CONFIG_UBSAN_BOUNDS nor any new "field-spanning" memcpy CONFIG_FORTIFY_SOURCE instrumentations. Cc: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-8-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03perf jevents: Make all tables staticIan Rogers
The tables created by jevents.py are only used within the pmu-events.c file. Change the declarations of those global variables to be static to encapsulate this. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf metricgroup: When copy metrics copy default informationIan Rogers
When copy metrics into a group also copy default information from the original metrics. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf metricgroup: Missed free on error pathIan Rogers
If an out-of-memory occurs the expr also needs freeing. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf metricgroup: Update comment on location of metric_event listIan Rogers
Update comment as the stat_config no longer holds all metrics. Signed-off-by: Ian Rogers <irogers@google.com> Fixes: faebee18d720 ("perf stat: Move metric list from config to evlist") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf evsel: Remove unused metric_events variableIan Rogers
The metric_events exist in the metric_expr list and so this variable has been unused for a while. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-03perf symbols: Handle '1' symbols in /proc/kallsymsArnaldo Carvalho de Melo
I started seeing this in recent Fedora 42 kernels: root@x1:~# uname -a Linux x1 6.17.4-200.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Oct 19 18:47:49 UTC 2025 x86_64 GNU/Linux root@x1:~# root@x1:~# perf test 1 1: vmlinux symtab matches kallsyms : FAILED! root@x1:~# Related to: root@x1:~# grep ' 1 ' /proc/kallsyms ffffffffb098bc00 1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ ffffffffb098bc10 1 _RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ root@x1:~# That is found in: root@x1:~# pahole --running_kernel_vmlinux /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux root@x1:~# root@x1:~# readelf -sW /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux | grep __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ 150649: ffffffff81f8bc00 16 FUNC LOCAL DEFAULT 1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ root@x1:~# But was being filtered out when reading /proc/kallsyms, as the '1' symbol type was not being handled, do it, there are just two of them at this point. Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <lossin@kernel.org> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Trevor Gross <tmgross@umich.edu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers x86: Sync table due to introducion of uprobe syscallArnaldo Carvalho de Melo
To pick the changes in this cset: 56101b69c9190667 ("uprobes/x86: Add uprobe syscall to speed up uprobe") That add support for this new 'uprobe' syscall in tools such as 'perf trace'. Now it is possible to do a system wide 'perf trace' to look if this new syscall is being used: root@number:~# perf trace -v -e uprobe <SNIP> event qualifier tracepoint filter: (common_pid != 33989) && (id == 336) ^C root@number# $ grep -w uprobe tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 336 common uprobe sys_uprobe $ This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Please see tools/include/uapi/README for further details. Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers: Sync uapi/linux/fcntl.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in this cset: e83f0b5d10dcf628 ("nsfs: support exhaustive file handles") That doesn't introduce anything of interest for tools/, just addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Please see tools/include/uapi/README for further details. Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers: Sync uapi/linux/prctl.h with the kernel sourceArnaldo Carvalho de Melo
To pick up the changes in these csets: 8cdc4d27019356b0 ("mm/huge_memory: respect MADV_COLLAPSE with PR_THP_DISABLE_EXCEPT_ADVISED") 9dc21bbd62edeae6 ("prctl: extend PR_SET_THP_DISABLE to optionally exclude VM_HUGEPAGE") That don't introduce anything of interest for the tools/, just addressing these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Please see tools/include/uapi/README for further details. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01tools headers uapi: Update fs.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up changes from: db2ab24a341ce893 ("Add RWF_NOSIGNAL flag for pwritev2") These are used to beautify fs syscall arguments, albeit the changes in this update are not affecting those beautifiers. This addresses these tools/ build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lauri Vasama <git@vasama.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-31perf tools: Cache counter names for raw samples on s390Ian Rogers
Searching all event names is slower now that legacy names are included. Add a cache to avoid long iterative searches. Note, the cache isn't cleaned up and is as such a memory leak, however, globally reachable leaks like this aren't treated as leaks by leak sanitizer. Reported-by: Thomas Richter <tmricht@linux.ibm.com> Closes: https://lore.kernel.org/linux-perf-users/09943f4f-516c-4b93-877c-e4a64ed61d38@linux.ibm.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf trace: Increase syscall handler map size to 1024Namhyung Kim
The syscalls_sys_{enter,exit} map in augmented_raw_syscalls.bpf.c has max entries of 512. Usually syscall numbers are smaller than this but x86 has x32 ABI where syscalls start from 512. That makes trace__init_syscalls_bpf_prog_array_maps() fail in the middle of the loop when it accesses those keys. As the loop iteration is not ordered by syscall numbers anymore, the failure can affect non-x32 syscalls. Let's increase the map size to 1024 so that it can handle those ABIs too. While most systems won't need this, increasing the size will be safer for potential future changes. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf vendor events AmpereOneX: Fix spelling typo in the metrics fileChu Guangqing
The json file incorrectly used "acceses" instead of "accesses". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf vendor events arm64: Fix typo in Ampere eMag json fileChu Guangqing
Correct instruction spelling errors. Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf record: skip synthesize event when open evsel failedShuai Xue
When using perf record with the `--overwrite` option, a segmentation fault occurs if an event fails to open. For example: perf record -e cycles-ct -F 1000 -a --overwrite Error: cycles-ct:H: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' perf: Segmentation fault #0 0x6466b6 in dump_stack debug.c:366 #1 0x646729 in sighandler_dump_stack debug.c:378 #2 0x453fd1 in sigsegv_handler builtin-record.c:722 #3 0x7f8454e65090 in __restore_rt libc-2.32.so[54090] #4 0x6c5671 in __perf_event__synthesize_id_index synthetic-events.c:1862 #5 0x6c5ac0 in perf_event__synthesize_id_index synthetic-events.c:1943 #6 0x458090 in record__synthesize builtin-record.c:2075 #7 0x45a85a in __cmd_record builtin-record.c:2888 #8 0x45deb6 in cmd_record builtin-record.c:4374 #9 0x4e5e33 in run_builtin perf.c:349 #10 0x4e60bf in handle_internal_command perf.c:401 #11 0x4e6215 in run_argv perf.c:448 #12 0x4e653a in main perf.c:555 #13 0x7f8454e4fa72 in __libc_start_main libc-2.32.so[3ea72] #14 0x43a3ee in _start ??:0 The --overwrite option implies --tail-synthesize, which collects non-sample events reflecting the system status when recording finishes. However, when evsel opening fails (e.g., unsupported event 'cycles-ct'), session->evlist is not initialized and remains NULL. The code unconditionally calls record__synthesize() in the error path, which iterates through the NULL evlist pointer and causes a segfault. To fix it, move the record__synthesize() call inside the error check block, so it's only called when there was no error during recording, ensuring that evlist is properly initialized. Fixes: 4ea648aec019 ("perf record: Add --tail-synthesize option") Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-31perf lock contention: Load kernel map before lookupNamhyung Kim
On some machines, it caused troubles when it tried to find kernel symbols. I think it's because kernel modules and kallsyms are messed up during load and split. Basically we want to make sure the kernel map is loaded and the code has it in the lock_contention_read(). But recently we added more lookups in the lock_contention_prepare() which is called before _read(). Also the kernel map (kallsyms) may not be the first one in the group like on ARM. Let's use machine__kernel_map() rather than just loading the first map. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 688d2e8de231c54e ("perf lock contention: Add -l/--lock-addr option") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-28perf test workload: Add thread count argument to thloopIan Rogers
Allow the number of threads for the thloop workload to be increased beyond the normal 2. Add error checking to the parsed time and thread count values. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-27perf hwmon_pmu: Fix uninitialized variable warningMichal Suchanek
The line_len is only set on success. Check the return value instead. util/hwmon_pmu.c: In function ‘perf_pmus__read_hwmon_pmus’: util/hwmon_pmu.c:742:20: warning: ‘line_len’ may be used uninitialized [-Wmaybe-uninitialized] 742 | if (line_len > 0 && line[line_len - 1] == '\n') | ^ util/hwmon_pmu.c:719:24: note: ‘line_len’ was declared here 719 | size_t line_len; Fixes: 53cc0b351ec9 ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-25perf auxtrace: Add auxtrace_synth_id_range_start() helpertanze
To avoid hardcoding the offset value for synthetic event IDs in multiple auxtrace modules (arm-spe, cs-etm, intel-pt, etc.), and to improve code reusability, this patch unifies the handling of the ID offset via a dedicated helper function. Signed-off-by: tanze <tanze@kylinos.cn> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-25perf stat: Add/fix bperf cgroup max events workaroundsIan Rogers
Commit b8308511f6e0 bumped the max events to 1024 but this results in BPF verifier issues if the number of command line events is too large. Workaround this by: 1) moving the constants to a header file to share between BPF and perf C code, 2) testing that the maximum number of events doesn't cause BPF verifier issues in debug builds, 3) lower the max events from 1024 to 128, 4) in perf stat, if there are more events than the BPF counters can support then disable BPF counter usage. The rodata setup is factored into its own function to avoid duplicating it in the testing code. Signed-off-by: Ian Rogers <irogers@google.com> Fixes: b8308511f6e0 ("perf stat bperf cgroup: Increase MAX_EVENTS from 32 to 1024") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-23perf cs-etm: Mute enumeration value warningLeo Yan
When the OpenCSD library introduces a new enumeration value (for example, in the v1.7.1 release), the perf build fails with an error: util/cs-etm-decoder/cs-etm-decoder.c:600:10: error: enumeration value 'OCSD_GEN_TRC_ELEM_ITMTRACE' not explicitly handled in switch [-Werror, -Wswitch-enum] 600 | switch (elem->elem_type) { | ^~~~~~~~~~~~~~~ 1 error generated. Convert to if-else sentences to mute the enumeration value warning, which can avoid build failures whenever the lib is updated. No functional change. Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-23tools: arm64: Add Cortex-A720AE definitionsKuninori Morimoto
Add cputype definitions for Cortex-A720AE. These will be used for errata detection in subsequent patches. These values can be found in the Cortex-A720AE TRM: https://developer.arm.com/documentation/102828/0001/ ... in Table A-187 Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-23perf annotate: Fix Clang build by adding block in switch caseJames Clark
Clang and GCC disagree with what constitutes a "declaration after statement". GCC allows declarations in switch cases without an extra block, as long as it's immediately after the label. Clang does not. Unfortunately this is the case even in the latest versions of both compilers. The only option that makes them behave in the same way is -Wpedantic, which can't be enabled in Perf because of the number of warnings it generates. Add a block to fix the Clang build, which is the only thing we can do. Fixes the build error: ui/browsers/annotate.c:999:4: error: expected expression struct annotation_line *al = NULL; ui/browsers/annotate.c:1008:4: error: use of undeclared identifier 'al' al = annotated_source__get_line(notes->src, offset); ui/browsers/annotate.c:1009:24: error: use of undeclared identifier 'al' browser->curr_hot = al ? &al->rb_node : NULL; ui/browsers/annotate.c:1009:30: error: use of undeclared identifier 'al' browser->curr_hot = al ? &al->rb_node : NULL; ui/browsers/annotate.c:1000:8: error: mixing declarations and code is incompatible with standards before C99 [-Werror,-Wdeclaration-after-statement] s64 offset = annotate_browser__curr_hot_offset(browser); Fixes: ad83f3b7155d ("perf c2c annotate: Start from the contention line") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-23perf build: Fix perf build issues with fixdepJosh Poimboeuf
Commit a808a2b35f66 ("tools build: Fix fixdep dependencies") broke the perf build ("make -C tools/perf") by introducing two inadvertent conflicts: 1) tools/build/Makefile includes tools/build/Makefile.include, which defines a phony 'fixdep' target. This conflicts with the $(FIXDEP) file target in tools/build/Makefile when OUTPUT is empty, causing make to report duplicate recipes for the same target. 2) The FIXDEP variable in tools/build/Makefile conflicts with the previously existing one in tools/perf/Makefile.perf. Remove the unnecessary include of tools/build/Makefile.include from tools/build/Makefile, and rename the FIXDEP variable in tools/perf/Makefile.perf to FIXDEP_BUILT. Fixes: a808a2b35f66 ("tools build: Fix fixdep dependencies") Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Link: https://patch.msgid.link/8881bc3321bd9fa58802e4f36286eefe3667806b.1760992391.git.jpoimboe@kernel.org
2025-10-21perf annotate: Invalidate register states for untracked instructionsZecheng Li
When tracking variable types, instructions that modify a pointer value in an untracked way can lead to incorrect type propagation. To prevent this, invalidate the register state when encountering such instructions. This change invalidates pointer types for various arithmetic and bitwise operations that current pointer offset tracking doesn't support, like imul, shl, and, inc, etc. A special case is added for 'xor reg, reg', which is a common idiom for zeroing a register. For this, the register state is updated to be a constant with a value of 0. This could introduce slight regressions if a variable is zeroed and then reused. This can be addressed in the future by using all DWARF locations for instruction tracking instead of only the first one. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: Save pointer offset in stack stateZecheng Li
The tracked pointer offset was not being preserved in the stack state, which could lead to incorrect type analysis. This change adds a ptr_offset field to the type_state_stack struct and passes it to set_stack_state and findnew_stack_state to ensure the offset is preserved after the pointer is loaded from a stack location. It improves the type annotation coverage and quality. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: Track arithmetic instructions on pointersZecheng Li
Track the arithmetic operations on registers with pointer types. We handle only add, sub and lea instructions. The original pointer information needs to be preserved for getting outermost struct types. For example, reg0 points to a struct cfs_rq, when we add 0x10 to reg0, it should preserve the information of struct cfs_rq + 0x10 in the register instead of a pointer type to the child field at 0x10. Details: 1. struct type_state_reg now includes an offset, indicating if the register points to the start or an internal part of its associated type. This offset is used in mem to reg and reg to stack mem transfers, and also applied to the final type offset. 2. lea offset(%sp/%fp), reg is now treated as taking the address of a stack variable. It worked fine in most cases, but an issue with this approach is the pointer type may not exist. 3. lea offset(%base), reg is handled by moving the type from %base and adding an offset, similar to an add operation followed by a mov reg to reg. 4. Non-stack variables from DWARF with non-zero offsets in their location expressions are now accepted with register offset tracking. Multi-register addressing modes in LEA are not supported. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: Track address registers via TSR_KIND_POINTERZecheng Li
Introduce TSR_KIND_POINTER to improve the data type profiler's ability to track pointer-based memory accesses and address register variables. TSR_KIND_POINTER represents that the location holds a pointer type to the type in the type state. The semantics match the `breg` registers that describe a memory location. This change implements handling for this new kind in mov instructions and in the check_matching_type() function. When a TSR_KIND_POINTER is moved to the stack, the stack state size is set to the architecture's pointer size. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: Skip annotating data types to lea instructionsZecheng Li
Introduce a helper function is_address_gen_insn() to check arch-dependent address generation instructions like lea in x86. Remove type annotation on these instructions since they are not accessing memory. It should be counted as `no_mem_ops`. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: Check return value of evsel__get_arch() properlyTianyou Li
Check the error code of evsel__get_arch() in the symbol__annotate(). Previously it checked non-zero value but after the refactoring it does only for negative values. Fixes: 0669729eb0afb0cf ("perf annotate: Factor out evsel__get_arch()") Suggested-by: James Clark <james.clark@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tianyou Li <tianyou.li@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: fix a crash when annotate the same symbol with 's' and 'T'Tianyou Li
When perf report with annotation for a symbol, press 's' and 'T', then exit the annotate browser. Once annotate the same symbol, the annotate browser will crash. The browser.arch was required to be correctly updated when data type feature was enabled by 'T'. Usually it was initialized by symbol__annotate2 function. If a symbol has already been correctly annotated at the first time, it should not call the symbol__annotate2 function again, thus the browser.arch will not get initialized. Then at the second time to show the annotate browser, the data type needs to be displayed but the browser.arch is empty. Stack trace as below: Perf: Segmentation fault -------- backtrace -------- #0 0x55d365 in ui__signal_backtrace setup.c:0 #1 0x7f5ff1a3e930 in __restore_rt libc.so.6[3e930] #2 0x570f08 in arch__is perf[570f08] #3 0x562186 in annotate_get_insn_location perf[562186] #4 0x562626 in __hist_entry__get_data_type annotate.c:0 #5 0x56476d in annotation_line__write perf[56476d] #6 0x54e2db in annotate_browser__write annotate.c:0 #7 0x54d061 in ui_browser__list_head_refresh perf[54d061] #8 0x54dc9e in annotate_browser__refresh annotate.c:0 #9 0x54c03d in __ui_browser__refresh browser.c:0 #10 0x54ccf8 in ui_browser__run perf[54ccf8] #11 0x54eb92 in __hist_entry__tui_annotate perf[54eb92] #12 0x552293 in do_annotate hists.c:0 #13 0x55941c in evsel__hists_browse hists.c:0 #14 0x55b00f in evlist__tui_browse_hists perf[55b00f] #15 0x42ff02 in cmd_report perf[42ff02] #16 0x494008 in run_builtin perf.c:0 #17 0x494305 in handle_internal_command perf.c:0 #18 0x410547 in main perf[410547] #19 0x7f5ff1a295d0 in __libc_start_call_main libc.so.6[295d0] #20 0x7f5ff1a29680 in __libc_start_main@@GLIBC_2.34 libc.so.6[29680] #21 0x410b75 in _start perf[410b75] Fixes: 1d4374afd000 ("perf annotate: Add 'T' hot key to toggle data type display") Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tianyou Li <tianyou.li@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-21perf annotate: Fix build with NO_SLANG=1Namhyung Kim
The recent change for perf c2c annotate broke build without slang support like below. builtin-annotate.c: In function 'hists__find_annotations': builtin-annotate.c:522:73: error: 'NO_ADDR' undeclared (first use in this function); did you mean 'NR_ADDR'? 522 | key = hist_entry__tui_annotate(he, evsel, NULL, NO_ADDR); | ^~~~~~~ | NR_ADDR builtin-annotate.c:522:73: note: each undeclared identifier is reported only once for each function it appears in builtin-annotate.c:522:31: error: too many arguments to function 'hist_entry__tui_annotate' 522 | key = hist_entry__tui_annotate(he, evsel, NULL, NO_ADDR); | ^~~~~~~~~~~~~~~~~~~~~~~~ In file included from util/sort.h:6, from builtin-annotate.c:28: util/hist.h:756:19: note: declared here 756 | static inline int hist_entry__tui_annotate(struct hist_entry *he __maybe_unused, | ^~~~~~~~~~~~~~~~~~~~~~~~ And I noticed that it missed to update the other side of #ifdef HAVE_SLANG_SUPPORT. Let's fix it. Cc: Tianyou Li <tianyou.li@intel.com> Fixes: cd3466cd2639783d ("perf c2c: Add annotation support to perf c2c report") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-20perf jevents: Suppress circular dependency warningsJames Clark
When doing an in source build, $(OUTPUT) is empty so the rule has the same input and output file. Suppress the warning by only adding the rule when doing an out of source build. The same condition already exists for the clean rule for json files. This fixes the following warnings: make[3]: Circular pmu-events/arch/nds32/mapfile.csv <- pmu-events/arch/nds32/mapfile.csv dependency dropped. make[3]: Circular pmu-events/arch/powerpc/mapfile.csv <- pmu-events/arch/powerpc/mapfile.csv dependency dropped. ... Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-20perf jevents: Remove unused makefile variableJames Clark
JDIR is unused since commit 4bb55de4ff03 ("perf jevents: Support copying the source json files to OUTPUT"), remove it. Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-20perf jevents: Fix build when there are other json files in the treeJames Clark
The unquoted glob *.json will expand to a real file if, for example, there is any file in the Perf source ending in .json. This can happen when using tools like Bear and clangd which generate a compile_commands.json file. With the glob already expanded by the shell, the find command will fail to wildcard any real json events files. Fix it by wrapping the star in quotes so it's passed to find rather than the shell. This fixes the following build error (most of the diff output omitted): $ make V=1 -C tools/perf O=/tmp/perf_build_with_json TEST /tmp/perf_build_with_json/pmu-events/empty-pmu-events.log ... /* offset=121053 */ "node-access\000legacy cache\000Local memory read accesses\000legacy-cache-config=6\000\00010\000\000\000\000\000" /* offset=121135 */ "node-misses\000legacy cache\000Local memory read misses\000legacy-cache-config=0x10006\000\00010\000\000\000\000\000" /* offset=121221 */ "node-miss\000legacy cache\000Local memory read misses\000legacy-cache-config=0x10006\000\00010\000\000\000\000\000" ... - { .event_table = { 0, 0 }, .metric_table = { 0, 0 }, }, make[3]: *** [pmu-events/Build:54: /tmp/perf_build_with_json/pmu-events/empty-pmu-events.log] Error 1 Fixes: 4bb55de4ff03 ("perf jevents: Support copying the source json files to OUTPUT") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-20perf parse-events: Make X modifier more respectful of groupsIan Rogers
Events with an X modifier were reordered within a group, for example slots was made the leader in: ``` $ perf record -e '{cpu/mem-stores/ppu,cpu/slots/uX}' -- sleep 1 ``` Fix by making `dont_regroup` evsels always use their index for sorting. Make the cur_leader, when fixing the groups, be that of `dont_regroup` evsel so that the `dont_regroup` evsel doesn't become a leader. On a tigerlake this patch corrects this and meets expectations in: ``` $ perf stat -e '{cpu/mem-stores/,cpu/slots/uX}' -a -- sleep 0.1 Performance counter stats for 'system wide': 83,458,652 cpu/mem-stores/ 2,720,854,880 cpu/slots/uX 0.103780587 seconds time elapsed $ perf stat -e 'slots,slots:X' -a -- sleep 0.1 Performance counter stats for 'system wide': 732,042,247 slots (48.96%) 643,288,155 slots:X (51.04%) 0.102731018 seconds time elapsed ``` Closes: https://lore.kernel.org/lkml/18f20d38-070c-4e17-bc90-cf7102e1e53d@linux.intel.com/ Fixes: 035c17893082 ("perf parse-events: Add 'X' modifier to exclude an event from being regrouped") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf c2c annotate: Start from the contention lineTianyou Li
Add support to highlight the contention line in the annotate browser, use 'TAB'/'UNTAB' to refocus to the contention line. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Jiebin Sun <jiebin.sun@intel.com> Reviewed-by: Pan Deng <pan.deng@intel.com> Reviewed-by: Zhiguo Zhou <zhiguo.zhou@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf c2c: Add annotation support to perf c2c reportTianyou Li
Perf c2c report currently specified the code address and source:line information in the cacheline browser, while it is lack of annotation support like perf report to directly show the disassembly code for the particular symbol shared that same cacheline. This patches add a key 'a' binding to the cacheline browser which reuse the annotation browser to show the disassembly view for easier analysis of cacheline contentions. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Jiebin Sun <jiebin.sun@intel.com> Reviewed-by: Pan Deng <pan.deng@intel.com> Reviewed-by: Zhiguo Zhou <zhiguo.zhou@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf stat bperf cgroup: Increase MAX_EVENTS from 32 to 1024Ian Rogers
The MAX_EVENTS value ensured a counted loop presumably to satisfy the BPF verifier. It is possible to go past 32 events when gathering uncore events. Increase the amount to 1024 as that should provide some amount of headroom. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf ilist: Add PMU information to metricsIan Rogers
Duplicate metrics may exist on hybrid platforms, with the metric's PMU being used to select the metric to use. Incorporate the metric PMU into the ilist display and support opening it just for a given PMU. Before: ``` ⭘ Interactive Perf List ├── ▼ TopdownL1 tma_backend_bound │ ├── tma_backend_bound Counts the total number of issue slots that were │ ├── ▶ tma_backend_bound_group not consumed by the backend due to backend stalls │ ├── tma_backend_bound Counts the total number of issue slots that were │ ├── ▶ tma_backend_bound_group not consumed by the backend due to backend stalls. │ ├── tma_bad_speculation Note that uops must be available for consumption │ ├── ▶ tma_bad_speculation_group in order for this event to count. If a uop is not │ ├── tma_bad_speculation available (IQ is empty), this event will not count │ ├── ▶ tma_bad_speculation_group cpu_atom@TOPDOWN_BE_BOUND.ALL@ / (5 * │ ├── tma_frontend_bound cpu_atom@CPU_CLK_UNHALTED.CORE@) │ ├── ▶ tma_frontend_bound_group tma_backend_bound > 0.1 │ ├── tma_frontend_bound ▆▆ │ ├── ▶ tma_frontend_bound_group │ ├── tma_retiring │ ├── ▶ tma_retiring_group │ ├── tma_retiring │ └── ▶ tma_retiring_group ├── ▶ TopdownL2 total▄▄▅▅▆▅▅▂▁▁▁▁▂▃▂▂▃▄▄▇▇█▆▆▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▄▅▅▅▄▆▆▆▅▅▅▅▅▅▇▇▇▇▆▅▆▆▆▆▅▅▅▄▃▃▃▃▃▃▃▃▃▃▄▄▄▅▅▅▅▅▆▆▆▆▆▆ cpu0▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu1▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▃▄▄▄▄▄▄▄▄▄▄▅▅▅▅▅▅▅▅▅▅▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu2▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇█████▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ cpu3▁▁▁▁▁▁▁▁▁▄▄▄▄▄▄▄▄▄▄█████▆▆▆▆▆▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅ cpu4████▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ cpu5▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▇▇▇▇▇▇▇▇▇▇▆▆ cpu6▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇█████▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ cpu7▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▃▃▃▃▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu8▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu9▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂█████▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅ cpu10▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu11▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇█████▁▁▁▁▁▁▁▁▁ cpu12▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu13▁▁▁▁▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu14▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇█████▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ cpu15▁▁▁▁▁▁▁▁▁▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu16▃▃▃▃▃▃▃▃▃▄▄▃▃▃▃▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▃▃▃▃▃▃▃▃▄▄▄▄▄▄▁▁▃▃▃▃▃▃▃▃▃▃▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ cpu17▁▁▁▁▁▄▄▅▅▅▅▅▅▅▅▄▄▄▄▄▄▄▄▃▃▃▃▂▂▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▃▄▄▄▄▃▃▃▃▂▂▂▂▄▄▄▄▄▄▆▆▆▆▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ s Search n Next p Previous c Collapse ^q Quit ▏^p palette ``` After: ``` ⭘ Interactive Perf List ├── ▼ TopdownL1 tma_backend_bound │ ├── tma_backend_bound (cpu_atom) Counts the total number of issue slots that were │ ├── ▼ tma_backend_bound_group (cpu_atom) not consumed by the backend due to backend stalls │ │ ├── tma_core_bound (cpu_atom) Counts the total number of issue slots that were │ │ ├── ▶ tma_core_bound_group (cpu_atom not consumed by the backend due to backend stalls. │ │ ├── tma_resource_bound (cpu_atom) Note that uops must be available for consumption │ │ └── ▶ tma_resource_bound_group (cpu_ in order for this event to count. If a uop is not │ ├── tma_backend_bound (cpu_core) available (IQ is empty), this event will not count │ ├── ▶ tma_backend_bound_group (cpu_core) cpu_atom@TOPDOWN_BE_BOUND.ALL@ / (5 * │ ├── tma_bad_speculation (cpu_atom) cpu_atom@CPU_CLK_UNHALTED.CORE@) │ ├── ▶ tma_bad_speculation_group (cpu_ato▆▆tma_backend_bound > 0.1 │ ├── tma_bad_speculation (cpu_core) │ ├── ▶ tma_bad_speculation_group (cpu_cor▃▃ │ ├── tma_frontend_bound (cpu_atom) │ ├── ▶ tma_frontend_bound_group (cpu_atom │ ├── tma_frontend_bound (cpu_core) │ ├── ▶ tma_frontend_bound_group (cpu_core ▌ total▁▁▁▁▂▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▃▂▂▂▂▃▃▃▄▄▄▄▄▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▄▄▄▄▄▄▅▅▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▇▆▇▇ cpu16▇▇▇▇▇▇▇▇▇▇▇▆▆▁▁▁▁▁▁▁▁▁▁▁▁▂▂▄▄▅▅▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇▆▆▆▆▆▆▆▆▅▅▅▅▅▅▅▅▅▅▅▅▅▅▆▆▆▆▄▄▄▄▃▃▄▄▄▄▇▇▇▇▇▇▇▇▇▇ cpu17█▇▇▇▇▇▇▇▇▆▆▆▆▆▆▅▅▅▅▃▃▃▃▂▂▁▁▅▅▅▅▅▅▅▅▄▄▄▄▄▄▄▄▄▄▄▄▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▅▅▄▄▂▂▇▇▇▇▆▆▅▅▆▆ cpu18▇▇▇▇▇██▇▇▃▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▃▃▃▃▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▄▄▄▄▅▅▅▅▅▅▅▅ cpu19▇▃▃▄▄▄▄▄▄▄▄▄▄▅▅▅▅▅▅▅▅▁▁▂▂▃▃▃▃▅▅▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇██▇▇▇▇▇▇▆▆▅▅▅▅▆▆▄▄▄▄▅▅ cpu20▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▄▄▄▄▅▅▅▅▅▅▆▆▇▇ cpu21▇▇▇▇▇▇▇██▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▆▆▆▆▆▆▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▅▅▄▄▂▂▂▂▂▂▁▁▁▁ cpu22█▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▆▆▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▂▂▁▁▁▁▂▂▂▂▂▂▂▂ cpu23▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▄▄▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▇▇▇▇▆▆██▇▇▇▇▇▇ cpu24▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▃▃▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▅▅▇▇▆▆▆▆▆▆▇▇▇▇ cpu25▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▆▆▆▆▇▇▇▇▇▇██ cpu26▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▄▄▇▇▇▇▇▇▇▇▇▇██▇▇▇▇▇▇▇▇▇▇▆▆▆▆▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇▆▆▆▆▆▆▆▆▆▆▂▂▁▁▁▁▂▂▂▂▂▂▂▂ cpu27▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▃▄▄▄▄▅▅▅▅▅▅▇▇ total 7.4923074548462605 cpu16 0.2961618003253457 cpu17 0.3065719718925585 cpu18 0.27800656881051855 cpu19 0.28564742078353406 cpu20 0.2764790653117084 s Search n Next p Previous c Collapse ^q Quit ▏^p palette ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf python: Add PMU argument to parse_metricsIan Rogers
Add an optional PMU argument to parse_metrics to allow restriction of the particular metrics to be opened. If no argument is provided then all metrics with the given name/group are opened Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Gautam Menghani <gautam@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf ilist: Don't display deprecated eventsIan Rogers
Unsupported legacy events are flagged as deprecated. Don't display these events in ilist as they won't open and there are over 1,000 legacy cache events. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19perf trace: Don't synthesize mmaps unless callchains are enabledIan Rogers
Synthesizing mmaps in perf trace is unnecessary unless call chains are being generated. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-15perf test parse-events: Add evsel test helperIan Rogers
Add TEST_ASSERT_EVSEL to dump the failing evsel in the event of a failure. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-15perf test parse-events: Add evlist test helperIan Rogers
Add TEST_ASSERT_EVLIST to dump the failing evlist in the event of a failure. Add the macro to a number of tests not currently checking the evlist length. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-15perf test: Clean up test_..config helpersIan Rogers
Just have a single test_hw_config helper that strips extended type information in the case of hardware and hardware cache events. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-15perf test: Switch cycles event to cpu-cyclesIan Rogers
Without a PMU perf matches an event against any PMU with the event. Unfortunately some PMU drivers advertise a "cycles" event which is typically just a core event. As tests assume a core event, switch to use "cpu-cycles" that avoids the overloaded "cycles" event on troublesome PMUs and is so far not overloaded. Note, on x86 this changes a legacy event into a sysfs one. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-15perf test parse-events: Remove cpu PMU requirementIan Rogers
In the event parse string, switch "cpu" to "default_core" and then rewrite this to the first core PMU name prior to parsing. This enables testing with a PMU on hybrid x86 and other systems that don't use "cpu" for the core PMU name. The name "default_core" is already used by jevents. Update test expectations to match. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>