summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2025-01-17perf lock: Rename fields in lock_type_tableChun-Tse Shao
`lock_type_table` contains `name` and `str` which can be confusing. Rename them to `flags_name` and `lock_name` and add descriptions to enhance understanding. Tested by building perf for x86. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: nick.forrington@arm.com Link: https://lore.kernel.org/r/20250116235838.2769691-3-ctshao@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-17perf lock: Add percpu-rwsem for type filterChun-Tse Shao
percpu-rwsem was missing in man page. And for backward compatibility, replace `pcpu-sem` with `percpu-rwsem` before parsing lock name. Tested `./perf lock con -ab -Y pcpu-sem` and `./perf lock con -ab -Y percpu-rwsem` Fixes: 4f701063bfa2 ("perf lock contention: Show lock type with address") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: nick.forrington@arm.com Link: https://lore.kernel.org/r/20250116235838.2769691-2-ctshao@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-17perf lock: Fix parse_lock_type which only retrieve one lock flagChun-Tse Shao
`parse_lock_type` can only add the first lock flag in `lock_type_table` given input `str`. For example, for `Y rwlock`, it only adds `rwlock:R` into this perf session. Another example is for `-Y mutex`, it only adds the mutex without `LCB_F_SPIN` flag. The patch fixes this issue, makes sure both `rwlock:R` and `rwlock:W` will be added with `-Y rwlock`, and so on. Testing: $ ./perf lock con -ab -Y mutex,rwlock -- perf bench sched pipe # Running 'sched/pipe' benchmark: # Executed 1000000 pipe operations between two processes Total time: 9.313 [sec] 9.313976 usecs/op 107365 ops/sec contended total wait max wait avg wait type caller 176 1.65 ms 19.43 us 9.38 us mutex pipe_read+0x57 34 180.14 us 10.93 us 5.30 us mutex pipe_write+0x50 7 77.48 us 16.09 us 11.07 us mutex do_epoll_wait+0x24d 7 74.70 us 13.50 us 10.67 us mutex do_epoll_wait+0x24d 3 35.97 us 14.44 us 11.99 us rwlock:W ep_done_scan+0x2d 3 35.00 us 12.23 us 11.66 us rwlock:W do_epoll_wait+0x255 2 15.88 us 11.96 us 7.94 us rwlock:W do_epoll_wait+0x47c 1 15.23 us 15.23 us 15.23 us rwlock:W do_epoll_wait+0x4d0 1 14.26 us 14.26 us 14.26 us rwlock:W ep_done_scan+0x2d 2 14.00 us 7.99 us 7.00 us mutex pipe_read+0x282 1 12.29 us 12.29 us 12.29 us rwlock:R ep_poll_callback+0x35 1 12.02 us 12.02 us 12.02 us rwlock:W do_epoll_ctl+0xb65 1 10.25 us 10.25 us 10.25 us rwlock:R ep_poll_callback+0x35 1 7.86 us 7.86 us 7.86 us mutex do_epoll_ctl+0x6c1 1 5.04 us 5.04 us 5.04 us mutex do_epoll_ctl+0x3d4 [namhyung: Add a comment and rename to 'mutex:spin' for consistency Fixes: d783ea8f62c4 ("perf lock contention: Simplify parse_lock_type()") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: nick.forrington@arm.com Link: https://lore.kernel.org/r/20250116235838.2769691-1-ctshao@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-17perf lock: Fix return code for functions in __cmd_contentionAthira Rajeev
perf lock contention returns zero exit value even if the lock contention BPF setup failed. # ./perf lock con -b true libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux libbpf: failed to find valid kernel BTF libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux libbpf: failed to find valid kernel BTF libbpf: Error loading vmlinux BTF: -ESRCH libbpf: failed to load object 'lock_contention_bpf' libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH Failed to load lock-contention BPF skeleton lock contention BPF setup failed # echo $? 0 Fix this by saving the return code for lock_contention_prepare so that command exits with proper return code. Similarly set the return code properly for two other functions in builtin-lock, namely setup_output_field() and select_key(). Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250110093730.93610-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-17perf hist: Fix width calculation in hpp__fmt()Dmitry Vyukov
hpp__width_fn() round up width to length of the field name, hpp__fmt() should do it too. Otherwise, the numbers may end up unaligned if the field name is long. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250108065949.235718-1-dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf hist: Fix bogus profiles when filters are enabledDmitry Vyukov
When a filtered column is not present in the sort order, profiles become arbitrary broken. Filtered and non-filtered entries are collapsed together, and the filtered-by field ends up with a random value (either from a filtered or non-filtered entry). If we end up with filtered entry/value, then the whole collapsed entry will be filtered out and will be missing in the profile. If we end up with non-filtered entry/value, then the overhead value will be wrongly larger (include some subset of filtered out samples). This leads to very confusing profiles. The problem is hard to notice, and if noticed hard to understand. If the filter is for a single value, then it can be fixed by adding the corresponding field to the sort order (provided user understood the problem). But if the filter is for multiple values, it's impossible to fix b/c there is no concept of binary sorting based on filter predicate (we want to group all non-filtered values in one bucket, and all filtered values in another). Examples of affected commands: perf report --tid=123 perf report --sort overhead,symbol --comm=foo,bar Fix this by considering filtered status as the highest priority sort/collapse predicate. As a side effect this effectively adds a new feature of showing profile where several lines are combined based on arbitrary filtering predicate. For example, showing symbols from binaries foo and bar combined together, but not from other binaries; or showing combined overhead of several particular threads. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/359dc444ce94d20e59d3a9e360c36fbeac833a04.1736927981.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf hist: Deduplicate cmp/sort/collapse codeDmitry Vyukov
Application of cmp/sort/collapse fmt callbacks is duplicated 6 times. Factor it into a common helper function. NFC. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/84c4b55614e24a344f86ae0db62e8fa8f251f874.1736927981.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf test: Improve verbose documentationIan Rogers
Add a little more detail on the output expectations for each verbose level. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250110045736.598281-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf test: Add a runs-per-test flagIan Rogers
To detect flakes it is useful to run tests more than once. Add a runs-per-test flag that will run each test multiple times. Example output: ``` $ perf test -r 3 lbr -v 122: perf record LBR tests : Ok 122: perf record LBR tests : Ok 122: perf record LBR tests : Ok ``` Update the documentation for the runs-per-test option. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250110045736.598281-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf test: Fix parallel/sequential option documentationIan Rogers
The parallel option was removed in commit 94d1a913bdc4 ("perf test: Make parallel testing the default"). Update the sequential documentation to reflect it isn't the default except for "exclusive" tests. Fixes: 94d1a913bdc4 ("perf test: Make parallel testing the default") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250110045736.598281-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf test: Send list output to stdout rather than stderrIan Rogers
Follow the workload listing in using stdout rather than stderr. Correct the numbering of sub-tests to be 1.1 rather than 1:1. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250110045736.598281-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf test: Rename functions and variables for better clarityIan Rogers
The relationship between subtests and test cases is somewhat confusing, so let's do away with the notion of sub-tests and switch to just working with some number of test cases. Add a test_suite__for_each_test_case as in many cases, except the special one test case situation, the iteration can just be on all test cases. Switch variable names to be more intention revealing of what their value is. This work was motivated by discussion with Kan where it was noted the code is becoming overly indented: https://lore.kernel.org/lkml/20241109160219.49976-1-irogers@google.com/ Unifying more of the sub-test/no-sub-tests avoids one level of indentation in a number of places. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250110045736.598281-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16perf tools: Expose quiet/verbose variables in Makefile.perfCharlie Jenkins
The variables to make builds silent/verbose live inside tools/build/Makefile.build. Move those variables to the top-level Makefile.perf to be generally available. Committer testing: See the SYSCALL lines, now they are consistent with the other operations in other lines: SYSTBL /tmp/build/perf-tools-next/arch/x86/include/generated/asm/syscalls_32.h SYSTBL /tmp/build/perf-tools-next/arch/x86/include/generated/asm/syscalls_64.h GEN /tmp/build/perf-tools-next/common-cmds.h GEN /tmp/build/perf-tools-next/arch/arm64/include/generated/asm/sysreg-defs.h PERF_VERSION = 6.13.rc2.g3d94bb6ed1d0 GEN perf-archive MKDIR /tmp/build/perf-tools-next/jvmti/ MKDIR /tmp/build/perf-tools-next/jvmti/ MKDIR /tmp/build/perf-tools-next/jvmti/ MKDIR /tmp/build/perf-tools-next/jvmti/ GEN perf-iostat CC /tmp/build/perf-tools-next/jvmti/libjvmti.o Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20250114-perf_make_test-v1-1-decc1c517b11@rivosinc.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-14perf config: Add a function to set one variable in .perfconfigArnaldo Carvalho de Melo
To allow for setting a variable from some other tool, like with the "wallclock" patchset needs to allow the user to opt-in to having that key in the sort order for 'perf report'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/lkml/Z4akewi7UPXpagce@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf test perftool_testsuite: Return correct value for skippingVeronika Molnarova
In 'perf test', a return value 2 represents that the test case was skipped. Fix this value for perftool_testsuite test cases to differentiate between skip and pass values. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250113182605.130719-3-vmolnaro@redhat.com Signed-off-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf test perftool_testsuite: Add missing descriptionVeronika Molnarova
Properly name the test cases of perftool_testsuite instead of the license being taken as the name for 'perf test'. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250113182605.130719-2-vmolnaro@redhat.com Signed-off-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf test record+probe_libc_inet_pton: Make test resilientLeo Yan
The test failed back and forth due to the call chain being heavily impacted by the libc, which varies across different architectures and distros. The libc contains the symbols for "gaih_inet" and "getaddrinfo" in some cases, but not always. Moreover, these symbols can be either normal symbols or dynamic symbols, making it difficult to decide the call chain entries due to the symbols are inconsistent. To fix the issue, this commit identifies three call chain entries are always present. These entries are matched by iterating through the lines in the "perf script" result. The recording attribute max-stack is set to 4 for the possible maximum call chain depth. After: # perf test -vF pton --- start --- Pattern: ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\) Matching: ping 285058 [025] 1219802.466939: probe_libc:inet_pton: (ffffa14b7cf0) Pattern: .*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib/aarch64-linux-gnu/libc-2.31.so|inlined\)$ Matching: ping 285058 [025] 1219802.466939: probe_libc:inet_pton: (ffffa14b7cf0) Matching: ffffa14b7cf0 __GI___inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc-2.31.so) Pattern: .*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$ Matching: ping 285058 [025] 1219802.466939: probe_libc:inet_pton: (ffffa14b7cf0) Matching: ffffa14b7cf0 __GI___inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc-2.31.so) Matching: ffffa1488040 getaddrinfo+0xe8 (/usr/lib/aarch64-linux-gnu/libc-2.31.so) Matching: aaaab8672da4 [unknown] (/usr/bin/ping) ---- end ---- 82: probe libc's inet_pton & backtrace it with ping : Ok Closes: https://lore.kernel.org/linux-perf-users/1728978807-81116-1-git-send-email-renyu.zj@linux.alibaba.com/ Closes: https://lore.kernel.org/linux-perf-users/Z0X3AYUWkAgfPpWj@x1/T/#m57327e135b156047e37d214a0d453af6ae1e02be Reported-by: Guilherme Amadio <amadio@gentoo.org> Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241202111958.553403-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf inject: Fix use without initialization of local variablesIan Rogers
Local variables were missing initialization and command line processing didn't provide default values. Fixes: 64eed019f3fce124 ("perf inject: Lazy build-id mmap2 event insertion") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241211060831.806539-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf probe: Rename err labelJames Clark
Rename err to out to avoid confusion because buf is still supposed to be freed in non error cases. Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241211085525.519458-3-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf test stat: Avoid hybrid assumption when virtualizedIan Rogers
The cycles event will fallback to task-clock in the hybrid test when running virtualized. Change the test to not fail for this. Fixes: 65d11821910bd910 ("perf test: Add a test for default perf stat command") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241212173354.9860-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf record: Fix segfault with --off-cpu when debuginfo is not enabledAthira Rajeev
When kernel is built without debuginfo, running 'perf record' with --off-cpu results in segfault as below: ./perf record --off-cpu -e dummy sleep 1 libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux libbpf: failed to find valid kernel BTF Segmentation fault (core dumped) The backtrace pointed to: #0 0x00000000100fb17c in btf.type_cnt () #1 0x00000000100fc1a8 in btf_find_by_name_kind () #2 0x00000000100fc38c in btf.find_by_name_kind () #3 0x00000000102ee3ac in off_cpu_prepare () #4 0x000000001002f78c in cmd_record () #5 0x00000000100aee78 in run_builtin () #6 0x00000000100af3e4 in handle_internal_command () #7 0x000000001001004c in main () Code sequence is: static void check_sched_switch_args(void) { struct btf *btf = btf__load_vmlinux_btf(); const struct btf_type *t1, *t2, *t3; u32 type_id; type_id = btf__find_by_name_kind(btf, "btf_trace_sched_switch", BTF_KIND_TYPEDEF); btf__load_vmlinux_btf() fails when CONFIG_DEBUG_INFO_BTF is not enabled. Here bpf__find_by_name_kind() calls btf__type_cnt() with NULL btf value and results in segfault. To fix this, add a check to see if btf is not NULL before invoking bpf__find_by_name_kind(). Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://lore.kernel.org/r/20241223135813.8175-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf tests base_probe: Fix check for the count of existing probes in ↵Athira Rajeev
test_adding_kernel perftool-testsuite_probe fails in test_adding_kernel as below: Regexp not found: "probe:inode_permission_11" -- [ FAIL ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (with force) (output regexp parsing) event syntax error: 'probe:inode_permission_11' \___ unknown tracepoint Error: File /sys/kernel/tracing//events/probe/inode_permission_11 not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?. The test does the following: 1) Adds a probe point first using: $CMD_PERF probe --add $TEST_PROBE 2) Then tries to add same probe again without —force and expects it to fail. Next tries to add same probe again with —force. In this case, perf probe succeeds and adds the probe with a suffix number. Example: ./perf probe --add inode_permission Added new event: probe:inode_permission (on inode_permission) ./perf probe --add inode_permission --force Added new event: probe:inode_permission_1 (on inode_permission) ./perf probe --add inode_permission --force Added new event: probe:inode_permission_2 (on inode_permission) Each time, suffix is added to existing probe name. To get the suffix number, test cases uses: NO_OF_PROBES=`$CMD_PERF probe -l | wc -l` This will work if there is no other probe existing in the system. If there are any other probes other than kernel probes or inode_permission, ( example: any probe), "perf probe -l" will include count for other probes too. Example, in the system where this failed, already some probes were default added. So count became 10 ./perf probe -l | wc -l 10 So to be specific for "inode_permission", restrict the probe count check to that probe point alone using: NO_OF_PROBES=`$CMD_PERF probe -l $TEST_PROBE| wc -l` Similarly while removing the probe using "probe --del *", (removing all probes), check uses: ../common/check_all_lines_matched.pl "Removed event: probe:$TEST_PROBE" But if there are other probes in the system, the log will contain reference to other existing probe too. Hence change usage of check_all_lines_matched.pl to check_all_patterns_found.pl This will make sure expecting string comes in the result Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Veronika Molnarova <vmolnaro@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250110094324.94604-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf MANIFEST: Add license filesMichel Lind
The standalone tarballs should include the license files - both the COPYING declaration as well as the text of GPLv2. Signed-off-by: Michel Lind <michel@michel-slm.name> Link: https://lore.kernel.org/r/Z0Zcx0WRqtlUYpgw@hyperscale.parallels Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14perf test brstack: Speed up running test by using tr -s instead of xargsJames Clark
The brstack test runs quite slowly in software models. Part of the reason is "xargs -n1" is quite inefficient in replacing spaces with newlines. While that's not noticeable on normal machines, it is on software models. Use "tr -s ' ' '\n'" instead which can do the same transformation, but is much faster. For comparison on an M1 Macbook Pro: $ time seq -s ' ' 10000 | xargs -n1 > /dev/null real 0m2.729s user 0m2.009s sys 0m0.914s $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null real 0m0.002s user 0m0.001s sys 0m0.001s The "grep '.'" is also needed to remove any remaining blank lines. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241213231312.2640687-2-robh@kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> [robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13perf tools mips: Fix mips syscall generationCharlie Jenkins
The mips syscall generation was still based on the old method. Delete the Makefile since it is no longer needed with the new method of generation. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Fixes: 619ffe669496a288 ("perf tools mips: Use generic syscall scripts") Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250110-perf_fix_mips-v1-1-4e661c3b710a@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13perf tests arm_spe: Add test for discard modeJames Clark
Add a test that checks that there were no AUX or AUXTRACE events recorded when discard mode is used. Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108142904.401139-6-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13perf tools arm-spe: Don't allocate buffer or tracking event in discard modeJames Clark
The buffer will never be written to so don't bother allocating it. The tracking event is also not required. Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108142904.401139-5-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13perf tools arm-spe: Pull out functions for aux buffer and tracking setupJames Clark
These won't be used in the next commit in discard mode, so put them in their own functions. No functional changes intended. Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108142904.401139-4-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf report: Fix misleading help message about --demangleJiachen Zhang
The wrong help message may mislead users. This commit fixes it. Fixes: 328ccdace8855289 ("perf report: Add --no-demangle option") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiachen Zhang <me@jcix.top> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250109152220.1869581-1-me@jcix.top Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf ftrace: Fix display for range of the first bucketNamhyung Kim
When min_latency is not given, it prints 0 - 0. It should be 0 - 1. Before: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 321 | ########### | ... After: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 699 | ############ | ... Fixes: 08b875b6bf608589 ("perf ftrace latency: Introduce --min-latency to narrow down into a latency range") Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108210015.1188531-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf ftrace: Check min/max latency only with bucket rangeNamhyung Kim
It's an optional feature and remains 0 when bucket range is not given. And it makes the histogram goes to the last entry always because any latency (num) is greater than or equal to 0. Before: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 0 | | 1 - 2 us | 0 | | 2 - 4 us | 0 | | 4 - 8 us | 0 | | 8 - 16 us | 0 | | 16 - 32 us | 0 | | 32 - 64 us | 0 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - 2 ms | 0 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 0 | | 256 - 512 ms | 0 | | 512 - 1024 ms | 0 | | 1 - ... s | 1353 | ############################################## | After: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 321 | ########### | 1 - 2 us | 132 | #### | 2 - 4 us | 202 | ####### | 4 - 8 us | 188 | ###### | 8 - 16 us | 16 | | 16 - 32 us | 12 | | 32 - 64 us | 30 | # | 64 - 128 us | 98 | ### | 128 - 256 us | 53 | # | 256 - 512 us | 57 | ## | 512 - 1024 us | 9 | | 1 - 2 ms | 9 | | 2 - 4 ms | 1 | | 4 - 8 ms | 98 | ### | 8 - 16 ms | 5 | | 16 - 32 ms | 7 | | 32 - 64 ms | 32 | # | 64 - 128 ms | 10 | | 128 - 256 ms | 10 | | 256 - 512 ms | 2 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | Fixes: 690a052a6d85c530 ("perf ftrace latency: Add --max-latency option") Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108210015.1188531-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf docs: arm_spe: Document new discard modeJames Clark
Document the flag along with PMU events to hint what it's used for and give an example with other useful options to get minimal output. Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250108142904.401139-3-james.clark@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2025-01-10perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarballArnaldo Carvalho de Melo
Needed to build tools/lib/bpf/ on various arches other than x86_64, notably arm64 when using the perf tarballs generated by: $ make help | grep perf- perf-tar-src-pkg - Build the perf source tarball with no compression perf-targz-src-pkg - Build the perf source tarball with gzip compression perf-tarbz2-src-pkg - Build the perf source tarball with bz2 compression perf-tarxz-src-pkg - Build the perf source tarball with xz compression perf-tarzst-src-pkg - Build the perf source tarball with zst compression $ Building with BPF support was opt-in in perf for a long time, and testing it via the tarball main kernel Makefile targets in an architecture other than x86_64 was an odd case. I had noticed this at some point earlier this year while cross building perf to some arches, including arm64, but it fell thru the cracks, see the Link tag below. Fix it now by adding those arch/*/include/uapi/asm/bpf_perf_event.h files to the MANIFEST file used in building the perf source tarball. Tested with: perfbuilder@number:~$ time dm debian:experimental-x-arm64 1 21.60 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 14.1.0-5) 14.1.0 flex 2.6.4 BUILD_TARBALL_HEAD=d31a974f6edc576f84c35be9526fec549a3b3520 $ $ git log --oneline -1 d31a974f6edc576f84c35be9526fec549a3b3520 d31a974f6edc576f (HEAD -> perf-tools-next) perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarball $ That was previously failing: perfbuilder@number:~$ grep debian:experimental-x-arm64 dm.log.old/summary 19 4.80 debian:experimental-x-arm64 : FAIL gcc version 14.1.0 (Debian 14.1.0-5) $ perfbuilder@number:~$ grep -B6 'Error 1' dm.log.old/debian:experimental-x-arm64 In file included from /git/perf-6.12.0-rc6/tools/include/uapi/linux/bpf_perf_event.h:11, from libbpf.c:36: /git/perf-6.12.0-rc6/tools/include/uapi/asm/bpf_perf_event.h:2:10: fatal error: ../../arch/arm64/include/uapi/asm/bpf_perf_event.h: No such file or directory 2 | #include "../../arch/arm64/include/uapi/asm/bpf_perf_event.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. make[4]: *** [/git/perf-6.12.0-rc6/tools/build/Makefile.build:105: /tmp/build/perf/libbpf/staticobjs/libbpf.o] Error 1 perfbuilder@number:~$ Closes: https://lore.kernel.org/all/Z0UNRCRYKunbDYxP@hyperscale.parallels Fixes: 9eea8fafe33eb708 ("libbpf: fix __arg_ctx type enforcement for perf_event programs") Reported-by: Michel Lind <michel@michel-slm.name> Tested-by: Michel Lind <michel@michel-slm.name> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: 317c11923cf676437456e44a7f408d4ce589a9c0.camel@michel-slm.name Link: https://lore.kernel.org/bpf/ZfyEgoG3JFiOs2Fs@x1/ Link: https://lore.kernel.org/r/Z0Yy5u42Q1hWoEzz@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf vendor events arm64: Add FUJITSU-MONAKA PMU eventYoshihiro Furudera
Add PMU events for FUJITSU-MONAKA. And, also updated common-and-microarch.json and recommended.json. FUJITSU-MONAKA Specification URL: https://github.com/fujitsu/FUJITSU-MONAKA Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Akio Kakuno <fj3333bs@aa.jp.fujitsu.com> Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20241217065751.1448755-1-fj5100bi@fujitsu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf tools: Fixup end address of modulesNamhyung Kim
In machine__create_module(), it reads /proc/modules to get a list of modules in the system. The file shows the start address (of text) and the size of the module so it uses the info to reconstruct system memory maps for symbol resolution. But module memory consists of multiple segments and they can be scaterred. Currently perf tools assume they are contiguous and see some overlaps. This can confuse the tool when it finds a map containing a given address. As we mostly care about the function symbols in the text segment, it can fixup the size or end address of modules when there's an overlap. We can use maps__fixup_end() which updates the end address using the start address of the next map. Ideally it should be able to track other segments (like data/rodata), but that would require some changes in /proc/modules IMHO. Reported-by: Blake Jones <blakejones@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20241218220453.203069-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf symbol: Prefer non-label symbols with same addressNamhyung Kim
When there are more than one symbols at the same address, it needs to choose which one is better. In choose_best_symbol() it didn't check the type of symbols. It's possible to have labels in other symbols and in that case, it would be better to pick the actual symbol over the labels. To minimize the possible impact on other symbols, I only check NOTYPE symbols specifically. $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start 105089: ffffffff82000000 814 FUNC GLOBAL DEFAULT 1 __do_softirq 111954: ffffffff82000000 0 NOTYPE GLOBAL DEFAULT 1 __softirqentry_text_start The commit 77b004f4c5c3c90b tried to do the same by not giving the size to the label symbols but it seems there's some label-only symbols in asm code. Let's restore the original code and choose the right symbol using type of the symbols. Fixes: 77b004f4c5c3c90b ("perf symbol: Do not fixup end address of labels") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/Z3b-DqBMnNb4ucEm@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf symbol-elf: Avoid a weak cxx_demangle_sym functionIan Rogers
cxx_demangle_sym is weak in case demangle-cxx.c replaces the definition in symbol-elf.c. When demangle-cxx.c is built HAVE_CXA_DEMANGLE_SUPPORT is defined, as such the define can be used to avoid a weak symbol. As weak symbols are outside of the C standard their use can lead to strange behaviors, in particular with LTO, as well as causing issues to be hidden at link time. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119031754.1021858-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf trace: Fix unaligned access for augmented argsNamhyung Kim
Some version of compilers reported unaligned accesses in perf trace when undefined-behavior sanitizer is on. I found that it uses raw data in the sample directly and assuming it's properly aligned. Unlike other sample fields, the raw data is not 8-byte aligned because there's a size field (u32) before the actual data. So I added a static buffer in syscall__augmented_args() and return it instead. This is not ideal but should work well as perf trace is single-threaded. A better approach would be aligning the raw data by adding a 4-byte data before the augmented args but I'm afraid it'd break the backward compatibility. Committer testing: To build with the undefined behaviour sanitizer: $ make CC=clang EXTRA_CFLAGS=-fsanitize=undefined -C tools/perf Checking if the resulting binary is instrumented: root@number:~# nm ~/bin/perf | grep ubsan | wc -l 113 root@number:~# nm ~/bin/perf | grep ubsan | tail -5 000000000043d5b0 t _ZN7__ubsanL19UBsanOnDeadlySignalEiPvS0_ 000000000043ce50 T _ZNK7__ubsan5Value12getSIntValueEv 000000000043cf40 T _ZNK7__ubsan5Value12getUIntValueEv 000000000043d140 T _ZNK7__ubsan5Value13getFloatValueEv 000000000043cfd0 T _ZNK7__ubsan5Value19getPositiveIntValueEv root@number:~# Now running something that will access timespec, as reported in the Closes URL: root@number:~# perf trace --max-events=1 -e *nano* sleep 1.1 trace/beauty/timespec.c:10:64: runtime error: member access within misaligned address 0x7fc583cfb2a4 for type 'struct augmented_arg', which requires 8 byte alignment 0x7fc583cfb2a4: note: pointer points here 99 99 11 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 01 e1 f5 05 00 00 00 00 00 00 00 00 ^ SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior trace/beauty/timespec.c:10:64 <SNIP> As Namhyung said we need to make the raw_data to be 64-bit aligned, probably we need to add a PERF_SAMPLE_ALIGNED_RAW with a 64-bit raw_size instead of the current u32 done at kernel/events/core.c, perf_output_sample(), that perf_output_put(handle, raw->size) where raw->size is an u32 and then the raw_data is always 64-bit unaligned... After the patch: root@number:~# perf trace -e *nano* sleep 1.1 0.000 (1100.064 ms): sleep/1984224 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 100000001 }, rmtp: 0x7fff5b3fe970) = 0 root@number:~# Closes: https://lore.kernel.org/r/Z2STgyD1p456Qqhg@google.com Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250102201248.790841-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf test: Mark remaining probe tests as exclusiveJames Clark
Probes are global and other probe tests are already exclusive. These two tests can throw warnings when run at the same time so mark them as exclusive too: $ perf test -vvv 81 79 79: perftool-testsuite_probe: --- start --- test child forked, pid 46419 ../common/init.sh: line 137: /sys/kernel/debug/tracing/uprobe_events: Device or resource busy Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Link: https://lore.kernel.org/r/20250107165933.292225-1-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf tools: Remove dependency on libauditCharlie Jenkins
All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is no longer needed. With the removal of the flag, the related GENERIC_SYSCALL_TABLE can also be removed. libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT was not defined, so libaudit is also no longer needed for any architecture. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf tools s390: Use generic syscall table scriptsCharlie Jenkins
Use the generic scripts to generate headers from the syscall table instead of the custom ones for s390. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-15-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf tools powerpc: Use generic syscall table scriptsCharlie Jenkins
Use the generic scripts to generate headers from the syscall table instead of the custom ones for powerpc. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-14-7543b5293098@rivosinc.com Link: https://lore.kernel.org/lkml/20250110100505.78d81450@canb.auug.org.au [ Stephen Rothwell noticed on linux-next that the powerpc build for perf was broken and ...] Link: https://lore.kernel.org/lkml/20250109-perf_powerpc_spu-v1-1-c097fc43737e@rivosinc.com [ ... Charlie fixed it up and asked for it to be squashed to avoid breaking bisection. ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools mips: Use generic syscall scriptsCharlie Jenkins
Use the generic scripts to generate headers from the syscall table for mips. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-13-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools loongarch: Use syscall tableCharlie Jenkins
loongarch uses a syscall table, use that in perf instead of using unistd.h. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-12-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools arm64: Use syscall tableCharlie Jenkins
arm64 uses a syscall table, use that in perf instead of using unistd.h. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-11-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools parisc: Support syscall headerCharlie Jenkins
parisc uses a syscall table, use that in perf instead of requiring libaudit. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-10-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools alpha: Support syscall headerCharlie Jenkins
alpha uses a syscall table, use that in perf instead of requiring libaudit. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-9-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools x86: Use generic syscall scriptsCharlie Jenkins
Use the generic scripts to generate headers from the syscall table for both 32- and 64-bit x86. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-8-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools xtensa: Support syscall headerCharlie Jenkins
xtensa uses a syscall table, use that in perf instead of requiring libaudit. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-7-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09perf tools sparc: Support syscall headersCharlie Jenkins
sparc uses a syscall table, use that in perf instead of requiring libaudit. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-6-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>