Age | Commit message (Collapse) | Author |
|
Conflicts:
tools/arch/x86/include/asm/cpufeatures.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since we added --off-cpu-thresh, add tests for when a sample's off-cpu
time is above the threshold, and when it's below the threshold.
Note that the basic test performed in test_offcpu_basic() collects a
direct sample now, since sleep 1 has duration of 1000ms, higher than the
default value of --off-cpu-thresh of 500ms, resulting in a direct
sample.
An example:
$ sudo perf test offcpu
124: perf record offcpu profiling tests : Ok
$
Committer testing:
root@number:~# perf test offcpu
126: perf record offcpu profiling tests : Ok
root@number:~# perf test -v offcpu
126: perf record offcpu profiling tests : Ok
root@number:~# perf test -vv offcpu
126: perf record offcpu profiling tests:
--- start ---
test child forked, pid 1410791
Checking off-cpu privilege
Basic off-cpu test
Basic off-cpu test [Success]
Child task off-cpu test
Child task off-cpu test [Success]
Threshold test (above threshold)
Threshold test (above threshold) [Success]
Threshold test (below threshold)
Threshold test (below threshold) [Success]
---- end(0) ----
126: perf record offcpu profiling tests : Ok
root@number:~#
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501022809.449767-11-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Specify the threshold for dumping offcpu samples with --off-cpu-thresh,
the unit is milliseconds. Default value is 500ms.
Example:
perf record --off-cpu --off-cpu-thresh 824
The example above collects direct off-cpu samples where the off-cpu time
is longer than 824ms.
Committer testing:
After commenting out the end off-cpu dump to have just the ones that are
added right after the task is scheduled back, and using a threshould of
1000ms, we see some periods (the 5th column, just before "offcpu-time"
in the 'perf script' output) that are over 1000.000.000 nanoseconds:
root@number:~# perf record --off-cpu --off-cpu-thresh 10000
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.902 MB perf.data (34335 samples) ]
root@number:~# perf script
<SNIP>
Isolated Web Co 59932 [028] 63839.594437: 1000049427 offcpu-time:
7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fe63c78c04c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
7fe63c78e928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
5599974a9fe7 mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&)+0xe7 (/usr/lib64/fir>
100000000 [unknown] ([unknown])
swapper 0 [025] 63839.594459: 195724 cycles:P: ffffffffac328270 read_tsc+0x0 ([kernel.kallsyms])
Isolated Web Co 59932 [010] 63839.594466: 1000055278 offcpu-time:
7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fe63c78ba24 __syscall_cancel+0x14 (/usr/lib64/libc.so.6)
7fe63c804c4e __poll+0x1e (/usr/lib64/libc.so.6)
7fe633b0d1b8 PollWrapper(_GPollFD*, unsigned int, int) [clone .lto_priv.0]+0xf8 (/usr/lib64/firefox/libxul.so)
10000002c [unknown] ([unknown])
swapper 0 [027] 63839.594475: 134433 cycles:P: ffffffffad4c45d9 irqentry_enter+0x19 ([kernel.kallsyms])
swapper 0 [028] 63839.594499: 215838 cycles:P: ffffffffac39199a switch_mm_irqs_off+0x10a ([kernel.kallsyms])
MediaPD~oder #1 1407676 [027] 63839.594514: 134433 cycles:P: 7f982ef5e69f dct_IV(int*, int, int*)+0x24f (/usr/lib64/libfdk-aac.so.2.0.0)
swapper 0 [024] 63839.594524: 267411 cycles:P: ffffffffad4c6ee6 poll_idle+0x56 ([kernel.kallsyms])
MediaSu~sor #75 1093827 [026] 63839.594555: 332652 cycles:P: 55be753ad030 moz_xmalloc+0x200 (/usr/lib64/firefox/firefox)
swapper 0 [027] 63839.594616: 160548 cycles:P: ffffffffad144840 menu_select+0x570 ([kernel.kallsyms])
Isolated Web Co 14019 [027] 63839.595120: 1000050178 offcpu-time:
7fc9537cc6c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fc9537c104c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
7fc9537c3928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
7fc95372a3c8 pt_TimedWait+0xb8 (/usr/lib64/libnspr4.so)
7fc95372a8d8 PR_WaitCondVar+0x68 (/usr/lib64/libnspr4.so)
7fc94afb1f7c WatchdogMain(void*)+0xac (/usr/lib64/firefox/libxul.so)
7fc947498660 [unknown] ([unknown])
7fc9535fce88 [unknown] ([unknown])
7fc94b620e60 WatchdogManager::~WatchdogManager()+0x0 (/usr/lib64/firefox/libxul.so)
fff8548387f8b48 [unknown] ([unknown])
swapper 0 [003] 63839.595712: 212948 cycles:P: ffffffffacd5b865 acpi_os_read_port+0x55 ([kernel.kallsyms])
<SNIP>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-2-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-10-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
BPF's stack trace map
Dump the remaining PERF_SAMPLE_ data, as if it is dumping a direct
sample.
Put the stack trace, tid, off-cpu time and cgroup id into the raw_data
section, just like a direct off-cpu sample coming from BPF's
bpf_perf_event_output().
This ensures that evsel__parse_sample() correctly parses both direct
samples and accumulated samples.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-10-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-9-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
No PERF_SAMPLE_CALLCHAIN in sample_type, but 'perf script' needs to
display a callchain, have to specify manually.
Also, prefer displaying a callchain:
gvfs-afc-volume 2267 [001] 3829232.955656: 1001115340 offcpu-time:
77f05292603f __pselect+0xbf (/usr/lib/x86_64-linux-gnu/libc.so.6)
77f052a1801c [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
77f052a18d45 [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
77f05289ca94 start_thread+0x384 (/usr/lib/x86_64-linux-gnu/libc.so.6)
77f052929c3c clone3+0x2c (/usr/lib/x86_64-linux-gnu/libc.so.6)
to a raw binary BPF output:
BPF output: 0000: dd 08 00 00 db 08 00 00 <DD>...<DB>...
0008: cc ce ab 3b 00 00 00 00 <CC>Ϋ;....
0010: 06 00 00 00 00 00 00 00 ........
0018: 00 fe ff ff ff ff ff ff .<FE><FF><FF><FF><FF><FF><FF>
0020: 3f 60 92 52 f0 77 00 00 ?`.R<F0>w..
0028: 1c 80 a1 52 f0 77 00 00 ..<A1>R<F0>w..
0030: 45 8d a1 52 f0 77 00 00 E.<A1>R<F0>w..
0038: 94 ca 89 52 f0 77 00 00 .<CA>.R<F0>w..
0040: 3c 9c 92 52 f0 77 00 00 <..R<F0>w..
0048: 00 00 00 00 00 00 00 00 ........
0050: 00 00 00 00 ....
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-9-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-8-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There is a check in evsel.c that does this:
if (evsel__is_offcpu_event(evsel))
evsel->core.attr.sample_type &= OFFCPU_SAMPLE_TYPES;
This along with:
#define OFFCPU_SAMPLE_TYPES (PERF_SAMPLE_IDENTIFIER | PERF_SAMPLE_IP | \
PERF_SAMPLE_TID | PERF_SAMPLE_TIME | \
PERF_SAMPLE_ID | PERF_SAMPLE_CPU | \
PERF_SAMPLE_PERIOD | PERF_SAMPLE_CALLCHAIN | \
PERF_SAMPLE_CGROUP)
will tell perf_event to collect callchain.
We don't need the callchain from perf_event when collecting off-cpu
samples, because it's prev's callchain, not next's callchain.
(perf_event) (task_storage) (needed)
prev next
| |
---sched_switch---->
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-8-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-7-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use the data in bpf-output samples, to assemble off-cpu samples.
In evsel__is_offcpu_event(), check if sample_type is PERF_SAMPLE_RAW to
support off-cpu sample data created by an older version of perf.
Testing compatibility on off-cpu samples collected by perf before this patch series:
See below, the sample_type still uses PERF_SAMPLE_CALLCHAIN
$ perf script --header -i ./perf.data.ptn | grep "event : name = offcpu-time"
# event : name = offcpu-time, , id = { 237917, 237918, 237919, 237920 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1
The output is correct.
$ perf script -i ./perf.data.ptn | grep offcpu-time
gmain 2173 [000] 18446744069.414584: 100102015 offcpu-time:
NetworkManager 901 [000] 18446744069.414584: 5603579 offcpu-time:
Web Content 1183550 [000] 18446744069.414584: 46278 offcpu-time:
gnome-control-c 2200559 [000] 18446744069.414584: 11998247014 offcpu-time:
<SNIP>
$
And after this patch series:
$ perf script --header -i ./perf.data.off-cpu-v9 | grep "event : name = offcpu-time"
# event : name = offcpu-time, , id = { 237959, 237960, 237961, 237962 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1
$ ./perf script -i ./perf.data.off-cpu-v9 | grep offcpu-time
gnome-shell 1875 [001] 4789616.361225: 100097057 offcpu-time:
gnome-shell 1875 [001] 4789616.461419: 100107463 offcpu-time:
firefox 2206821 [002] 4789616.475690: 255257245 offcpu-time:
$
Committer testing:
The command to record those samples:
root@number:~# perf record --off-cpu -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.092 MB perf.data (1552 samples) ]
root@number:~#
Then, before this patch series, the sample_type for the "offcpu-time" event is:
root@number:~# perf evlist -v | grep offcpu-time
offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, sample_id_all: 1
root@number:~#
And after it, after recording it again:
root@number:~# perf record --off-cpu -a sleep 1 ; perf evlist -v | grep offcpu-time
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.151 MB perf.data (2843 samples) ]
offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, sample_id_all: 1
root@number:~#
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-7-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-6-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Collect tid, period, callchain, and cgroup id and dump them when off-cpu
time threshold is reached.
We don't collect the off-cpu time twice (the delta), it's either in
direct samples, or accumulated samples that are dumped at the end of
perf.data.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-6-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-5-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Set the perf_event map in BPF for dumping off-cpu samples, and set the
offcpu_thresh to specify the threshold.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-5-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-4-howardchu95@gmail.com
[ Added some missing iteration variables to off_cpu_config() and fixed up
a manually edited patch hunk line boundary line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Parse the off-cpu event using parse_event(), as bpf-output.
Call evlist__enable_evsel() on off-cpu event. This fixes the inability
to collect direct off-cpu samples on a workload, as reported by Arnaldo
Carvalho de Melo <acme@redhat.com>.
The reason being, workload sets enable_on_exec instead of calling
evlist__enable(), but off-cpu event does not attach to an executable and
execve won't be called, so the fds from perf_event_open() are not
enabled.
no-inherit should be set to 1, here's the reason:
We update the BPF perf_event map for direct off-cpu sample dumping (in
following patches), it executes as follows:
bpf_map_update_value()
bpf_fd_array_map_update_elem()
perf_event_fd_array_get_ptr()
perf_event_read_local()
In perf_event_read_local(), there is:
int perf_event_read_local(struct perf_event *event, u64 *value,
u64 *enabled, u64 *running)
{
...
/*
* It must not be an event with inherit set, we cannot read
* all child counters from atomic context.
*/
if (event->attr.inherit) {
ret = -EOPNOTSUPP;
goto out;
}
Which means no-inherit has to be true for updating the BPF perf_event
map.
Moreover, for bpf-output events, we primarily want a system-wide event
instead of a per-task event.
The reason is that in BPF's bpf_perf_event_output(), BPF uses the CPU
index to retrieve the perf_event file descriptor it outputs to.
Making a bpf-output event system-wide naturally satisfies this
requirement by mapping CPU appropriately.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-4-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-3-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Expose evsel__is_offcpu_event() so it can be used in off_cpu_config(),
evsel__parse_sample() and 'perf script'.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-3-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add the -b/ --buckets argument to specify the number of hash buckets for
the private futex hash. This is directly passed to
prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, buckets, immutable)
and must return without an error if specified. The `immutable' is 0 by
default and can be set to 1 via the -I/ --immutable argument.
The size of the private hash is verified with PR_FUTEX_HASH_GET_SLOTS.
If PR_FUTEX_HASH_GET_SLOTS failed then it is assumed that an older
kernel was used without the support and that the global hash is used.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250416162921.513656-20-bigeasy@linutronix.de
|
|
Running the "perf script task-analyzer tests" with address sanitizer
showed a double free:
```
FAIL: "test_csv_extended_times" Error message: "Failed to find required string:'Out-Out;'."
=================================================================
==19190==ERROR: AddressSanitizer: attempting double-free on 0x50b000017b10 in thread T0:
#0 0x55da9601c78a in free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
#1 0x55da96640c63 in filename__read_build_id tools/perf/util/symbol-minimal.c:221:2
0x50b000017b10 is located 0 bytes inside of 112-byte region [0x50b000017b10,0x50b000017b80)
freed by thread T0 here:
#0 0x55da9601ce40 in realloc (perf+0x260e40) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
#1 0x55da96640ad6 in filename__read_build_id tools/perf/util/symbol-minimal.c:204:10
previously allocated by thread T0 here:
#0 0x55da9601ca23 in malloc (perf+0x260a23) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
#1 0x55da966407e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:181:9
SUMMARY: AddressSanitizer: double-free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a) in free
==19190==ABORTING
FAIL: "invocation of perf script report task-analyzer --csv-summary csvsummary --summary-extended command failed" Error message: ""
FAIL: "test_csvsummary_extended" Error message: "Failed to find required string:'Out-Out;'."
---- end(-1) ----
132: perf script task-analyzer tests : FAILED!
```
The buf_size if always set to phdr->p_filesz, but that may be 0
causing a free and realloc to return NULL. This is treated in
filename__read_build_id like a failure and the buffer is freed again.
To avoid this problem only grow buf, meaning the buf_size will never
be 0. This also reduces the number of memory (re)allocations.
Fixes: b691f64360ecec49 ("perf symbols: Implement poor man's ELF parser")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501070003.22251-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a breakdown of perf_mem_data_src.mem_dtlb values. It assumes
PMU drivers would set PERF_MEM_TLB_HIT bit with an appropriate level.
And having PERF_MEM_TLB_MISS means that it failed to find one in any
levels of TLB. For now, it doesn't use PERF_MEM_TLB_{WK,OS} bits.
Also it seems Intel machines don't distinguish L1 or L2 precisely. So I
added ANY_HIT (printed as "L?-Hit") to handle the case.
$ perf mem report -F overhead,dtlb,dso --stdio
...
# --- D-TLB ----
# Overhead L?-Hit Miss Shared Object
# ........ .............. .................
#
67.03% 99.5% 0.5% [unknown]
31.23% 99.2% 0.8% [kernel.kallsyms]
1.08% 97.8% 2.2% [i915]
0.36% 100.0% 0.0% [JIT] tid 6853
0.12% 100.0% 0.0% [drm]
0.05% 100.0% 0.0% [drm_kms_helper]
0.05% 100.0% 0.0% [ext4]
0.02% 100.0% 0.0% [aesni_intel]
0.02% 100.0% 0.0% [crc32c_intel]
0.02% 100.0% 0.0% [dm_crypt]
...
Committer testing:
# perf report --header | grep cpudesc
# cpudesc : AMD Ryzen 9 9950X3D 16-Core Processor
# perf mem report -F overhead,dtlb,dso --stdio | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 2K of event 'cycles:P'
# Total weight : 2637
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
#
# ---------- D-TLB -----------
# Overhead L1-Hit L2-Hit Miss Other Shared Object
# ........ ............................ .................................
#
77.47% 18.4% 0.1% 0.6% 80.9% [kernel.kallsyms]
5.61% 36.5% 0.7% 1.4% 61.5% libxul.so
2.77% 39.7% 0.0% 12.3% 47.9% libc.so.6
2.01% 34.0% 1.9% 1.9% 62.3% libglib-2.0.so.0.8400.1
1.93% 31.4% 2.0% 2.0% 64.7% [amdgpu]
1.63% 48.8% 0.0% 0.0% 51.2% [JIT] tid 60168
1.14% 3.3% 0.0% 0.0% 96.7% [vdso]
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a breakdown of perf_mem_data_src.mem_snoop values. For now, it
doesn't use mem_snoopx values like FWD and PEER.
$ perf mem report -F overhead,snoop,comm --stdio
...
# ---------- Snoop -----------
# Overhead Hit HitM Miss Other Command
# ........ ............................ ...............
#
34.24% 0.6% 0.0% 0.0% 99.4% gnome-shell
12.02% 1.0% 0.0% 0.0% 99.0% chrome
9.32% 1.0% 0.0% 0.3% 98.7% Isolated Web Co
6.85% 1.0% 0.3% 0.0% 98.6% swapper
6.30% 0.8% 0.8% 0.0% 98.5% Xorg
3.02% 2.4% 0.0% 0.0% 97.6% VizCompositorTh
2.35% 0.0% 0.0% 0.0% 100.0% firefox-esr
2.04% 0.0% 0.0% 0.0% 100.0% JS Helper
1.51% 3.2% 0.0% 0.0% 96.8% threaded-ml
1.44% 0.0% 0.0% 0.0% 100.0% AudioIP~allback
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a breakdown of perf_mem_data_src.mem_lvl_num. But it's also
divided into two parts because the combination is bigger than 8.
Since there are many entries for different cache levels, 'cache' field
focuses on them. I generalized buffers like LFB, MAB and MHB to L1-buf
and L2-buf.
The rest goes to 'memory' field which can be RAM, CXL, PMEM, IO, etc.
$ perf mem report -F cache,mem,dso --stdio
...
#
# -------------- Cache -------------- --- Memory ---
# L1 L2 L3 L1-buf Other RAM Other Shared Object
# ................................... .............. ....................................
#
53.9% 3.6% 16.2% 21.6% 4.8% 4.8% 95.2% [kernel.kallsyms]
64.7% 1.7% 3.5% 17.4% 12.8% 12.8% 87.2% chrome (deleted)
78.3% 2.8% 0.0% 1.0% 17.9% 17.9% 82.1% libc.so.6
39.6% 1.5% 0.0% 5.7% 53.2% 53.2% 46.8% libxul.so
26.2% 0.0% 0.0% 0.0% 73.8% 73.8% 26.2% [unknown]
85.5% 0.0% 0.0% 14.5% 0.0% 0.0% 100.0% libspa-audioconvert.so
66.3% 4.4% 0.0% 29.4% 0.0% 0.0% 100.0% libglib-2.0.so.0.8200.1 (deleted)
1.9% 0.0% 0.0% 0.0% 98.1% 98.1% 1.9% libmutter-cogl-15.so.0.0.0 (deleted)
10.6% 0.0% 0.0% 89.4% 0.0% 0.0% 100.0% libpulsecommon-16.1.so
0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 100.0% libfreeblpriv3.so (deleted)
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some mem_stat types don't use all 8 columns. And there are cases only
samples in certain kinds of mem_stat types are available only. For that
case hide columns which has no samples.
The new output for the previous data would be:
$ perf mem report -F overhead,op,comm --stdio
...
# ------ Mem Op -------
# Overhead Load Store Other Command
# ........ ..................... ...............
#
44.85% 21.1% 30.7% 48.3% swapper
26.82% 98.8% 0.3% 0.9% netsli-prober
7.19% 51.7% 13.7% 34.6% perf
5.81% 89.7% 2.2% 8.1% qemu-system-ppc
4.77% 100.0% 0.0% 0.0% notifications_c
1.77% 95.9% 1.2% 3.0% MemoryReleaser
0.77% 71.6% 4.1% 24.3% DefaultEventMan
0.19% 66.7% 22.2% 11.1% gnome-shell
...
On Intel machines, the event is only for loads or stores so it'll have
only one column:
# Mem Op
# Overhead Load Command
# ........ ....... ...............
#
20.55% 100.0% swapper
17.13% 100.0% chrome
9.02% 100.0% data-loop.0
6.26% 100.0% pipewire-pulse
5.63% 100.0% threaded-ml
5.47% 100.0% GraphRunner
5.37% 100.0% AudioIP~allback
5.30% 100.0% Chrome_ChildIOT
3.17% 100.0% Isolated Web Co
...
Committer testing:
# grep "model name" -m1 /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processo
# perf mem report -F overhead,op,comm --stdio
# Total Lost Samples: 0
#
# Samples: 2K of event 'cycles:P'
# Total weight : 2637
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
#
# ------ Mem Op -------
# Overhead Load Store Other Command
# ........ ..................... ...............
#
61.02% 14.4% 25.5% 60.1% swapper
5.61% 26.4% 13.5% 60.1% Isolated Web Co
5.50% 21.4% 29.7% 49.0% perf
4.74% 27.2% 15.2% 57.6% gnome-shell
4.63% 33.6% 11.5% 54.9% mdns_service
4.29% 28.3% 12.4% 59.3% ptyxis
2.16% 24.6% 19.3% 56.1% DOM Worker
0.99% 23.1% 34.6% 42.3% firefox
0.72% 26.3% 15.8% 57.9% IPC I/O Parent
0.61% 12.5% 12.5% 75.0% kworker/u130:20
0.61% 37.5% 18.8% 43.8% podman
0.57% 33.3% 6.7% 60.0% Timer
0.53% 14.3% 7.1% 78.6% KMS thread
0.49% 30.8% 7.7% 61.5% kworker/u130:3-
0.46% 41.7% 33.3% 25.0% IPDL Background
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is an actual example of the he_mem_stat based sample breakdown. It
uses 'mem_op' field of union perf_mem_data_src which means memory
operations.
It'd have basically 'load' or 'store' which can be useful if PMU doesn't
have separate events for them like IBS or SPE. In addition, there's an
entry in case load and store happen at the same time. Also adds entries
for prefetching and execution.
$ perf mem report -F +op -s comm --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'ibs_op//'
# Total weight : 9559
# Sort order : comm
#
# --------------------- Mem Op ----------------------
# Overhead Samples Load Store Ld+St Pfetch Exec Other N/A N/A Command
# ........ ....... ................................................... ...............
#
44.85% 4077 21.1% 30.7% 0.0% 0.0% 0.0% 48.3% 0.0% 0.0% swapper
26.82% 45 98.8% 0.3% 0.0% 0.0% 0.0% 0.9% 0.0% 0.0% netsli-prober
7.19% 442 51.7% 13.7% 0.0% 0.0% 0.0% 34.6% 0.0% 0.0% perf
5.81% 75 89.7% 2.2% 0.0% 0.0% 0.0% 8.1% 0.0% 0.0% qemu-system-ppc
4.77% 1 100.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% notifications_c
1.77% 10 95.9% 1.2% 0.0% 0.0% 0.0% 3.0% 0.0% 0.0% MemoryReleaser
0.77% 32 71.6% 4.1% 0.0% 0.0% 0.0% 24.3% 0.0% 0.0% DefaultEventMan
0.19% 10 66.7% 22.2% 0.0% 0.0% 0.0% 11.1% 0.0% 0.0% gnome-shell
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a preparation for later changes to support mem_stat output. The
new fields will need two lines for the header - the first line will show
type of mem stat and the second line will show the name of each item
which is returned by mem_stat_name().
Each element in the mem_stat array will be printed in percentage for the
hist_entry and their sum would be 100%.
Add new output field dimension only for SORT_MODE__MEM using mem_stat.
To handle possible name conflict with existing sort keys, move the order
of checking output field dimensions after the sort dimensions when it
looks for sort keys.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a logic to account he->mem_stat based on mem_stat_type in hists.
Each mem_stat entry will have different meaning based on the type so the
index in the array is calculated at runtime using the corresponding
value in the sample.data_src.
Still hists has no mem_stat_types yet so this code won't work for now.
Later hists->mem_stat_types will be allocated based on what users want
in the output actually.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'struct he_mem_stat' is to save detailed information about memory
instruction. It'll be used to show breakdown of various data from
PERF_SAMPLE_DATA_SRC. Note that this structure is generic and the
contents will be different depending on actual data it'll use later.
The information about the actual data will be saved in 'struct hists'
and its length is in nr_mem_stats. This commit just adds ground works
and does nothing since hists->nr_mem_stats is 0 for now.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a preparation to support multi-line headers in 'perf mem report'.
Normal sort keys and output fields that don't have contents for multi-
line will print the header string at the last line only.
As we don't use multi-line headers normally, it should not have any
changes in the output.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's no way to enable PERF_SAMPLE_DATA_SRC without PERF_SAMPLE_ADDR
which brings a lot of overhead due to the number of MMAP[2] records.
Let's add a new option to enable this information separately.
Committer testing:
# perf record -a --sample-mem-info
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.815 MB perf.data (2637 samples) ]
#
# perf evlist -v
cycles:P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 2, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
#
# perf report -D |& grep -w PERF_RECORD_SAMPLE -A3 -m1
0 44675164447282 0x1a7590 [0x40]: PERF_RECORD_SAMPLE(IP, 0x4001): 107299/107299: 0xffffffffac4a5e11 period: 144 addr: 0
. data_src: 0x229080142
... thread: perf:107299
...... dso: /lib/modules/6.15.0-rc4+/build/vmlinux
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When it removes an output format for cancelled children or latency, it
should delete itself from the sort list as well. Otherwise assertion
in fmt_free() will fire.
$ perf report -H --stdio
perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
Aborted (core dumped)
Also convert to perf_hpp__column_unregister() for the same open codes.
Committer notes:
Before this patch:
# perf test hierarchy
83: perf report --hierarchy : FAILED!
# perf test -v hierarchy
--- start ---
test child forked, pid 102242
perf report --hierarchy
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
/home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted (core dumped) perf report --hierarchy > /dev/null
--- Cleaning up ---
---- end(-1) ----
83: perf report --hierarchy : FAILED!
#
After:
# perf test hierarchy
83: perf report --hierarchy : Ok
#
Fixes: dbd11b6bdab12f60 ("perf hist: Remove formats in hierarchy when cancel children")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Super simple test to check that at least we're not segfaulting when
trying to use 'perf report --hierarchy', more subtests should be added
to make sure the output is the expected one.
This is being merged right before a fix for that that this test detects:
# perf test hierarchy
83: perf report --hierarchy : FAILED!
# perf test -v hierarchy
--- start ---
test child forked, pid 102242
perf report --hierarchy
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
/home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted (core dumped) perf report --hierarchy > /dev/null
--- Cleaning up ---
---- end(-1) ----
83: perf report --hierarchy : FAILED!
#
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/lkml/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
IBS Fetch and IBS Op PMUs has various constraints on supported sample
periods. Add perf unit tests to test those.
Running it in parallel with other tests causes intermittent failures.
Mark it exclusive to force it to run sequentially. Sample output on a
Zen5 machine:
Without kernel fixes:
$ sudo ./perf test -vv 112
112: AMD IBS sample period:
--- start ---
test child forked, pid 8774
Using CPUID AuthenticAMD-26-2-1
IBS config tests:
-----------------
Fetch PMU tests:
0xffff : Ok (nr samples: 1078)
0x1000 : Ok (nr samples: 17030)
0xff : Ok (nr samples: 41068)
0x1 : Ok (nr samples: 40543)
0x0 : Ok
0x10000 : Ok
Op PMU tests:
0x0 : Ok
0x1 : Fail
0x8 : Fail
0x9 : Ok (nr samples: 40543)
0xf : Ok (nr samples: 40543)
0x1000 : Ok (nr samples: 18736)
0xffff : Ok (nr samples: 1168)
0x10000 : Ok
0x100000 : Fail (nr samples: 14)
0xf00000 : Fail (nr samples: 1)
0xf0ffff : Fail (nr samples: 1)
0x1f0ffff : Fail (nr samples: 1)
0x7f0ffff : Fail (nr samples: 0)
0x8f0ffff : Ok
0x17f0ffff : Ok
IBS sample period constraint tests:
-----------------------------------
Fetch PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Fail
freq 0, sample_freq 15: Fail
freq 0, sample_freq 16: Ok (nr samples: 1604)
freq 0, sample_freq 17: Ok (nr samples: 1604)
freq 0, sample_freq 143: Ok (nr samples: 1604)
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1566)
freq 0, sample_freq 4103: Ok (nr samples: 1119)
freq 0, sample_freq 65520: Ok (nr samples: 2264)
freq 0, sample_freq 65535: Ok (nr samples: 2263)
freq 0, sample_freq 65552: Ok (nr samples: 1166)
freq 0, sample_freq 8388607: Ok (nr samples: 268)
freq 0, sample_freq 268435455: Ok (nr samples: 8)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 7)
freq 1, sample_freq 4103: Ok (nr samples: 35)
freq 1, sample_freq 65520: Ok (nr samples: 642)
freq 1, sample_freq 65535: Ok (nr samples: 636)
freq 1, sample_freq 65552: Ok (nr samples: 651)
freq 1, sample_freq 8388607: Ok
Op PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Fail
freq 0, sample_freq 15: Fail
freq 0, sample_freq 16: Fail
freq 0, sample_freq 17: Fail
freq 0, sample_freq 143: Fail
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1604)
freq 0, sample_freq 65520: Ok (nr samples: 2227)
freq 0, sample_freq 65535: Ok (nr samples: 2296)
freq 0, sample_freq 65552: Ok (nr samples: 2213)
freq 0, sample_freq 8388607: Ok (nr samples: 250)
freq 0, sample_freq 268435455: Ok (nr samples: 8)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Fail (nr samples: 4)
freq 1, sample_freq 15: Fail (nr samples: 4)
freq 1, sample_freq 16: Fail (nr samples: 4)
freq 1, sample_freq 17: Fail (nr samples: 4)
freq 1, sample_freq 143: Fail (nr samples: 5)
freq 1, sample_freq 144: Fail (nr samples: 5)
freq 1, sample_freq 145: Fail (nr samples: 5)
freq 1, sample_freq 1234: Fail (nr samples: 8)
freq 1, sample_freq 4103: Fail (nr samples: 33)
freq 1, sample_freq 65520: Fail (nr samples: 546)
freq 1, sample_freq 65535: Fail (nr samples: 544)
freq 1, sample_freq 65552: Fail (nr samples: 555)
freq 1, sample_freq 8388607: Ok
IBS ioctl() tests:
------------------
Fetch PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Fail
ioctl(period = 0xf ): Fail
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Fail
ioctl(period = 0x1f ): Fail
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Fail
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Fail
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Fail
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Fail
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
Op PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Fail
ioctl(period = 0xf ): Fail
ioctl(period = 0x10 ): Fail
ioctl(period = 0x11 ): Fail
ioctl(period = 0x1f ): Fail
ioctl(period = 0x20 ): Fail
ioctl(period = 0x80 ): Fail
ioctl(period = 0x8f ): Fail
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Fail
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Fail
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Fail
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
IBS freq (negative) tests:
--------------------------
freq 1, sample_freq 200000: Fail
IBS L3MissOnly test: (takes a while)
--------------------
Fetch L3MissOnly: Fail (nr_samples: 1213)
Op L3MissOnly: Ok (nr_samples: 1193)
---- end(-1) ----
112: AMD IBS sample period : FAILED!
With kernel fixes:
$ sudo ./perf test -vv 112
112: AMD IBS sample period:
--- start ---
test child forked, pid 6939
Using CPUID AuthenticAMD-26-2-1
IBS config tests:
-----------------
Fetch PMU tests:
0xffff : Ok (nr samples: 969)
0x1000 : Ok (nr samples: 15540)
0xff : Ok (nr samples: 40555)
0x1 : Ok (nr samples: 40543)
0x0 : Ok
0x10000 : Ok
Op PMU tests:
0x0 : Ok
0x1 : Ok
0x8 : Ok
0x9 : Ok (nr samples: 40543)
0xf : Ok (nr samples: 40543)
0x1000 : Ok (nr samples: 19156)
0xffff : Ok (nr samples: 1169)
0x10000 : Ok
0x100000 : Ok (nr samples: 1151)
0xf00000 : Ok (nr samples: 76)
0xf0ffff : Ok (nr samples: 73)
0x1f0ffff : Ok (nr samples: 33)
0x7f0ffff : Ok (nr samples: 10)
0x8f0ffff : Ok
0x17f0ffff : Ok
IBS sample period constraint tests:
-----------------------------------
Fetch PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Ok
freq 0, sample_freq 15: Ok
freq 0, sample_freq 16: Ok (nr samples: 1203)
freq 0, sample_freq 17: Ok (nr samples: 1604)
freq 0, sample_freq 143: Ok (nr samples: 1604)
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1343)
freq 0, sample_freq 65520: Ok (nr samples: 2254)
freq 0, sample_freq 65535: Ok (nr samples: 2136)
freq 0, sample_freq 65552: Ok (nr samples: 1158)
freq 0, sample_freq 8388607: Ok (nr samples: 257)
freq 0, sample_freq 268435455: Ok (nr samples: 8)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 8)
freq 1, sample_freq 4103: Ok (nr samples: 34)
freq 1, sample_freq 65520: Ok (nr samples: 458)
freq 1, sample_freq 65535: Ok (nr samples: 628)
freq 1, sample_freq 65552: Ok (nr samples: 396)
freq 1, sample_freq 8388607: Ok
Op PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Ok
freq 0, sample_freq 15: Ok
freq 0, sample_freq 16: Ok
freq 0, sample_freq 17: Ok
freq 0, sample_freq 143: Ok
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1604)
freq 0, sample_freq 65520: Ok (nr samples: 2250)
freq 0, sample_freq 65535: Ok (nr samples: 2158)
freq 0, sample_freq 65552: Ok (nr samples: 2296)
freq 0, sample_freq 8388607: Ok (nr samples: 243)
freq 0, sample_freq 268435455: Ok (nr samples: 6)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 4)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 4)
freq 1, sample_freq 1234: Ok (nr samples: 6)
freq 1, sample_freq 4103: Ok (nr samples: 27)
freq 1, sample_freq 65520: Ok (nr samples: 542)
freq 1, sample_freq 65535: Ok (nr samples: 550)
freq 1, sample_freq 65552: Ok (nr samples: 552)
freq 1, sample_freq 8388607: Ok
IBS ioctl() tests:
------------------
Fetch PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Ok
ioctl(period = 0xf ): Ok
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Ok
ioctl(period = 0x1f ): Ok
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Ok
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Ok
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Ok
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Ok
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
Op PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Ok
ioctl(period = 0xf ): Ok
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Ok
ioctl(period = 0x1f ): Ok
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Ok
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Ok
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Ok
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Ok
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
IBS freq (negative) tests:
--------------------------
freq 1, sample_freq 200000: Ok
IBS L3MissOnly test: (takes a while)
--------------------
Fetch L3MissOnly: Ok (nr_samples: 1301)
Op L3MissOnly: Ok (nr_samples: 1590)
---- end(0) ----
112: AMD IBS sample period : Ok
Committer notes:
Avoid using PAGE_SIZE as that define is also in sys/user.h
Make it a variable not to call sysconf() multiple times.
Also cast func to void * when passing it as the first arg to memcpy to
avoid this with some versions of clang:
arch/x86/tests/amd-ibs-period.c:81:3: error: no matching function for call to 'memcpy'
memcpy(func, insn1, sizeof(insn1));
^~~~~~
/usr/include/string.h:27:7: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *' for 1st argument
void *memcpy (void *__restrict, const void *__restrict, size_t);
^
/usr/include/fortify/string.h:40:27: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *const' for 1st argument
_FORTIFY_FN(memcpy) void *memcpy(void * _FORTIFY_POS0 __od,
^
arch/x86/tests/amd-ibs-period.c:87:3: error: no matching function for call to 'memcpy'
This one, for instance:
Alpine clang version 19.1.4
Target: x86_64-alpine-linux-musl
Thread model: posix
InstalledDir: /usr/lib/llvm19/bin
Configuration file: /etc/clang19/x86_64-alpine-linux-musl.cfg
System configuration file directory: /etc/clang19
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf mem/c2c' uses IBS Op PMU on AMD platforms.
IBS Op PMU on Zen5 uarch has added support for Load Latency filtering.
Implement 'perf mem/c2c' --ldlat using IBS Op Load Latency filtering
capability.
Some subtle differences between AMD and other arch:
o --ldlat is disabled by default on AMD
o Supported values are 128 to 2048.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-4-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
IBS Op PMU on Zen5 reports DTLB and page size information differently
compared to prior generation.
IBS_OP_DATA3 Zen3/4 Zen5
----------------------------------------------------------------
19 IbsDcL2TlbHit1G Reserved
----------------------------------------------------------------
6 IbsDcL2tlbHit2M Reserved
----------------------------------------------------------------
5 IbsDcL1TlbHit1G PageSize:
4 IbsDcL1TlbHit2M 0 - 4K
1 - 2M
2 - 1G
3 - Reserved
Valid only if
IbsDcPhyAddrValid = 1
----------------------------------------------------------------
3 IbsDcL2TlbMiss IbsDcL2TlbMiss
Valid only if
IbsDcPhyAddrValid = 1
----------------------------------------------------------------
2 IbsDcL1tlbMiss IbsDcL1tlbMiss
Valid only if
IbsDcPhyAddrValid = 1
----------------------------------------------------------------
Kernel expose this change as "dtlb_pgsize" capability in PMU sysfs.
Change IBS register raw-dump logic according to new bit definitions.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
IBS OP PMU on Zen5 supports Load Latency filtering. Decode and dump Load
Latency filtering related bits into perf script raw dump.
Also add oneliner example in the perf-amd-ibs man page.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I started seeing this in recent Fedora 42 kernels:
# uname -a
Linux number 6.14.3-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Apr 20 16:08:39 UTC 2025 x86_64 GNU/Linux
#
# perf test vmlinux
1: vmlinux symtab matches kallsyms : FAILED!
#
Where we have Rust enabled:
# grep CONFIG_RUST /boot/config-6.14.3-300.fc42.x86_64
CONFIG_RUSTC_VERSION=108600
CONFIG_RUST_IS_AVAILABLE=y
CONFIG_RUSTC_LLVM_VERSION=200101
CONFIG_RUSTC_HAS_COERCE_POINTEE=y
CONFIG_RUST=y
CONFIG_RUSTC_VERSION_TEXT="rustc 1.86.0 (05f9846f8 2025-03-31) (Fedora 1.86.0-1.fc42)"
CONFIG_RUST_FW_LOADER_ABSTRACTIONS=y
CONFIG_RUST_PHYLIB_ABSTRACTIONS=y
# CONFIG_RUST_DEBUG_ASSERTIONS is not set
CONFIG_RUST_OVERFLOW_CHECKS=y
# CONFIG_RUST_BUILD_ASSERT_ALLOW is not set
#
Looking at the reason for the failure:
# perf test -v vmlinux |& grep ^ERR
ERR : 0xffffffff99efc7d0: __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
ERR : 0xffffffff99efc7e0: _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
#
But:
# grep -w u /proc/kallsyms
ffffffff99efc7d0 u __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
ffffffff99efc7e0 u _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
#
The test checks that "vmlinux symtab matches kallsyms", so it finds those two
symbols in vmlinux:
# pahole --running_kernel_vmlinux
/usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux
#
# readelf -sW /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux | grep -Ew '(__pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_|_RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_)'
81844: ffffffff81efc7e0 524 FUNC LOCAL DEFAULT 1 _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
144259: ffffffff81efc7d0 16 FUNC LOCAL DEFAULT 1 __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
#
It is there.
From the nm documentation we can see that:
"U" The symbol is undefined.
"u" The symbol is a unique global symbol. This is a GNU extension to the
standard set of ELF symbol bindings. For such a symbol the dynamic
linker will make sure that in the entire process there is just one
symbol with this name and type in use.
So lets consider 'u' symbols in /proc/kallsyms when loading it to cover this case.
Fedora:40 shows this as a 'l' symbol, so consider that as well.
With this patch 'perf test 1' is happy again:
# perf test vmlinux
1: vmlinux symtab matches kallsyms : Ok
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aBE_n0PGl3g6h-cS@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When libperf is built alone in-source, $(OUTPUT) isn't set. This causes
the generated uapi path to resolve to '/../arch' which results in a
permissions error:
mkdir: cannot create directory '/../arch': Permission denied
Fix it by removing the preceding '/..' which means that it gets
generated either in the tools/lib/perf part of the tree or the OUTPUT
folder. Some other rules that rely on OUTPUT further refine this
conditionally depending on whether it's an in-source or out-of-source
build, but I don't think we need the extra complexity here. And this
rule is slightly different to others because the header is needed by
both libperf and Perf. This is further complicated by the fact that Perf
always passes O=... to libperf even for in source builds, meaning that
OUTPUT isn't set consistently between projects.
Because we're no longer going one level up to try to generate the file
in the tools/ folder, Perf's include rule needs to descend into libperf.
Also fix the clean rule while we're here.
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Closes: https://lore.kernel.org/linux-perf-users/7703f88e-ccb7-4c98-9da4-8aad224e780f@leemhuis.info/
Fixes: bfb713ea53c7 ("perf tools: Fix arm64 build by generating unistd_64.h")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/20250429-james-perf-fix-libperf-in-source-build-v1-1-a1a827ac15e5@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In some cases when calling function add_probe_vfs_getname, line number
can't be detected by 'perf probe -L getname_flags':
78 atomic_set(&result->refcnt, 1);
// one of the following lines should have line number
// but sometimes it does not because of optimization
result->uptr = filename;
result->aname = NULL;
81 audit_getname(result);
To prevent false failures, skip the affected tests if no suitable line
numbers can be detected.
Signed-off-by: Jakub Brnak <jbrnak@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20250324144523.597557-1-jbrnak@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The struct zone is embedded in struct pglist_data which can be allocated
for each NUMA node early in the boot process. As it's not a slab object
nor a global lock, this was not symbolized.
Since the zone->lock is often contended, it'd be nice if we can
symbolize it. On NUMA systems, node_data array will have pointers for
struct pglist_data. By following the pointer, it can calculate the
address of each zone and its lock using BTF. On UMA, it can just use
contig_page_data and its zones.
The following example shows the zone lock contention at the end.
$ sudo ./perf lock con -abl -E 5 -- ./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.038 [sec]
contended total wait max wait avg wait address symbol
5167 18.17 ms 10.27 us 3.52 us ffff953340052d00 &kmem_cache_node (spinlock)
38 11.75 ms 465.49 us 309.13 us ffff95334060c480 &sock_inode_cache (spinlock)
3916 10.13 ms 10.43 us 2.59 us ffff953342aecb40 &kmem_cache_node (spinlock)
2963 10.02 ms 13.75 us 3.38 us ffff9533d2344098 &kmalloc-rnd-08-2k (spinlock)
216 5.05 ms 99.49 us 23.39 us ffff9542bf7d65d0 zone_lock (spinlock)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250401063055.7431-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
$ sudo ./perf test -vv 'trace summary'
109: perf trace summary:
--- start ---
test child forked, pid 3501572
testing: perf trace -s -- true
testing: perf trace -S -- true
testing: perf trace -s --summary-mode=thread -- true
testing: perf trace -S --summary-mode=total -- true
testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true
testing: perf trace -as --summary-mode=total --no-bpf-summary -- true
testing: perf trace -as --summary-mode=thread --bpf-summary -- true
testing: perf trace -as --summary-mode=total --bpf-summary -- true
testing: perf trace -aS --summary-mode=total --bpf-summary -- true
---- end(0) ----
109: perf trace summary : Ok
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20250326044001.3503432-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When -s/--summary option is used, it doesn't need (augmented) arguments
of syscalls. Let's skip the augmentation and load another small BPF
program to collect the statistics in the kernel instead of copying the
data to the ring-buffer to calculate the stats in userspace. This will
be much more light-weight than the existing approach and remove any lost
events.
Let's add a new option --bpf-summary to control this behavior. I cannot
make it default because there's no way to get e_machine in the BPF which
is needed for detecting different ABIs like 32-bit compat mode.
No functional changes intended except for no more LOST events. :)
$ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1
Summary of events:
total, 6194 events
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
epoll_wait 561 0 4530.843 0.000 8.076 520.941 18.75%
futex 693 45 4317.231 0.000 6.230 500.077 21.98%
poll 300 0 1040.109 0.000 3.467 120.928 17.02%
clock_nanosleep 1 0 1000.172 1000.172 1000.172 1000.172 0.00%
ppoll 360 0 872.386 0.001 2.423 253.275 41.91%
epoll_pwait 14 0 384.349 0.001 27.453 380.002 98.79%
pselect6 14 0 108.130 7.198 7.724 8.206 0.85%
nanosleep 39 0 43.378 0.069 1.112 10.084 44.23%
...
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org
[ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
BriefDescription
If BriefDescription and PublicDescription are the same, only
BriefDescription is needed. It will be used for both long and short
format outputs.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-3-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In the same PMU, when some JSON events have the "BriefDescription" field
populated while others do not, the cmp_sevent() function will split these
two types of events into separate groups. As a result, when using perf
list to display events, the two types of events cannot be grouped together
in the output.
before patch:
$ perf list pmu
...
uncore hha:
hisi_sccl1_hha2/sdir-hit/
hisi_sccl1_hha2/sdir-lookup/
...
uncore hha:
edir-hit
[Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2]
...
after patch:
$ perf list pmu
...
uncore hha:
edir-hit
[Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2]
sdir-hit
[Count of The number of HHA S-Dir hit operations. Unit: hisi_sccl1_hha2]
sdir-lookup
[Count of the number of HHA S-Dir lookup operations. Unit: hisi_sccl1_hha2]
...
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-2-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make 2 global variables local. Reduces ELF binary size by removing
relocations. For a no flags build, the perf binary size is reduced by
4,144 bytes on x86-64.
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Hao Ge <gehao@kylinos.cn>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250410173631.1713627-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove the script output file. Add a trap debug message. Minor style
consistency changes.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250410173631.1713627-2-irogers@google.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Hao Ge <gehao@kylinos.cn>
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On s390x KVM and z/VM machines the CPU Measurement Facility is not
available. Events cycles and instructions do not exist. Running above
tests on s390 KVM and z/VM guests always fail with this error:
# ./perf test 84 86
84: perf stat JSON output linter : FAILED!
86: perf stat STD output linter : FAILED!
#
Root cause is command:
# perf stat -j --metric-only -e instructions,cycles -- true
{"metric-value" : "none"}
#
Which fails due to unsupported events and returns "none".
Do not execute this test case on s390 KVM and z/VM machines.
Output after:
# ./perf test 84 86
84: perf stat JSON output linter : Ok
86: perf stat STD output linter : Ok
#
Fixes: 45a86d017adf4d6c ("perf test: Add --metric-only to perf stat output tests")
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133310.37452-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
evsel__count_has_error() fails counters when the enabled or running time
are 0. The duration_time event reads 0 when the cpu_map_idx != 0 to
avoid aggregating time over CPUs. Change the enable and running time
to always have a ratio of 100% so that evsel__count_has_error won't
fail.
Before:
```
$ sudo /tmp/perf/perf stat --per-core -a -M UNCORE_FREQ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 2,615,819,485 UNC_CLOCK.SOCKET # 2.61 UNCORE_FREQ
S0-D0-C0 2 <not counted> duration_time
1.002111784 seconds time elapsed
```
After:
```
$ perf stat --per-core -a -M UNCORE_FREQ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 758,160,296 UNC_CLOCK.SOCKET # 0.76 UNCORE_FREQ
S0-D0-C0 2 1,003,438,246 duration_time
1.002486017 seconds time elapsed
```
Note: the metric reads the value a different way and isn't impacted.
Fixes: 240505b2d0adcdc8 ("perf tool_pmu: Factor tool events into their own PMU")
Reported-by: Stephane Eranian <eranian@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250423050358.94310-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
`perf report` currently halts with an error when encountering
unsupported new event types (`event.type >= PERF_RECORD_HEADER_MAX`).
This patch modifies the behavior to skip these samples and continue
processing the remaining events.
Additionally, stops reporting if the new event size is not 8-byte
aligned.
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250414173921.2905822-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now it can handle multiple output fields and sort keys in separate
levels, so it should be ok to use it in the hierarchy mode. This
allows fully customized output format.
$ perf report -F latency,comm,parallelism -H --stdio
...
# Latency Command / Parallelism
# ........... .....................
#
31.84% cc1
29.96% 5
1.24% 4
0.37% 6
0.26% 3
0.02% 2
24.68% as
22.39% 5
1.12% 2
0.98% 4
0.12% 3
0.07% 6
...
Committer testing:
Before:
$ perf report -F latency,comm,parallelism -H --stdio
Error: --hierarchy and --fields options cannot be used together
Usage: perf report [<options>]
-F, --fields <key[,keys...]>
output field(s): overhead latency overhead_sys overhead_us
overhead_guest_sys overhead_guest_us overhead_children
latency_children sample period weight1 weight2 weight3
<SNIP>
-H, --hierarchy Show entries in a hierarchy
$
After:
$ perf report -F latency,comm,parallelism -H --stdio
# Total Lost Samples: 0
#
# Samples: 1K of event 'cycles:Pu'
# Event count (approx.): 1581450138
#
# Latency Command / Parallelism
# ........... .....................
#
97.66% git
96.95% 1
0.55% 2
0.04% 5
0.03% 8
0.03% 4
0.02% 3
0.01% 9
0.01% 7
0.01% 6
0.01% 10
0.00% 12
2.34% git-remote-http
2.24% 1
0.07% 5
0.02% 2
0.00% 4
#
# (Tip: To analyze particular parallelism levels, try: perf report --latency --parallelism=32-64)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It turns out that the output fields didn't consider the hierarchy mode
and put all the fields in the same level. To support hierarchy, each
non-output field should be in a separate level.
Pass a pointer to level to output_field_add() and make it increase the
level when it sees non-output fields.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Likewise, it should remove latency output fields in hierarchy list.
Pass evlist to perf_hpp__cancel_latency() to handle them properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is to support hierarchy options with custom output fields.
Currently perf_hpp__cancel_cumulate() only removes accumulated
overhead and latency fields from the global perf_hpp_list.
This is not used in the hierarchy mode because each evsel's hist
has its own separate hpp_list. So it needs to remove the fields
from the lists too. Pass evlist to the function so that it can
iterate the evsels.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf record' will fail with retirement latency events as the open
doesn't do a perf_event_open system call.
Use evsel__config() to set up such events for recording by removing the
flag and enabling sample weights - the sample weights containing the
retirement latency.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The updated Intel vendor events add retirement latency for
graniterapids:
https://lore.kernel.org/lkml/20250322063403.364981-14-irogers@google.com/
This change makes those values available within an alias/event within a
PMU and saves them into the evsel at event parse time.
When no TPEBS data is available the default values are substituted in
for TMA metrics that are using retirement latency events - currently
just those on graniterapids.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add command line configuration option for how retirement latency
events are combined.
The default "mean" gives the average of retirement latency.
"min" or "max" give the smallest or largest retirment latency times
respectively.
"last" uses the last retirment latency sample's time.
Committer notes:
Enclose parse_tpebs_mode() under HAVE_ARCH_X86_64_SUPPORT to match the
ifdef block where it is used, fixing the build in systems like:
20 5.60 debian:experimental-x-mips : FAIL gcc version 14.2.0 (Debian 14.2.0-1)
builtin-stat.c:2330:12: error: 'parse_tpebs_mode' defined but not used [-Werror=unused-function]
2330 | static int parse_tpebs_mode(const struct option *opt, const char *str,
| ^~~~~~~~~~~~~~~~
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
struct stats provides access to mean, min and max.
It also provides uniformity with statistics code used elsewhere in perf.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|