summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2024-12-09perf trace-event: Always build trace-event-info.cIan Rogers
trace-event-info.c has no libtraceevent dependencies, always build it and use it in builtin-record and perf_event_attr printing. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Zixian Cai <fzczx123@gmail.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241118225345.889810-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf trace-event: Constify print argumentsIan Rogers
Capture that these functions don't mutate their input. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Zixian Cai <fzczx123@gmail.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241118225345.889810-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf env: Ensure failure broken topology file reads are always -1 encodedIan Rogers
get_core_id returns 0 on success and a negative errno value on error. Currently the error can only be -1, but fixing this to be any errno value breaks perf: https://lore.kernel.org/lkml/Zzu4Sdebve-NXEMX@google.com/ To avoid this, make sure all error values are written as -1. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Zixian Cai <fzczx123@gmail.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241118225345.889810-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf btf: Make the sigtrap test helper to find a member by name widely availableArnaldo Carvalho de Melo
By introducing a tools/perf/util/btf.c to collect utilities not yet available via libbpf, the first being a way to find a member by name once we get the type_id for the struct. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf pmu: Remove use of perf_cpu_map__read()Ian Rogers
Remove use of a FILE and switch to reading a string that is then passed to perf_cpu_map__new(). Being able to remove perf_cpu_map__read() avoids duplicated parsing logic. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kyle Meyer <kyle.meyer@hpe.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241206044035.1062032-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf cpumap: Reduce transitive dependencies on libperf MAX_NR_CPUSIan Rogers
libperf exposes MAX_NR_CPUS via tools/lib/perf/include/internal/cpumap.h which is internal. The preferred dependency should be the definition in tools/perf/perf.h. Add the includes of perf.h so that MAX_NR_CPUS can be hidden in libperf. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kyle Meyer <kyle.meyer@hpe.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241206044035.1062032-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf: Increase MAX_NR_CPUS to 4096Kyle Meyer
Systems have surpassed 2048 CPUs. Increase MAX_NR_CPUS to 4096. Bitmaps declared with MAX_NR_CPUS bits will increase from 256B to 512B, cpus_runtime will increase from 81960B to 163880B, and max_entries will increase from 8192B to 16384B. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241206044035.1062032-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf arm-spe: Add support for SPE Data Source packet on AmpereOneIlkka Koskinen
Decode SPE Data Source packets on AmpereOne. The field is IMPDEF. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20241108202946.16835-3-ilkka@os.amperecomputing.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf arm-spe: Prepare for adding data source packet implementations for ↵Ilkka Koskinen
other cores Split Data Source Packet handling to prepare adding support for other implementations. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20241108202946.16835-2-ilkka@os.amperecomputing.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf cpumap: Add checking for reference counterLeo Yan
For the CPU map merging test, add an extra check for the reference counter before releasing the last CPU map. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241107125308.41226-4-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf cpumap: Add more tests for CPU map mergingLeo Yan
Add additional tests for CPU map merging to cover more cases. These tests include different types of arguments, such as when one CPU map is a subset of another, as well as cases with or without overlap between the two maps. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241107125308.41226-3-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09libperf cpumap: Refactor perf_cpu_map__merge()Leo Yan
The perf_cpu_map__merge() function has two arguments, 'orig' and 'other'. The function definition might cause confusion as it could give the impression that the CPU maps in the two arguments are copied into a new allocated structure, which is then returned as the result. The purpose of the function is to merge the CPU map 'other' into the CPU map 'orig'. This commit changes the 'orig' argument to a pointer to pointer, so the new result will be updated into 'orig'. The return value is changed to an int type, as an error number or 0 for success. Update callers and tests for the new function definition. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241107125308.41226-2-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf config: Fix trival typo 'an' -> 'can'Arnaldo Carvalho de Melo
Just a trivial typo, should be 'can', did a spell check on the rest of the file just in case, nothing more stood out. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf script python: Improve physical mem type resolutionIan Rogers
Previously system RAM and persistent memory were hard code matched, change so that the label of the memory region is just read from /proc/iomem. This avoids frequent N/A samples. Change the /proc/iomem reading, event processing and output so that nested entries appear and their counts count toward their parent. As labels may be repeated, include the memory ranges in the output to make it clear why, for example, "System RAM" appears twice. Before: Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- System RAM 9460 96.5% N/A 998 3.5% After: Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-105f7fffff : System RAM 36741 96.5 841400000-8416599ff : Kernel data 89 0.2 840800000-8412a6fff : Kernel rodata 60 0.2 841ebe000-8423fffff : Kernel bss 34 0.1 0-fff : Reserved 1345 3.5 100000-89dd9fff : System RAM 2 0.0 Before: Event: mem_inst_retired.any:P Memory type count percentage ---------------------------------------- ----------- ----------- System RAM 9460 90.5% N/A 998 9.5% After: Event: mem_inst_retired.any:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-105f7fffff : System RAM 9460 90.5 841400000-8416599ff : Kernel data 45 0.4 840800000-8412a6fff : Kernel rodata 19 0.2 841ebe000-8423fffff : Kernel bss 12 0.1 0-fff : Reserved 998 9.5 The code has been updated to python 3 with type hints and resolving issues reported by mypy and pylint. Tabs are swapped to spaces as preferred in PEP8, because most lines of code were modified (of this small file) and this makes pylint significantly less noisy. Committer testing: root@number:/tmp# grep -m1 "model name" /proc/cpuinfo model name : Intel(R) Core(TM) i7-14700K root@number:/tmp# root@number:/tmp# perf script mem-phys-addr -a find / /bin /lib /lib64 /sbin Warning: 744 out of order events recorded. Event: cpu_core/mem_inst_retired.all_loads/P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-8bfbfffff : System RAM 364561 76.5 621400000-6223a6fff : Kernel rodata 10474 2.2 622400000-62283d4bf : Kernel data 4828 1.0 623304000-6237fffff : Kernel bss 1063 0.2 620000000-6213fffff : Kernel code 98 0.0 0-fff : Reserved 111480 23.4 100000-2b0ca017 : System RAM 337 0.1 2fbad000-30d92fff : System RAM 44 0.0 2c79d000-2fbabfff : System RAM 30 0.0 30d94000-316d5fff : System RAM 16 0.0 2b131a58-2c71dfff : System RAM 7 0.0 root@number:/tmp# Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119180130.19160-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf disasm: Return a proper error when not determining the file typeArnaldo Carvalho de Melo
Before: ⬢ [acme@toolbox a]$ perf annotate --stdio2 -i acme-perf-injected.data 'java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int)' Error: Couldn't annotate java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int): Internal error: Invalid -1 error code ⬢ [acme@toolbox a]$ After: ⬢ [acme@toolbox a]$ perf annotate --stdio2 -i acme-perf-injected.data 'java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int)' Error: Couldn't annotate java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int): Couldn't determine the file /tmp/perf-3308868.map type. ⬢ [acme@toolbox a]$ Reported-by: Francesco Nigro <fnigro@redhat.com> Reported-by: Ilan Green <igreen@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/lkml/Z092D9-r_iOgwIWM@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09tools features: Don't check for libunwind devel files by defaultArnaldo Carvalho de Melo
Since 13e17c9ff49119aa ("perf build: Make libunwind opt-in rather than opt-out"), so we shouldn't by default be testing for its availability at build time in tools/build/features/test-all.c. That test was designed to test the features we expect to be the most common ones in most builds, so if we test build just that file, then we assume the features there are present and will not test one by one. Removing it from test-all.c gets rid of the first impediment for test-all.c to build successfully: $ cat /tmp/build/perf-tools-next/feature/test-all.make.output In file included from test-all.c:62: test-libunwind.c:2:10: fatal error: libunwind.h: No such file or directory 2 | #include <libunwind.h> | ^~~~~~~~~~~~~ compilation terminated. $ We then get to: $ cat /tmp/build/perf-tools-next/feature/test-all.make.output /usr/bin/ld: cannot find -lunwind-x86_64: No such file or directory /usr/bin/ld: cannot find -lunwind: No such file or directory collect2: error: ld returned 1 exit status $ So make all the logic related to setting CFLAGS, LDFLAGS, etc for libunwind to be conditional on NO_LIBWUNWIND=1, which is now the default, now we get a faster build: $ cat /tmp/build/perf-tools-next/feature/test-all.make.output $ ldd /tmp/build/perf-tools-next/feature/test-all.bin linux-vdso.so.1 (0x00007fef04cde000) libdw.so.1 => /lib64/libdw.so.1 (0x00007fef04a49000) libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007fef04478000) libm.so.6 => /lib64/libm.so.6 (0x00007fef04394000) libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007fef0436c000) libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007fef04345000) libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007fef03e95000) libz.so.1 => /lib64/libz.so.1 (0x00007fef03e72000) libelf.so.1 => /lib64/libelf.so.1 (0x00007fef03e56000) libnuma.so.1 => /lib64/libnuma.so.1 (0x00007fef03e48000) libslang.so.2 => /lib64/libslang.so.2 (0x00007fef03b65000) libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007fef037c6000) libc.so.6 => /lib64/libc.so.6 (0x00007fef035d5000) liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fef035a0000) libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fef034e1000) libbz2.so.1 => /lib64/libbz2.so.1 (0x00007fef034cd000) /lib64/ld-linux-x86-64.so.2 (0x00007fef04ce0000) libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007fef03495000) $ Fixes: 13e17c9ff49119aa ("perf build: Make libunwind opt-in rather than opt-out") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Z09zTztD8X8qIWCX@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-05perf tools: Fix precise_ip fallback logicNamhyung Kim
Sometimes it returns other than EOPNOTSUPP for invalid precise_ip so it cannot check the error code. Let's move the fallback after the missing feature checks so that it can handle EINVAL as well. This also aligns well with the existing behavior which blindly turns off the precise_ip but we check the missing features correctly now. Fixes: af954f76eea56453 ("perf tools: Check fallback error and order") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/Z1DV0lN8qHSysX7f@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04perf tools: Fix build error on generated/fs_at_flags_array.cNamhyung Kim
It should only have generic flags in the array but the recent header sync brought a new flags to fcntl.h and caused a build error. Let's update the shell script to exclude flags specific to name_to_handle_at(). CC trace/beauty/fs_at_flags.o In file included from trace/beauty/fs_at_flags.c:21: tools/perf/trace/beauty/generated/fs_at_flags_array.c:13:30: error: initialized field overwritten [-Werror=override-init] 13 | [ilog2(0x002) + 1] = "HANDLE_CONNECTABLE", | ^~~~~~~~~~~~~~~~~~~~ tools/perf/trace/beauty/generated/fs_at_flags_array.c:13:30: note: (near initialization for ‘fs_at_flags[2]’) Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241203035349.1901262-12-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/prctl.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: 09d6775f503b393d riscv: Add support for userspace pointer masking 91e102e79740ae43 prctl: arch-agnostic prctl for shadow stack This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Mark Brown <broonie@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20241203035349.1901262-11-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/mount.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: aefff51e1c2986e1 statmount: retrieve security mount options 2f4d4503e9e5ab76 statmount: add flag to retrieve unescaped options 44010543fc8bedad fs: add the ability for statmount() to report the sb_source ed9d95f691c29748 fs: add the ability for statmount() to report the fs_subtype This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/mount.h include/uapi/linux/mount.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-10-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/fcntl.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: c374196b2b9f4b80 ("fs: name_to_handle_at() support for "explicit connectable" file handles") 95f567f81e43a1bc ("fs: Simplify getattr interface function checking AT_GETATTR_NOSEC flag") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Alexander Aring <alex.aring@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-9-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync *xattrat syscall changes with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: 6140be90ec70c39f ("fs/xattr: add *at family syscalls") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl The arm64 changes are not included as it requires more changes in the tools. It'll be worked for the later cycle. Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> CC: x86@kernel.org CC: linux-mips@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org CC: linux-s390@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-7-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-03perf machine: Initialize machine->env to address a segfaultArnaldo Carvalho de Melo
Its used from trace__run(), for the 'perf trace' live mode, i.e. its strace-like, non-perf.data file processing mode, the most common one. The trace__run() function will set trace->host using machine__new_host() that is supposed to give a machine instance representing the running machine, and since we'll use perf_env__arch_strerrno() to get the right errno -> string table, we need to use machine->env, so initialize it in machine__new_host(). Before the patch: (gdb) run trace --errno-summary -a sleep 1 <SNIP> Summary of events: gvfs-afc-volume (3187), 2 events, 0.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ pselect6 1 0 0.000 0.000 0.000 0.000 0.00% GUsbEventThread (3519), 2 events, 0.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ poll 1 0 0.000 0.000 0.000 0.000 0.00% <SNIP> Program received signal SIGSEGV, Segmentation fault. 0x00000000005caba0 in perf_env__arch_strerrno (env=0x0, err=110) at util/env.c:478 478 if (env->arch_strerrno == NULL) (gdb) bt #0 0x00000000005caba0 in perf_env__arch_strerrno (env=0x0, err=110) at util/env.c:478 #1 0x00000000004b75d2 in thread__dump_stats (ttrace=0x14f58f0, trace=0x7fffffffa5b0, fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>) at builtin-trace.c:4673 #2 0x00000000004b78bf in trace__fprintf_thread (fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>, thread=0x10fa0b0, trace=0x7fffffffa5b0) at builtin-trace.c:4708 #3 0x00000000004b7ad9 in trace__fprintf_thread_summary (trace=0x7fffffffa5b0, fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>) at builtin-trace.c:4747 #4 0x00000000004b656e in trace__run (trace=0x7fffffffa5b0, argc=2, argv=0x7fffffffde60) at builtin-trace.c:4456 #5 0x00000000004ba43e in cmd_trace (argc=2, argv=0x7fffffffde60) at builtin-trace.c:5487 #6 0x00000000004c0414 in run_builtin (p=0xec3068 <commands+648>, argc=5, argv=0x7fffffffde60) at perf.c:351 #7 0x00000000004c06bb in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:404 #8 0x00000000004c0814 in run_argv (argcp=0x7fffffffdc4c, argv=0x7fffffffdc40) at perf.c:448 #9 0x00000000004c0b5d in main (argc=5, argv=0x7fffffffde60) at perf.c:560 (gdb) After: root@number:~# perf trace -a --errno-summary sleep 1 <SNIP> pw-data-loop (2685), 1410 events, 16.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 188 0 983.428 0.000 5.231 15.595 8.68% ioctl 94 0 0.811 0.004 0.009 0.016 2.82% read 188 0 0.322 0.001 0.002 0.006 5.15% write 141 0 0.280 0.001 0.002 0.018 8.39% timerfd_settime 94 0 0.138 0.001 0.001 0.007 6.47% gnome-control-c (179406), 1848 events, 20.9% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ poll 222 0 959.577 0.000 4.322 21.414 11.40% recvmsg 150 0 0.539 0.001 0.004 0.013 5.12% write 300 0 0.442 0.001 0.001 0.007 3.29% read 150 0 0.183 0.001 0.001 0.009 5.53% getpid 102 0 0.101 0.000 0.001 0.008 7.82% root@number:~# Fixes: 54373b5d53c1f6aa ("perf env: Introduce perf_env__arch_strerrno()") Reported-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Veronika Molnarova <vmolnaro@redhat.com> Acked-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Michael Petlan <mpetlan@redhat.com> Link: https://lore.kernel.org/r/Z0XffUgNSv_9OjOi@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-02perf test: Don't signal all processes on system when interrupting testsJames Clark
This signal handler loops over all tests on ctrl-C, but it's active while the test list is being constructed. process.pid is 0, then -1, then finally set to the child pid on fork. If the Ctrl-C is received during this point a kill(-1, SIGINT) can be sent which affects all processes. Make sure the child has forked first before forwarding the signal. This can be reproduced with ctrl-C immediately after launching perf test which terminates the ssh connection. Fixes: 553d5efeb341 ("perf test: Add a signal handler to kill forked child processes") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241129151948.3199732-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-02perf tools: Fix build-id event recordingNamhyung Kim
The build-id events written at the end of the record session are broken due to unexpected data. The write_buildid() writes the fixed length event first and then variable length filename. But a recent change made it write more data in the padding area accidentally. So readers of the event see zero-filled data for the next entry and treat it incorrectly. This resulted in wrong kernel symbols because the kernel DSO loaded a random vmlinux image in the path as it didn't have a valid build-id. Fixes: ae39ba16554e ("perf inject: Fix build ID injection") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/Z0aRFFW9xMh3mqKB@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-26Merge tag 'perf-tools-for-v6.13-2024-11-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "perf record: - Enable leader sampling for inherited task events. It was supported only for system-wide events but the kernel started to support such a setup since v6.12. This is to reduce the number of PMU interrupts. The samples of the leader event will contain counts of other events and no samples will be generated for the other member events. $ perf record -e '{cycles,instructions}:S' ${MYPROG} perf report: - Fix --branch-history option to display more branch-related information like prediction, abort and cycles which is available on Intel machines. $ perf record -bg -- perf test -w brstack $ perf report --branch-history ... # # Overhead Source:Line Symbol Shared Object Predicted Abort Cycles IPC [IPC Coverage] # ........ ........................ .............. .................... ......... ..... ...... .................... # 8.17% copy_page_64.S:19 [k] copy_page [kernel.kallsyms] 50.0% 0 5 - - | ---xas_load xarray.h:171 | |--5.68%--xas_load xarray.c:245 (cycles:1) | xas_load xarray.c:242 | xas_load xarray.h:1260 (cycles:1) | xas_descend xarray.c:146 | xas_load xarray.c:244 (cycles:2) | xas_load xarray.c:245 | xas_descend xarray.c:218 (cycles:10) ... perf stat: - Add HWMON PMU support. The HWMON provides various system information like CPU/GPU temperature, fan speed and so on. Expose them as PMU events so that users can see the values using perf stat commands. $ perf stat -e temp_cpu,fan1 true Performance counter stats for 'true': 60.00 'C temp_cpu 0 rpm fan1 0.000745382 seconds time elapsed 0.000883000 seconds user 0.000000000 seconds sys - Display metric threshold in JSON output. Some metrics define thresholds to classify value ranges. It used to be in a different color but it won't work for JSON. Add "metric-threshold" field to the JSON that can be one of "good", "less good", "nearly bad" and "bad". # perf stat -a -M TopdownL1 -j true {"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"} {"metric-value" : "29.212267", "metric-unit" : "% tma_frontend_bound", "metric-threshold" : "bad"} {"metric-value" : "7.138972", "metric-unit" : "% tma_bad_speculation", "metric-threshold" : "good"} {"metric-value" : "20.422759", "metric-unit" : "% tma_retiring", "metric-threshold" : "good"} {"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, } ... perf sched: - Add -P/--pre-migrations option for 'timehist' sub-command to track time a task waited on a run-queue before migrating to a different CPU. $ perf sched timehist -P time cpu task name wait time sch delay run time pre-mig time [tid/pid] (msec) (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- --------- 585940.535527 [0000] perf[584885] 0.000 0.000 0.000 0.000 585940.535535 [0000] migration/0[20] 0.000 0.002 0.008 0.000 585940.535559 [0001] perf[584885] 0.000 0.000 0.000 0.000 585940.535563 [0001] migration/1[25] 0.000 0.001 0.004 0.000 585940.535678 [0002] perf[584885] 0.000 0.000 0.000 0.000 585940.535686 [0002] migration/2[31] 0.000 0.002 0.008 0.000 585940.535905 [0001] <idle> 0.000 0.000 0.342 0.000 585940.535938 [0003] perf[584885] 0.000 0.000 0.000 0.000 585940.537048 [0001] sleep[584886] 0.000 0.019 1.142 0.001 585940.537749 [0002] <idle> 0.000 0.000 2.062 0.000 ... Build: - Make libunwind opt-in (LIBUNWIND=1) rather than opt-out. The perf tools are generally built with libelf and libdw which has unwinder functionality. The libunwind support predates it and no need to have duplicate unwinders by default. - Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify it's using libdw for handling DWARF information. Internals: - Do not set exclude_guest bit in the perf_event_attr by default. This was causing a trouble in AMD IBS PMU as it doesn't support the bit. The bit will be set when it's needed later by the fallback logic. Also update the missing feature detection logic to make sure not clear supported bits unnecessarily. - Run perf test in parallel by default and mark flaky tests "exclusive" to run them serially at the end. Some test numbers are changed but the test can complete in less than half the time. JSON vendor events: - Add AMD Zen 5 events and metrics. - Add i.MX91 and i.MX95 DDR metrics - Fix HiSilicon HIP08 Topdown metric name. - Support compat events on PowerPC" * tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (232 commits) perf tests: Fix hwmon parsing with PMU name test perf hwmon_pmu: Ensure hwmon key union is zeroed before use perf tests hwmon_pmu: Remove double evlist__delete() perf/test: fix perf ftrace test on s390 perf bpf-filter: Return -ENOMEM directly when pfi allocation fails perf test: Correct hwmon test PMU detection perf: Remove unused del_perf_probe_events() perf pmu: Move pmu_metrics_table__find and remove ARM override perf jevents: Add map_for_cpu() perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str perf header: Avoid transitive PMU includes perf arm64 header: Use cpu argument in get_cpuid perf header: Refactor get_cpuid to take a CPU for ARM perf header: Move is_cpu_online to numa bench perf jevents: fix breakage when do perf stat on system metric perf test: Add missing __exit calls in tool/hwmon tests perf tests: Make leader sampling test work without branch event perf util: Remove kernel version deadcode perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM isn't resolved perf test shell trace_exit_race: Show what went wrong in verbose mode ...
2024-11-22perf tests: Fix hwmon parsing with PMU name testIan Rogers
Incorrectly the hwmon with PMU name test didn't pass "true". Fix and address issue with hwmon_pmu__config_terms needing to load events - a load bearing assert fired. Also fix missing list deletion when putting the hwmon test PMU and lower some debug warnings to make the hwmon PMU less spammy in verbose mode. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241121000955.536930-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf hwmon_pmu: Ensure hwmon key union is zeroed before useIan Rogers
Non-zero values led to mismatches in testing. This was reproducible with -fsanitize=undefined. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Closes: https://lore.kernel.org/lkml/Zzdtj0PEWEX3ATwL@x1/ Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241119230033.115369-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf tests hwmon_pmu: Remove double evlist__delete()Arnaldo Carvalho de Melo
In the error path when failing to parse events the evlist is being deleted twice, keep the one after the out label. Fixes: 531ee0fd4836994f ("perf test: Add hwmon "PMU" test") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/ZzzoJNNcJJVnPCCe@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf/test: fix perf ftrace test on s390Thomas Richter
On s390 the perf test case ftrace sometimes fails as follows: # ./perf test ftrace 79: perf ftrace tests : FAILED! # The failure depends on the kernel .config file. Some configurations always work fine, some do not. The ftrace profile test mostly fails, because the ring buffer was not large enough, and some lines (especially the interesting ones with nanosleep in it) where dropped. To achieve success for all tested kernel configurations, enlarge the buffer to store the traces completely without wrapping. The default buffer size is too small for all kernel configurations. Set the buffer size of for the ftrace profile test to 16 MB. Output after: # ./perf test ftrace 79: perf ftrace tests : Ok # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: agordeev@linux.ibm.com Cc: gor@linux.ibm.com Cc: hca@linux.ibm.com Cc: sumanthk@linux.ibm.com Link: https://lore.kernel.org/r/20241119064856.641446-1-tmricht@linux.ibm.com Suggested-by: Sven Schnelle <svens@linux.ibm.com> Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf bpf-filter: Return -ENOMEM directly when pfi allocation failsHao Ge
Directly return -ENOMEM when pfi allocation fails, instead of performing other operations on pfi. Fixes: 0fe2b18ddc40 ("perf bpf-filter: Support multiple events properly") Signed-off-by: Hao Ge <gehao@kylinos.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: hao.ge@linux.dev Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20241113030537.26732-1-hao.ge@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf test: Correct hwmon test PMU detectionIan Rogers
Use name to avoid potential other hwmon PMUs. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241118052638.754981-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-16perf: Remove unused del_perf_probe_events()Dr. David Alan Gilbert
del_perf_probe_events() last use was removed by commit 3d6dfae889174340 ("perf parse-events: Remove BPF event support") Remove it. It was the last user of probe_file__del_events(), so remove it as well. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241022002940.302946-1-linux@treblig.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf pmu: Move pmu_metrics_table__find and remove ARM overrideIan Rogers
Move pmu_metrics_table__find() to the jevents.py generated pmu-events.c and remove indirection override for ARM. The movement removes perf_pmu__find_metrics_table that exists to enable the ARM override. The ARM override isn't necessary as just the CPUID, not PMU, is used in the metric table lookup. On non-ARM the CPU argument is just ignored for the CPUID, for ARM -1 is passed so that the CPUID for the first logical CPU is read. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf jevents: Add map_for_cpu()Ian Rogers
The PMU is no longer part of the map finding process and for metrics doesn't make sense as they lack a PMU. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Pass a perf_cpu rather than a PMU to get_cpuid_strIan Rogers
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means in places a temporary PMU is created just to pass a CPU value. Just pass the CPU and fix up the callers. As there are no longer PMU users in header.h, shuffle forward declarations earlier to work around build failures. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Avoid transitive PMU includesIan Rogers
Currently satisfied via header.h. Note, pmu.h includes parse-events.h. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf arm64 header: Use cpu argument in get_cpuidIan Rogers
Use the cpu to read the MIDR file requested. If the "any" value (-1) is passed that keep the behavior of returning the first MIDR file that can be read. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Refactor get_cpuid to take a CPU for ARMIan Rogers
ARM BIG.little has no notion of a constant CPUID for both core types. To reflect this reality, change the get_cpuid function to also pass in a possibly unused logical cpu. If the dummy value (-1) is passed in then ARM can, as currently happens, select the first logical CPU's "CPUID". The changes to ARM getcpuid happen in a follow up change. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Move is_cpu_online to numa benchIan Rogers
The helper function is only used in the NUMA benchmark as typically online CPUs are determined through perf_cpu_map__new_online_cpus(). Reduce the scope of the function for now. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf jevents: fix breakage when do perf stat on system metricXu Yang
When do perf stat on sys metric, perf tool output nothing now: $ perf stat -a -M imx95_ddr_read.all -I 1000 $ This command runs on an arm64 machine and the Soc has one DDR hw pmu except one armv8_cortex_a55 pmu. Their maps show as follows: const struct pmu_events_map pmu_events_map[] = { { .arch = "arm64", .cpuid = "0x00000000410fd050", .event_table = { .pmus = pmu_events__arm_cortex_a55, .num_pmus = ARRAY_SIZE(pmu_events__arm_cortex_a55) }, .metric_table = { .pmus = NULL, .num_pmus = 0 } }, static const struct pmu_sys_events pmu_sys_event_tables[] = { { .event_table = { .pmus = pmu_events__freescale_imx95_sys, .num_pmus = ARRAY_SIZE(pmu_events__freescale_imx95_sys) }, .metric_table = { .pmus = pmu_metrics__freescale_imx95_sys, .num_pmus = ARRAY_SIZE(pmu_metrics__freescale_imx95_sys) }, .name = "pmu_events__freescale_imx95_sys", }, Currently, pmu_metrics_table__find() will return NULL when only do perf stat on sys metric. Then parse_groups() will never be called to parse sys metric_name, finally perf tool will exit directly. This should be a common problem. To fix the issue, this will keep the logic before commit f20c15d13f01 ("perf pmu-events: Remember the perf_events_map for a PMU") to return a empty metric table rather than a NULL pointer. This should be fine since the removed part just check if the table match provided metric_name. Without these code, the code in parse_groups() will also check the validity of metrci_name too. Fixes: f20c15d13f017d4b ("perf pmu-events: Remember the perf_events_map for a PMU") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241107162035.52206-2-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf test: Add missing __exit calls in tool/hwmon testsIan Rogers
Address sanitizer flagged the missing parse_events_error__exit when testing on ARM. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241115201258.509477-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf tests: Make leader sampling test work without branch eventJames Clark
Arm a57 only has speculative branch events so this test fails there. The test doesn't depend on branch instructions so change it to instructions which is pretty much guaranteed to be everywhere. The test_branch_counter() test above already tests for the existence of the branches event and skips if its not present. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Veronika Molnarova <vmolnaro@redhat.com> Link: https://lore.kernel.org/r/20241115161600.228994-1-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf util: Remove kernel version deadcodeDr. David Alan Gilbert
fetch_kernel_version() has been unused since Ian's 2023 commit 3d6dfae889174340 ("perf parse-events: Remove BPF event support") Remove it, and it's helpers. I noticed there are a bunch of kernel-version macros that are also unused nearby. Also remove them. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241116155850.113129-1-linux@treblig.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM ↵Arnaldo Carvalho de Melo
isn't resolved The purpose of this test is to test for races in the exit of 'perf trace' missing the last events, it was failing when the COMM wasn't resolved either because we missed some PERF_RECORD_COMM or somehow raced on getting it from procfs. Add --no-comm to the 'perf trace' command line so that we get a consistent, pid only output, which allows the test to achieve its goal. This is the output from 'perf trace --no-comm -e syscalls:sys_enter_exit_group': 0.000 21953 syscalls:sys_enter_exit_group() 0.000 21955 syscalls:sys_enter_exit_group() 0.000 21957 syscalls:sys_enter_exit_group() 0.000 21959 syscalls:sys_enter_exit_group() 0.000 21961 syscalls:sys_enter_exit_group() 0.000 21963 syscalls:sys_enter_exit_group() 0.000 21965 syscalls:sys_enter_exit_group() 0.000 21967 syscalls:sys_enter_exit_group() 0.000 21969 syscalls:sys_enter_exit_group() 0.000 21971 syscalls:sys_enter_exit_group() Now it passes: root@number:~# perf test "trace exit race" 110: perf trace exit race : Ok root@number:~# root@number:~# perf test -v "trace exit race" 110: perf trace exit race : Ok root@number:~# If we artificially make it run just 9 times instead of the 10 it runs, i.e. by manually doing: trace_shutdown_race() { for _ in $(seq 9); do that 9 is $iter, 10 in the patch, we get: root@number:~# vim ~acme/libexec/perf-core/tests/shell/trace_exit_race.sh root@number:~# perf test -v "trace exit race" --- start --- test child forked, pid 24629 Missing output, expected 10 but only got 9 ---- end(-1) ---- 110: perf trace exit race : FAILED! root@number:~# I.e. 9 'perf trace' calls produced the expected output, the inverse grep didn't show anything, so the patch provided by Howard for the previous patch kicks in and shows a more informative message. Tested-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Peterson <benjamin@engflow.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/ZzdknoHqrJbojb6P@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-15perf test shell trace_exit_race: Show what went wrong in verbose modeArnaldo Carvalho de Melo
If it fails we need to check what was the reason, what were the lines that didn't match the expected format, so: root@number:~# perf test -v "trace exit race" --- start --- test child forked, pid 2028724 Lines not matching the expected regexp: ' +[0-9]+\.[0-9]+ +true/[0-9]+ syscalls:sys_enter_exit_group\(\)$': 0.000 :2028750/2028750 syscalls:sys_enter_exit_group() ---- end(-1) ---- 110: perf trace exit race : FAILED! root@number:~# In this case we're not resolving the process COMM for some reason and fallback to printing just the pid/tid, this will be fixed in a followup patch. Howard Chu spotted a problem with single code surrounding a regexp, that made the test always fail, but since there were some failures when I tested (COMM not being resolved in some of the results) the end inverse grep would show some lines and thus didn't notice the single quote problem. He also provided a patch to test if less than the number of expected matches took place but all of them with the expected output, in which case the inverse grep wouldn't show anything, confusing the tester. Reviewed-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Peterson <benjamin@engflow.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/ZzdknoHqrJbojb6P@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-14perf tests: Add test for trace output lossBenjamin Peterson
Add a test that checks that trace output is not lost to races. This is accomplished by tracing the exit_group syscall of "true" multiple times and checking for correct output. Signed-off-by: Benjamin Peterson <benjamin@engflow.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241107232128.108981-3-benjamin@engflow.com [ Addressed two ShellCheck warnings ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-14perf trace: Avoid garbage when not printing a syscall's argumentsBenjamin Peterson
syscall__scnprintf_args may not place anything in the output buffer (e.g., because the arguments are all zero). If that happened in trace__fprintf_sys_enter, its fprintf would receive an unitialized buffer leading to garbage output. Fix the problem by passing the (possibly zero) bounds of the argument buffer to the output fprintf. Fixes: a98392bb1e169a04 ("perf trace: Use beautifiers on syscalls:sys_enter_ handlers") Signed-off-by: Benjamin Peterson <benjamin@engflow.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241107232128.108981-2-benjamin@engflow.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-14perf trace: Do not lose last events in a raceBenjamin Peterson
If a perf trace event selector specifies a maximum number of events to output (i.e., "/nr=N/" syntax), the event printing handler, trace__event_handler, disables the event selector after the maximum number events are printed. Furthermore, trace__event_handler checked if the event selector was disabled before doing any work. This avoided exceeding the maximum number of events to print if more events were in the buffer before the selector was disabled. However, the event selector can be disabled for reasons other than exceeding the maximum number of events. In particular, when the traced subprocess exits, the main loop disables all event selectors. This meant the last events of a traced subprocess might be lost to the printing handler's short-circuiting logic. This nondeterministic problem could be seen by running the following many times: $ perf trace -e syscalls:sys_enter_exit_group true trace__event_handler should simply check for exceeding the maximum number of events to print rather than the state of the event selector. Fixes: a9c5e6c1e9bff42c ("perf trace: Introduce per-event maximum number of events property") Signed-off-by: Benjamin Peterson <benjamin@engflow.com> Tested-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241107232128.108981-1-benjamin@engflow.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-14perf probe: Introduce quotation marks supportMasami Hiramatsu (Google)
In non-C languages, it is possible to have ':' in the function names. It is possible to escape it with backslashes, but if there are too many backslashes, it is annoying. This introduce quotation marks (`"` or `'`) support. For example, without quotes, we have to pass it as below $ perf probe -x cro3 -L "cro3\:\:cmd\:\:servo\:\:run_show" <run_show@/work/cro3/src/cmd/servo.rs:0> 0 fn run_show(args: &ArgsShow) -> Result<()> { 1 let list = ServoList::discover()?; 2 let s = list.find_by_serial(&args.servo)?; 3 if args.json { 4 println!("{s}"); With quotes, we can more naturally write the function name as below; $ perf probe -x cro3 -L \"cro3::cmd::servo::run_show\" <run_show@/work/cro3/src/cmd/servo.rs:0> 0 fn run_show(args: &ArgsShow) -> Result<()> { 1 let list = ServoList::discover()?; 2 let s = list.find_by_serial(&args.servo)?; 3 if args.json { 4 println!("{s}"); Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/173099116941.2431889.11609129616090100386.stgit@mhiramat.roam.corp.google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>