summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2022-05-10libperf evlist: Add evsel as a parameter to ->idx()Adrian Hunter
Add evsel as a parameter to ->idx() in preparation for correctly determining whether an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10libperf evlist: Remove ->idx() per_cpu parameterAdrian Hunter
Remove ->idx() per_cpu parameter because it isn't needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf auxtrace: Do not mix up mmap idxAdrian Hunter
The idx is with respect to evlist not evsel. That hasn't mattered because they are the same at present. Prepare for that not being the case, which it won't be when sideband tracking events are allowed on all CPUs even when auxtrace is limited to selected CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf auxtrace: Move evlist__enable_event_idx() to auxtrace.cAdrian Hunter
evlist__enable_event_idx() is used only by auxtrace. Move it to auxtrace.c in preparation for making it even more auxtrace specific. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf evlist: Use libperf functions in evlist__enable_event_idx()Adrian Hunter
evlist__enable_event_idx() is used only for auxtrace events which are never system_wide. Simplify by using libperf enable event functions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf metrics: Don't add all tool events for sharingIan Rogers
Tool events are added to the set of events for parsing so that having a tool event in a metric doesn't inhibit event sharing of events between metrics. All tool events were added but this meant unused tool events would be counted. Reduce this set of tool events to just those present in the overall metric list. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf metrics: Support all tool eventsIan Rogers
Previously duration_time was hard coded, which was ok until commit b03b89b350034f22 ("perf stat: Add user_time and system_time events") added additional tool events. Do for all tool events what was previously done just for duration_time. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf evsel: Add tool event helpersIan Rogers
Convert to and from a string. Fix evsel__tool_name() as array is off-by-1. Support more than just duration_time as a metric-id. Fixes: 75eafc970bd9d36d ("perf list: Print all available tool events") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf evsel: Constify a few arraysIan Rogers
Remove public definition of evsel__tool_names(). Not used outside util/evsel.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09Revert "perf stat: Support metrics with hybrid events"Ian Rogers
This reverts commit 60344f1a9a597f2e0efcd57df5dad0b42da15e21. Hybrid metrics place a PMU at the end of the parse string. This is also where tool events are placed. The behavior of the parse string isn't clear and so revert the change for now. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-08perf tests: Fix coresight `perf test` failure.Jeremy Linton
Currently the `perf test` always fails the coresight test like: 89: Check Arm CoreSight trace data recording and synthesized samples: FAILED! That is because the test_arm_coresight.sh is attempting to SIGINT the parent but is using $$ rather than $PPID and it sigint's itself when run under the perf test framework. Since this is done in a trap clause it ends up returning a non zero return. Since $PPID is a bash ism and not all distros are linking /bin/sh to bash, the alternative parent pid lookups are uglier than just dropping the kill, and its not strictly needed, lets pick the simple solution and drop the sigint. Fixes: 133fe2e617e48ca0 ("perf tests: Improve temp file cleanup in test_arm_coresight.sh") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Jeremy Linton <jeremy.linton@arm.com> Link: https://lore.kernel.org/r/20220428151947.290146-1-jeremy.linton@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-08perf bench: Fix two numa NDEBUG warningsIan Rogers
BUG_ON is a no-op if NDEBUG is defined, otherwise it is an assert. Compiling with NDEBUG yields: bench/numa.c: In function ‘bind_to_cpu’: bench/numa.c:314:1: error: control reaches end of non-void function [-Werror=return-type] 314 | } | ^ bench/numa.c: In function ‘bind_to_node’: bench/numa.c:367:1: error: control reaches end of non-void function [-Werror=return-type] 367 | } | ^ Add return statements to cover this case. Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428202912.1056444-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-06perf test: Add skip to --per-thread testIan Rogers
As reported in: https://lore.kernel.org/linux-perf-users/20220428122821.3652015-1-tmricht@linux.ibm.com/ the 'instructions:u' event may not be supported. Add a skip using 'perf record'. Switch some code away from pipe to make the failures clearer. Reported-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20220505182505.3313191-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-05perf cpumap: Switch to using perf_cpu_map APIIan Rogers
Switch some raw accesses to the cpu map to using the library API. This can help with reference count checking. Some BPF cases switch from index to CPU for consistency, this shouldn't matter as the CPU map is full. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220503041757.2365696-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-04perf vendor events intel: Update CLX events to v1.15Ian Rogers
Events are generated for CascadeLake Server v1.15 with events from: https://download.01.org/perfmon/CLX/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change updates descriptions, adds INST_DECODED.DECODERS and corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-03perf vendor events intel: Add uncore event list for SapphirerapidsZhengjun Xing
Add JSON uncore events for Sapphirerapids to perf. Based on JSON list v1.01: https://download.01.org/perfmon/SPR/ Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220425132211.801228-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-03perf vendor events intel: Update core event list for SapphirerapidsZhengjun Xing
Update JSON core events for Sapphirerapids to perf. Based on JSON list v1.01: https://download.01.org/perfmon/SPR/ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220425132211.801228-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-03perf tools: Use Python devtools for version autodetection rather than runtimeJames Clark
This fixes the issue where the build will fail if only the Python2 runtime is installed but the Python3 devtools are installed. Currently the workaround is 'make PYTHON=python3'. Fix it by autodetecting Python based on whether python[x]-config exists rather than just python[x] because both are needed for the build. Then -config is stripped to find the Python runtime. Testing ======= * Auto detect links with Python3 when the v3 devtools are installed and only Python 2 runtime is installed * Auto detect links with Python2 when both devtools are installed * Sensible warning is printed if no Python devtools are installed * 'make PYTHON=x' still automatically sets PYTHON_CONFIG=x-config * 'make PYTHON=x' fails if x-config doesn't exist * 'make PYTHON=python3' overrides Python2 devtools * 'make PYTHON=python2' overrides Python3 devtools * 'make PYTHON_CONFIG=x-config' works * 'make PYTHON=x PYTHON_CONFIG=x' works * 'make PYTHON=missing' reports an error * 'make PYTHON_CONFIG=missing' reports an error Fixes: 79373082fa9de8be ("perf python: Autodetect python3 binary") Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220309194313.3350126-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-03perf stat: Avoid printing cpus with no countersIan Rogers
perf_evlist's user_requested_cpus can contain CPUs not present in any evsel's cpus, for example uncore counters. Avoid printing the prefix and trailing \n until the first valid counter is encountered. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220503041757.2365696-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-30perf tools: Add missing headers needed by util/data.hYang Jihong
'struct perf_data' in util/data.h uses the "u64" data type, which is defined in "linux/types.h". If we only include util/data.h, the following compilation error occurs: util/data.h:38:3: error: unknown type name ‘u64’ u64 version; ^~~ Solution: include "linux/types.h." to add the needed type definitions. Fixes: 258031c017c353e8 ("perf header: Add DIR_FORMAT feature to describe directory data") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220429090539.212448-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-30Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes from perf/urgent. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Remove arch__symbols__fixup_end()Namhyung Kim
Now the generic code can handle kallsyms fixup properly so no need to keep the arch-functions anymore. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Update symbols__fixup_end()Namhyung Kim
Now arch-specific functions all do the same thing. When it fixes the symbol address it needs to check the boundary between the kernel image and modules. For the last symbol in the previous region, it cannot know the exact size as it's discarded already. Thus it just uses a small page size (4096) and rounds it up like the last symbol. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Pass is_kallsyms to symbols__fixup_end()Namhyung Kim
The symbol fixup is necessary for symbols in kallsyms since they don't have size info. So we use the next symbol's address to calculate the size. Now it's also used for user binaries because sometimes they miss size for hand-written asm functions. There's a arch-specific function to handle kallsyms differently but currently it cannot distinguish kallsyms from others. Pass this information explicitly to handle it properly. Note that those arch functions will be moved to the generic function so I didn't added it to the arch-functions. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf test: Add perf_event_attr test for Arm SPETimothy Hayes
Adds a perf_event_attr test for Arm SPE in which the presence of physical addresses are checked when SPE unit is run with pa_enable=1. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220421165205.117662-4-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf arm-spe: Fix SPE events with phys addressesTimothy Hayes
This patch corrects a bug whereby SPE collection is invoked with pa_enable=1 but synthesized events fail to show physical addresses. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf arm-spe: Fix addresses of synthesized SPE eventsTimothy Hayes
This patch corrects a bug whereby synthesized events from SPE samples are missing virtual addresses. Fixes: 54f7815efef7fad9 ("perf arm-spe: Fill address info for samples") Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: netdev@vger.kernel.org Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf vendor events intel: Update WSM-EX events to v3Ian Rogers
Events are generated for Westmere EX v3 with events from: https://download.01.org/perfmon/WSM-EX/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change updates descriptions. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf vendor events intel: Update WSM-EP-SP events to v3Ian Rogers
Events are generated for Westmere EP-SP v3 with events from: https://download.01.org/perfmon/WSM-EP-SP/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change updates descriptions. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf vendor events intel: Update SKX events to v1.27Ian Rogers
Events are generated for Skylake Server v1.27 with events from: https://download.01.org/perfmon/SKX/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change updates descriptions, adds INST_DECODED.DECODERS and corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf vendor events intel: Update SKL events to v53Ian Rogers
Events are generated for Skylake v53 with events from: https://download.01.org/perfmon/SKL/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change updates descriptions, adds INST_DECODED.DECODERS and corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf vendor events intel: Update IVT events to v21Ian Rogers
Events are generated for Ivytown v21 with events from: https://download.01.org/perfmon/IVT/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change fixes a spelling mistake in a description. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf vendor events intel: Update ICL events to v1.13Ian Rogers
Events are generated for Icelake v1.13 with events from: https://download.01.org/perfmon/ICL/ Using the scripts at: https://github.com/intel/event-converter-for-linux-perf/ This change updates descriptions and adds INST_DECODED.DECODERS. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220428075730.797727-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf test: Fix test case 81 ("perf record tests") on s390xThomas Richter
perf test -F 81 ("perf record tests") -v fails on s390x on the linux-next branch. The test case is x86 specific can not be executed on s390x. The test case depends on x86 register names such as: ... | egrep -q 'available registers: AX BX CX DX ....' Skip this test case on s390x. Output before: # perf test -F 81 81: perf record tests : FAILED! # Output after: # perf test -F 81 81: perf record tests : Skip # Fixes: 24f378e66021f559 ("perf test: Add basic perf record tests") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20220428122821.3652015-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf intel-pt: Fix timeless decoding with perf.data directoryAdrian Hunter
Intel PT does not capture data in separate directories, so do not use separate directory processing because it doesn't work for timeless decoding. It also looks like it doesn't support one_mmap handling. Example: Before: # perf record --kcore -a -e intel_pt/tsc=0/k sleep 0.1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.799 MB perf.data ] # perf script --itrace=bep | head # After: # perf script --itrace=bep | head perf 21073 [000] psb: psb offs: 0 ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms]) perf 21073 [000] cbr: cbr: 45 freq: 4505 MHz (161%) ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: 0 [unknown] ([unknown]) => ffffffffaa68faf6 native_write_msr+0x6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa68faf8 native_write_msr+0x8 ([kernel.kallsyms]) => ffffffffaa61aab0 pt_config_start+0x60 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61aabd pt_config_start+0x6d ([kernel.kallsyms]) => ffffffffaa61b8ad pt_event_start+0x27d ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61b8bb pt_event_start+0x28b ([kernel.kallsyms]) => ffffffffaa61ba60 pt_event_add+0x40 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61ba76 pt_event_add+0x56 ([kernel.kallsyms]) => ffffffffaa880e86 event_sched_in+0xc6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880e9b event_sched_in+0xdb ([kernel.kallsyms]) => ffffffffaa880ea5 event_sched_in+0xe5 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880eba event_sched_in+0xfa ([kernel.kallsyms]) => ffffffffaa880f96 event_sched_in+0x1d6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880fc8 event_sched_in+0x208 ([kernel.kallsyms]) => ffffffffaa880ec0 event_sched_in+0x100 ([kernel.kallsyms]) Fixes: bb6be405c4a2a5 ("perf session: Load data directory files for analysis") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20220428093109.274641-1-adrian.hunter@intel.com Cc: Ian Rogers <irogers@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-kernel@vger.kernel.org
2022-04-27perf tools: Delete perf-with-kcore.sh scriptAdrian Hunter
It has been obsolete since the introduction of the 'perf record --kcore' option. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20220427141946.269523-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-26perf intel-pt: Add link to the perf wiki's Intel PT pageAdrian Hunter
Add an EXAMPLE section and link to the perf wiki's Intel PT page. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20220426133213.248475-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes, such as the llvm one for ubuntu:22.04. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24perf stat: Support hybrid --topdown optionZhengjun Xing
Since for cpu_core or cpu_atom, they have different topdown events groups. For cpu_core, --topdown equals to: "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/, cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/, cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/, cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}" For cpu_atom, --topdown equals to: "{cpu_atom/topdown-retiring/,cpu_atom/topdown-bad-spec/, cpu_atom/topdown-fe-bound/,cpu_atom/topdown-be-bound/}" To simplify the implementation, on hybrid, --topdown is used together with --cputype. If without --cputype, it uses cpu_core topdown events by default. # ./perf stat --topdown -a sleep 1 WARNING: default to use cpu_core topdown events Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound heavy operations light operations branch mispredict machine clears fetch latency fetch bandwidth memory bound Core bound 4.1% 0.0% 5.1% 90.8% 2.3% 1.8% 0.0% 0.0% 4.2% 0.9% 9.9% 81.0% 1.002624229 seconds time elapsed # ./perf stat --topdown -a --cputype atom sleep 1 Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound 13.5% 0.1% 31.2% 55.2% 1.002366987 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220422065635.767648-3-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf test: Fix error message for test case 71 on s390, where it is not supportedThomas Richter
Test case 71 'Convert perf time to TSC' is not supported on s390. Subtest 71.1 is skipped with the correct message, but subtest 71.2 is not skipped and fails. The root cause is function evlist__open() called from test__perf_time_to_tsc(). evlist__open() returns -ENOENT because the event cycles:u is not supported by the selected PMU, for example platform s390 on z/VM or an x86_64 virtual machine. The PMU driver returns -ENOENT in this case. This error is leads to the failure. Fix this by returning TEST_SKIP on -ENOENT. Output before: 71: Convert perf time to TSC: 71.1: TSC support: Skip (This architecture does not support) 71.2: Perf time to TSC: FAILED! Output after: 71: Convert perf time to TSC: 71.1: TSC support: Skip (This architecture does not support) 71.2: Perf time to TSC: Skip (perf_read_tsc_conversion is not supported) This also happens on an x86_64 virtual machine: # uname -m x86_64 $ ./perf test -F 71 71: Convert perf time to TSC : 71.1: TSC support : Ok 71.2: Perf time to TSC : FAILED! $ Committer testing: Continues to work on x86_64: $ perf test 71 71: Convert perf time to TSC : 71.1: TSC support : Ok 71.2: Perf time to TSC : Ok $ Fixes: 290fa68bdc458863 ("perf test tsc: Fix error message when not supported") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Chengdong Li <chengdongli@tencent.com> Cc: chengdongli@tencent.com Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20220420062921.1211825-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE eventLeo Yan
Since commit bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available") "perf mem report" and "perf report --mem-mode" don't report result if the PERF_SAMPLE_DATA_SRC bit is missed in sample type. The commit ffab487052054162 ("perf: arm-spe: Fix perf report --mem-mode") partially fixes the issue. It adds PERF_SAMPLE_DATA_SRC bit for Arm SPE event, this allows the perf data file generated by kernel v5.18-rc1 or later version can be reported properly. On the other hand, perf tool still fails to be backward compatibility for a data file recorded by an older version's perf which contains Arm SPE trace data. This patch is a workaround in reporting phase, when detects ARM SPE PMU event and without PERF_SAMPLE_DATA_SRC bit, it will force to set the bit in the sample type and give a warning info. Fixes: bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: German Gomez <german.gomez@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf script: Always allow field 'data_src' for auxtraceLeo Yan
If use command 'perf script -F,+data_src' to dump memory samples with Arm SPE trace data, it reports error: # perf script -F,+data_src Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field. This is because the 'dummy:u' event is absent DATA_SRC bit in its sample type, so if a file contains AUX area tracing data then always allow field 'data_src' to be selected as an option for perf script. Fixes: e55ed3423c1bb29f ("perf arm-spe: Synthesize memory event") Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf clang: Fix header include for LLVM >= 14Guilherme Amadio
The header TargetRegistry.h has moved in LLVM/clang 14. Committer notes: The problem as noticed when building in ubuntu:22.04: 90 98.61 ubuntu:22.04 : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1) util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory 23 | #include "llvm/Support/TargetRegistry.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. Fixed after applying this patch. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Guilherme Amadio <amadio@gentoo.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088 Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf stat: Merge event counts from all hybrid PMUsZhengjun Xing
For hybrid events, by default stat aggregates and reports the event counts per pmu. # ./perf stat -e cycles -a sleep 1 Performance counter stats for 'system wide': 14,066,877,268 cpu_core/cycles/ 6,814,443,147 cpu_atom/cycles/ 1.002760625 seconds time elapsed Sometimes, it's also useful to aggregate event counts from all PMUs. Create a new option '--hybrid-merge' to enable that behavior and report the counts without PMUs. # ./perf stat -e cycles -a --hybrid-merge sleep 1 Performance counter stats for 'system wide': 20,732,982,512 cycles 1.002776793 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220422065635.767648-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf stat: Support metrics with hybrid eventsZhengjun Xing
One metric such as 'Kernel_Utilization' may be from different PMUs and consists of different events. For core, Kernel_Utilization = cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread For atom, Kernel_Utilization = cpu_clk_unhalted.core:k / cpu_clk_unhalted.core The metric group string for core is: '{cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W' It's internally expanded to: '{cpu_clk_unhalted.thread_p/metric-id=cpu_clk_unhalted.thread_p:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W#cpu_core' The metric group string for atom is: '{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W' It's internally expanded to: '{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W#cpu_atom' That means the group "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W" is from cpu_core PMU and the group "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}" is from cpu_atom PMU. And then next, check if the events in the group are valid on that PMU. If one event is not valid on that PMU, the associated group would be removed internally. In this example, cpu_clk_unhalted.thread is valid on cpu_core and cpu_clk_unhalted.core is valid on cpu_atom. So the checks for these two groups are passed. Before: # ./perf stat -M Kernel_Utilization -a sleep 1 WARNING: events in group from different hybrid PMUs! WARNING: grouped events cpus do not match, disabling group: anon group { CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD, CPU_CLK_UNHALTED.THREAD } Performance counter stats for 'system wide': 17,639,501 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 1.00 Kernel_Utilization 17,578,757 cpu_atom/CPU_CLK_UNHALTED.CORE:k/ 1,005,350,226 ns duration_time 43,012,352 cpu_core/CPU_CLK_UNHALTED.THREAD_P:k/ # 0.99 Kernel_Utilization 17,608,010 cpu_atom/CPU_CLK_UNHALTED.THREAD_P:k/ 43,608,755 cpu_core/CPU_CLK_UNHALTED.THREAD/ 17,630,838 cpu_atom/CPU_CLK_UNHALTED.THREAD/ 1,005,350,226 ns duration_time 1.005350226 seconds time elapsed After: # ./perf stat -M Kernel_Utilization -a sleep 1 Performance counter stats for 'system wide': 17,981,895 CPU_CLK_UNHALTED.CORE [cpu_atom] # 1.00 Kernel_Utilization 17,925,405 CPU_CLK_UNHALTED.CORE:k [cpu_atom] 1,004,811,366 ns duration_time 41,246,425 CPU_CLK_UNHALTED.THREAD_P:k [cpu_core] # 0.99 Kernel_Utilization 41,819,129 CPU_CLK_UNHALTED.THREAD [cpu_core] 1,004,811,366 ns duration_time 1.004811366 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220422065635.767648-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf vendor events intel: Add metrics for AlderlakeZhengjun Xing
Add JSON metrics for Alderlake to perf. It included both P-core and E-core metrics. P-core metrics based on TMA 4.3-full (TMA_Metrics-full.csv) E-core metrics based on E-core TMA 2.0 (E-core_TMA_Metrics.xlsx) They are all downloaded from: https://download.01.org/perfmon/ Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20220422065336.767582-1-zhengjun.xing@linux.intel.com Cc: irogers@google.com Cc: peterz@infradead.org Cc: adrian.hunter@intel.com Cc: alexander.shishkin@intel.com Cc: acme@kernel.org Cc: ak@linux.intel.com Cc: jolsa@redhat.com Cc: mingo@redhat.com Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org
2022-04-22perf tools: Move libbpf init in libbpf_init functionJiri Olsa
Move the libbpf init code into a single function, so that we have a single place doing that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220422100025.1469207-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf list: Print all available tool eventsFlorian Fischer
Introduce names for the new tool events 'user_time' and 'system_time'. $ perf list ... duration_time [Tool event] user_time [Tool event] system_time [Tool event] ... Committer testing: Before: $ perf list | grep Tool duration_time [Tool event] $ After: $ perf list | grep Tool duration_time [Tool event] user_time [Tool event] system_time [Tool event] $ Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220420174244.1741958-2-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf stat: Add user_time and system_time eventsFlorian Fischer
It bothered me that during benchmarking using 'perf stat' (to collect for example CPU cache events) I could not simultaneously retrieve the times spend in user or kernel mode in a machine readable format. When running 'perf stat' the output for humans contains the times reported by rusage and wait4. $ perf stat -e cache-misses:u -- true Performance counter stats for 'true': 4,206 cache-misses:u 0.001113619 seconds time elapsed 0.001175000 seconds user 0.000000000 seconds sys But 'perf stat's machine-readable format does not provide this information. $ perf stat -x, -e cache-misses:u -- true 4282,,cache-misses:u,492859,100.00,, I found no way to retrieve this information using the available events while using machine-readable output. This patch adds two new tool internal events 'user_time' and 'system_time', similarly to the already present 'duration_time' event. Both events use the already collected rusage information obtained by wait4 and tracked in the global ru_stats. Examples presenting cache-misses and rusage information in both human and machine-readable form: $ perf stat -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time . Performance counter stats for 'grep -q -r duration_time .': 67,422,542 ns duration_time:u 50,517,000 ns user_time:u 16,839,000 ns system_time:u 30,937 cache-misses:u 0.067422542 seconds time elapsed 0.050517000 seconds user 0.016839000 seconds sys $ perf stat -x, -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time . 72134524,ns,duration_time:u,72134524,100.00,, 65225000,ns,user_time:u,65225000,100.00,, 6865000,ns,system_time:u,6865000,100.00,, 38705,,cache-misses:u,71189328,100.00,, Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220420102354.468173-3-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf stat: Introduce stats for the user and system rusage timesFlorian Fischer
This is preparation for exporting rusage values as tool events. Add new global stats tracking the values obtained via rusage. For now only ru_utime and ru_stime are part of the tracked stats. Both are stored as nanoseconds to be consistent with 'duration_time', although the finest resolution the struct timeval data in rusage provides are microseconds. Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220420102354.468173-2-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>