summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2023-04-04perf list: Use relative path for tracepoint scanNamhyung Kim
Committer notes: Added missing #include <unistd.h> for the close() prototype to fix this on Alma Linux 8: 1 21.54 almalinux:8 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-16) (GCC) util/print-events.c: In function 'print_tracepoint_events': util/print-events.c:103:4: error: implicit declaration of function 'close'; did you mean 'clone'? [-Werror=implicit-function-declaration] close(evt_fd); ^~~~~ clone Also use the newly added scandirat feature test to check if that function is available, providing a HAVE_SCANDIRAT_SUPPORT conditional warning to the user if it isn't available. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04tools build: Add a feature test for scandirat(), that is not implemented so ↵Arnaldo Carvalho de Melo
far in musl and uclibc We use it just when listing tracepoint events, and for root, so just emit a warning about it to get users to ask the library maintainers to implement it, as suggested in this systemd ticket: https://github.com/systemd/casync/issues/129 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZCwv4z5Dh%2FdHUMG6@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf intel-pt: Fix CYC timestamps after standalone CBRAdrian Hunter
After a standalone CBR (not associated with TSC), update the cycles reference timestamp and reset the cycle count, so that CYC timestamps are calculated relative to that point with the new frequency. Fixes: cc33618619cefc6d ("perf tools: Add Intel PT support for decoding CYC packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf auxtrace: Fix address filter entire kernel sizeAdrian Hunter
kallsyms is not completely in address order. In find_entire_kern_cb(), calculate the kernel end from the maximum address not the last symbol. Example: Before: $ sudo cat /proc/kallsyms | grep ' [twTw] ' | tail -1 ffffffffc00b8bd0 t bpf_prog_6deef7357e7b4530 [bpf] $ sudo cat /proc/kallsyms | grep ' [twTw] ' | sort | tail -1 ffffffffc15e0cc0 t iwl_mvm_exit [iwlmvm] $ perf.d093603a05aa record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2ceba000 After: $ perf.8fb0f7a01f8e record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2e3e2000 Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/storeRob Herring
Arm SPEv1.3 adds new load/store operation subclasses for Memory Tagging Extension (MTE) and memory operations (MOPS). The memory operations are memcpy and memset. Add support for decoding these new subclasses in the raw decoding. Reviewed-by: Leo Yan <leo.yan@linaro.org Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230327162057.4057188-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packetMike Leach
When using dynamically assigned CoreSight trace IDs the drivers can output the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet. Update cs-etm decoder to handle this packet by setting the CPU/Trace ID mapping. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Update record event to use new Trace ID protocolMike Leach
Trace IDs are now dynamically allocated. Previously used the static association algorithm that is no longer used. The 'cpu * 2 + seed' was outdated and broken for systems with high core counts (>46). as it did not scale and was broken for larger core counts. Trace ID will now be sent in PERF_RECORD_AUX_OUTPUT_HW_ID record. Legacy ID algorithm renamed and retained for limited backward compatibility use. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Move mapping of Trace ID and cpu into helper functionMike Leach
The information to associate Trace ID and CPU will be changing. Drivers will start outputting this as a hardware ID packet in the data file which if present will be used in preference to the AUXINFO values. To prepare for this we provide a helper functions to do the individual ID mapping, and one to extract the IDs from the completed metadata blocks. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Show detail failure reason for BPFNamhyung Kim
It can fail to collect lock stat from BPF for various reasons. For example, I've got a report that sometimes time calculation seems wrong in case of contended spinlocks. I suspect the time delta went negative for some reason. Count them separately and show in the output like below: $ sudo perf lock contention -abE5 sleep 10 contended total wait max wait avg wait type caller 13 785.61 us 79.36 us 60.43 us spinlock remove_wait_queue+0x14 10 469.02 us 87.51 us 46.90 us spinlock prepare_to_wait+0x27 9 289.09 us 69.08 us 32.12 us spinlock finish_wait+0x36 114 251.05 us 8.56 us 2.20 us spinlock try_to_wake_up+0x1f5 132 188.63 us 5.01 us 1.43 us spinlock __wake_up_common_lock+0x62 === output for debug === bad: 1, total: 279 bad rate: 0.36 % histogram of failure reasons task: 1 stack: 0 time: 0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Fix debug stat if no contentionNamhyung Kim
It should not divide if the total number is 0. Otherwise it'd show NaN in the bad rate output. Also add a whitespace in the "output for debug" message. $ sudo perf lock contention -abv true Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols contended total wait max wait avg wait type caller === output for debug=== bad: 0, total: 0 bad rate: -nan % <------------------------- (here) histogram of events caused bad sequence acquire: 0 acquired: 0 contended: 0 release: 0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Update ivybridge and ivytownIan Rogers
Update to versions 24 and 23 respectively. Adds the event BR_MISP_EXEC.INDIRECT. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230328234142.1080045-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench numa: Fix type of loop iterator in do_work, it should be 'long'Andreas Herrmann
'j' is of type int and start/end are of type 'long'. Thus 'j' might become negative and cause segfault in access_data(). Fix it by using 'long' for 'j' as well. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Andreas Herrmann <aherrmann@suse.de> Link: https://lore.kernel.org/r/20230330074202.14052-1-aherrmann@suse.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbol: Remove unused branch_callstackAdrian Hunter
branch_callstack was added by commit 8b7bad58efb7 ("perf callchain: Support handling complete branch stacks as histograms") but never used. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf top: Add --branch-history optionAdrian Hunter
Add --branch-history option, to act the same as that option does for perf report. Example: $ cat tcallf.c volatile a = 10000, b = 100000, c; __attribute__((noinline)) f2() { c = a / b; } __attribute__((noinline)) f1() { f2(); f2(); } main() { while (1) f1(); } $ gcc -w -g -o tcallf tcallf.c $ ./tcallf & [1] 29409 $ perf top -e cycles:u -t $(pidof tcallf) --stdio --no-children --branch-history PerfTop: 3819 irqs/sec kernel: 0.0% exact: 0.0% lost: 0/0 drop: 0/0 [4000Hz cycles:u], (target_tid: 29409) -------------------------------------------------------------------------------------------------------------------- 49.01% tcallf.c:5 [.] f2 tcallf | |--24.91%--f2 tcallf.c:4 | | | |--17.14%--f1 tcallf.c:11 (cycles:1) | | f1 tcallf.c:11 | | f2 tcallf.c:6 (cycles:3) | | f2 tcallf.c:4 | | f1 tcallf.c:10 (cycles:2) | | f1 tcallf.c:9 | | main tcallf.c:16 (cycles:1) | | main tcallf.c:16 | | main tcallf.c:16 (cycles:1) | | main tcallf.c:16 | | f1 tcallf.c:12 (cycles:1) | | f1 tcallf.c:12 | | f2 tcallf.c:6 (cycles:3) | | f2 tcallf.c:4 | | f1 tcallf.c:11 (cycles:1 iter:1 avg_cycles:12) | | f1 tcallf.c:11 | | f2 tcallf.c:6 (cycles:3 iter:1 avg_cycles:12) | | f2 tcallf.c:4 | | f1 tcallf.c:10 (cycles:2 iter:1 avg_cycles:12) | | | --7.78%--f1 tcallf.c:10 (cycles:2) | f1 tcallf.c:9 | main tcallf.c:16 (cycles:1) | main tcallf.c:16 | main tcallf.c:16 (cycles:1) | main tcallf.c:16 | f1 tcallf.c:12 (cycles:1) | f1 tcallf.c:12 | f2 tcallf.c:6 (cycles:3) | f2 tcallf.c:4 | f1 tcallf.c:11 (cycles:1) | f1 tcallf.c:11 | f2 tcallf.c:6 (cycles:3) | f2 tcallf.c:4 | f1 tcallf.c:10 (cycles:2 iter:1 avg_cycles:12) | f1 tcallf.c:9 | main tcallf.c:16 (cycles:1 iter:1 avg_cycles:12) | main tcallf.c:16 | main tcallf.c:16 (cycles:1 iter:1 avg_cycles:12) ... $ pkill tcallf [1]+ Terminated ./tcallf Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf build: Conditionally define NDEBUGIan Rogers
When a build is done without DEBUG=1 then define NDEBUG. This will compile out asserts and other debug code. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf block-range: Move debug code behind ifndef NDEBUGIan Rogers
Make good on a comment and avoid a unused-but-set-variable warning. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench: Avoid NDEBUG warningIan Rogers
With NDEBUG set the asserts are compiled out. This yields "unused-but-set-variable" variables. Move these variables behind NDEBUG to avoid the warning. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events: Update Alderlake for E-Core TMA v2.3Ian Rogers
https://github.com/intel/perfmon/pull/65 Generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py The PR notes state: - E-Core TMA version 2.3. - FP_UOPS changed to FPDIV_Uops - Added BR_MISP breakdown stats - Frontend_Bandwidth/Latency changed to Fetch_Bandwidth/Latency - Load_Store_Bound changed to Memory_Bound - Icache changed to ICache_Misses - ITLB changed to ITLB_Misses - Store_Fwd changed to Store_Fwd_Blk Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230329162318.1227114-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbol: Add command line support for addr2line pathIan Rogers
Allow addr2line to be set either on the command line or via the perfconfig file. This doesn't currently work with llvm-addr2line as the addr2line code emits two things: 1) the address to decode, 2) a bogus ',' value. The expectation is the bogus value will generate: ?? ??:0 that terminates the addr2line reading. However, the output from llvm-addr2line is a single line with just the input ',' locking up the addr2line reading that is expecting a second line. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Allow objdump to be set in perfconfigIan Rogers
Allow the setting of the objdump command in the perfconfig. Update man page for this new option. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Own objdump_path and disassembler_style stringsIan Rogers
Make struct annotation_options own the strings objdump_path and disassembler_style, freeing them on exit. Add missing strdup for disassembler_style when read from a config file. Committer notes: Converted free(obj->member) to zfree(&obj->member) in annotation_options__exit() Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Add init/exit to annotation_options remove defaultIan Rogers
The annotation__default_options global variable was used to initialize annotation_options. Switch to the init/exit pattern as later changes will give ownership over strings and this will be necessary to avoid memory leaks. Committer note: Fix the GTK2=1 build, hist_entry__gtk_annotate() needs to receive a 'struct annotation_options' pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf report: Additional config warningsIan Rogers
If the default_sort_order isn't correctly strdup-ed warn and return an error. Debug warn if no option is matched. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf annotate: Delete session for debug buildsIan Rogers
Use the debug build indicator as the guide to free the session. This implements a behavior described in a comment, which is consequentially removed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf tools: Avoid warning in do_realloc_array_as_needed()Adrian Hunter
do_realloc_array_as_needed() used memcpy() of zero size with a NULL pointer. Check the size first to avoid sanitize warning. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbols: Fix unaligned access in get_x86_64_plt_disp()Adrian Hunter
Use memcpy() to avoid unaligned access. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf symbols: Fix use-after-free in get_plt_got_name()Adrian Hunter
Fix use-after-free in get_plt_got_name(). Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/oe-lkp/202303061424.6ad43294-yujie.liu@intel.com Link: https://lore.kernel.org/r/20230316194156.8320-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events power9: Remove UTF-8 characters from JSON filesKajol Jain
Commit 3c22ba5243040c13 ("perf vendor events powerpc: Update POWER9 events") added and updated power9 PMU JSON events. However some of the JSON events which are part of other.json and pipeline.json files, contains UTF-8 characters in their brief description. Having UTF-8 character could breaks the perf build on some distros. Fix this issue by removing the UTF-8 characters from other.json and pipeline.json files. Result without the fix: [command]# file -i pmu-events/arch/powerpc/power9/* pmu-events/arch/powerpc/power9/cache.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/floating-point.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/frontend.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/marked.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/memory.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/nest_metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/other.json: application/json; charset=utf-8 pmu-events/arch/powerpc/power9/pipeline.json: application/json; charset=utf-8 pmu-events/arch/powerpc/power9/pmc.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/translation.json: application/json; charset=us-ascii [command]# Result with the fix: [command]# file -i pmu-events/arch/powerpc/power9/* pmu-events/arch/powerpc/power9/cache.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/floating-point.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/frontend.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/marked.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/memory.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/nest_metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/other.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/pipeline.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/pmc.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/translation.json: application/json; charset=us-ascii [command]# Fixes: 3c22ba5243040c13 ("perf vendor events powerpc: Update POWER9 events") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/lkml/ZBxP77deq7ikTxwG@kernel.org/ Link: https://lore.kernel.org/r/20230328112908.113158-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf ftrace: Make system wide the default target for latency subcommandYang Jihong
If no target is specified for 'latency' subcommand, the execution fails because - 1 (invalid value) is written to set_ftrace_pid tracefs file. Make system wide the default target, which is the same as the default behavior of 'trace' subcommand. Before the fix: # perf ftrace latency -T schedule failed to set ftrace pid After the fix: # perf ftrace latency -T schedule ^C# DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 us | 0 | | 2 - 4 us | 0 | | 4 - 8 us | 2828 | #### | 8 - 16 us | 23953 | ######################################## | 16 - 32 us | 408 | | 32 - 64 us | 318 | | 64 - 128 us | 4 | | 128 - 256 us | 3 | | 256 - 512 us | 0 | | 512 - 1024 us | 1 | | 1 - 2 ms | 4 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 4 | | 256 - 512 ms | 2 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | Fixes: 53be50282269b46c ("perf ftrace: Add 'latency' subcommand") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230324032702.109964-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf bench syscall: Add fork syscall benchmarkTiezhu Yang
This is a follow up patch for the execve bench which is actually fork + execve, it makes sense to add the fork syscall benchmark to compare the execve part precisely. Some archs have no __NR_fork definition which is used only as a check condition to call test_fork(), let us just define it as -1 to avoid build error. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: loongson-kernel@lists.loongnix.cn Link: https://lore.kernel.org/r/1679381821-22736-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf stat: Suppress warning when using cpum_cf events on s390Thomas Richter
Running command perf stat -vv -e cpu_cycles -C0 -- true displays this warning: Attempting to add event pmu 'cpum_cf' with 'cpu_cycles,' that may result in non-fatal errors Make the PMU cpum_cf selectable and avoid this warning. While at it also fix this warning for PMUs pai_crypto and pai_ext. Output before: # ./perf stat -vv -e cpu_cycles -C0 -- true Using CPUID IBM,3931,704,A01,3.7,002f Attempting to add event pmu 'cpum_cf' with 'cpu_cycles,' that may result in non-fatal errors After aliases, add event pmu 'cpum_cf' with 'event,' that may result in non-fatal errors cpu_cycles -> cpum_cf/event=0/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 10 size 128 config 0x1001 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 cpu_cycles: 0: 290434 2479172 2479172: cpu_cycles: 290434 2479172 2479172 Performance counter stats for 'CPU(s) 0': 290,434 cpu_cycles 0.002465617 seconds time elapsed # Now the warning "Attempting to add event pmu 'cpum_cf' ..." does not show up anymore. Output after: # ./perf stat -vv -e cpu_cycles -C0 -- true Using CPUID IBM,3931,704,A01,3.7,002f After aliases, add event pmu 'cpum_cf' with 'event,' that may result in non-fatal errors cpu_cycles -> cpum_cf/event=0/ Control descriptor is not initialized .... Performance counter stats for 'CPU(s) 0': 357,023 cpu_cycles 0.002454995 seconds time elapsed # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20230316074946.41110-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf tests record_offcpu.sh: Fix redirection of stderr to stdinPatrice Duroux
It's not 2&>1, the correct is 2>&1 Fixes: ade1d0307b2fb3d9 ("perf offcpu: Update offcpu test for child process") Signed-off-by: Patrice Duroux <patrice.duroux@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230303193058.21274-1-patrice.duroux@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Update metrics to detect pmem at runtimeIan Rogers
By detecting whether nvdimms are installed at runtime the number of events can be reduced if it isn't. These changes come from this PR: https://github.com/intel/perfmon/pull/63 Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf metrics: Add has_pmem literalIan Rogers
Add literal so that if nvdimms aren't installed we can record fewer events. The file detection mechanism was suggested by Dan Williams <dan.j.williams@intel.com> in: https://lore.kernel.org/linux-perf-users/641bbe1eced26_1b98bb29440@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Sandybridge v19 eventsIan Rogers
Adds BR_MISP_EXEC.INDIRECT event. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Jaketown v23 eventsIan Rogers
Adds BR_MISP_EXEC.INDIRECT event. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Haswellx v27 eventsIan Rogers
Updates descriptions and encodings. Adds BR_MISP_EXEC.INDIRECT events. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Haswell v33 eventsIan Rogers
Updates descriptions and encodings. Adds BR_MISP_EXEC.INDIRECT events. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Broadwellx v20 eventsIan Rogers
Updates descriptions and encodings. Adds BR_MISP_EXEC.INDIRECT events. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Broadwellde v9 eventsIan Rogers
Updates descriptions and encodings. Adds BR_MISP_EXEC.INDIRECT events. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events intel: Broadwell v27 eventsIan Rogers
Description updates and formatting changes. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230324072218.181880-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf lock contention: Fix msan issue in lock_contention_read()Namhyung Kim
I got a report of a msan failure like below: $ sudo perf lock con -ab -- sleep 1 ... ==224416==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5651160d6c96 in lock_contention_read util/bpf_lock_contention.c:290:8 #1 0x565115f90870 in __cmd_contention builtin-lock.c:1919:3 #2 0x565115f90870 in cmd_lock builtin-lock.c:2385:8 #3 0x565115f03a83 in run_builtin perf.c:330:11 #4 0x565115f03756 in handle_internal_command perf.c:384:8 #5 0x565115f02d53 in run_argv perf.c:428:2 #6 0x565115f02d53 in main perf.c:562:3 #7 0x7f43553bc632 in __libc_start_main #8 0x565115e865a9 in _start It was because the 'key' variable is not initialized. Actually it'd be set by bpf_map_get_next_key() but msan didn't seem to understand it. Let's make msan happy by initializing the variable. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230324001922.937634-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf vendor events s390: Remove UTF-8 characters from JSON fileThomas Richter
Commit 7f76b31130680fb3 ("perf list: Add IBM z16 event description for s390") contains the verbal description for z16 extended counter set. However some entries of the public description contain UTF-8 characters which breaks the build on some distros. Fix this and remove the UTF-8 characters. Fixes: 7f76b31130680fb3 ("perf list: Add IBM z16 event description for s390") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/ZBwkl77/I31AQk12@osiris Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf hist: Improve srcfile sort key performance (really)Namhyung Kim
The earlier commit f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") updated the srcfile logic but missed to change the ->cmp() callback which is called for every sample. It should use the same logic like in the srcline to speed up the processing because it'd return the same information repeatedly for the same address. The real processing will be done in sort__srcfile_collapse(). Fixes: f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230323025005.191239-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf test: Fix wrong size expectation for 'Setup struct perf_event_attr'Thomas Richter
The test case "perf test 'Setup struct perf_event_attr'" is failing. On s390 this output is observed: # ./perf test -Fvvvv 17 17: Setup struct perf_event_attr : --- start --- running './tests/attr/test-stat-C0' Using CPUID IBM,8561,703,T01,3.6,002f ..... Event event:base-stat fd = 1 group_fd = -1 flags = 0|8 cpu = * type = 0 size = 128 <<<--- wrong, specified in file base-stat config = 0 sample_period = 0 sample_type = 65536 ... 'PERF_TEST_ATTR=/tmp/tmpgw574wvg ./perf stat -o \ /tmp/tmpgw574wvg/perf.data -e cycles -C 0 kill >/dev/null \ 2>&1 ret '1', expected '1' loading result events Event event-0-0-4 fd = 4 group_fd = -1 cpu = 0 pid = -1 flags = 8 type = 0 size = 136 <<<--- actual size used in system call ..... compare matching [event-0-0-4] to [event:base-stat] [cpu] 0 * [flags] 8 0|8 [type] 0 0 [size] 136 128 ->FAIL match: [event-0-0-4] matches [] expected size=136, got 128 FAILED './tests/attr/test-stat-C0' - match failure This mismatch is caused by commit 09519ec3b19e ("perf: Add perf_event_attr::config3") which enlarges the structure perf_event_attr by 8 bytes. Fix this by adjusting the expected value of size. Output after: # ./perf test -Fvvvv 17 17: Setup struct perf_event_attr : --- start --- running './tests/attr/test-stat-C0' Using CPUID IBM,8561,703,T01,3.6,002f ... matched compare matching [event-0-0-4] to [event:base-stat] [cpu] 0 * [flags] 8 0|8 [type] 0 0 [size] 136 136 .... ->OK match: [event-0-0-4] matches ['event:base-stat'] matched Fixes: 09519ec3b19e4144 ("perf: Add perf_event_attr::config3") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Rob Herring <robh@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230322094731.1768281-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf build: Add warning for when vmlinux.h generation failsIan Rogers
The warning advises on the NO_BPF_SKEL=1 option. Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230322183108.1380882-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf report: Append inlines to non-DWARF callchainsArtem Savkov
Append information about inlined functions to FP and LBR callchains from DWARF debuginfo when available. Do so by calling append_inlines() from add_callchain_ip(). Testing it: Frame-pointer mode recorded with 'perf record --call-graph=fp --freq=max -- ./a.out' #include <stdio.h> #include <stdint.h> static __attribute__((noinline)) uint32_t func5(uint32_t i) { return i + 10; } static uint32_t func4(uint32_t i) { return func5(i + 5); } static inline uint32_t func3(uint32_t i) { return func4(i + 4); } static __attribute__((noinline)) uint32_t func2(uint32_t i) { return func3(i + 3); } static uint32_t func1(uint32_t i) { return func2(i + 2); } __attribute__((noinline)) uint64_t entry(void) { uint64_t ret = 0; uint32_t i = 0; for (i = 0; i < 1000000; i++) { ret += func1(i); ret -= func2(i); ret += func3(i); ret += func4(i); ret -= func5(i); } return ret; } int main(int argc, char **argv) { printf("%s\n", __func__); return entry(); } ====== Here is the output I get with '--call-graph callee --no-children' ====== # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 250 of event 'cycles:u' # Event count (approx.): 26819859 # # Overhead Command Shared Object Symbol # ........ ....... .................... ..................................... # 43.58% a.out a.out [.] func5 | |--28.93%--entry | main | __libc_start_call_main | --14.65%--func4 (inlined) | |--10.45%--entry | main | __libc_start_call_main | --4.20%--func3 (inlined) entry main __libc_start_call_main 38.80% a.out a.out [.] entry | |--23.27%--func4 (inlined) | | | |--20.28%--func3 (inlined) | | func2 | | main | | __libc_start_call_main | | | --2.99%--entry | main | __libc_start_call_main | |--8.17%--func5 | main | __libc_start_call_main | |--3.89%--func1 (inlined) | entry | main | __libc_start_call_main | --3.48%--entry main __libc_start_call_main 13.07% a.out a.out [.] func2 | ---func5 main __libc_start_call_main 1.54% a.out [unknown] [k] 0xffffffff81e011b7 1.16% a.out [unknown] [k] 0xffffffff81e00193 | --0.57%--__mmap64 (inlined) __mmap64 (inlined) 0.34% a.out ld-linux-x86-64.so.2 [.] __tunable_get_val 0.34% a.out ld-linux-x86-64.so.2 [.] strcmp 0.32% a.out libc.so.6 [.] strchr 0.31% a.out ld-linux-x86-64.so.2 [.] _dl_relocate_object 0.22% a.out ld-linux-x86-64.so.2 [.] _dl_init_paths 0.18% a.out ld-linux-x86-64.so.2 [.] get_common_cache_info.constprop.0 0.14% a.out ld-linux-x86-64.so.2 [.] __GI___tunables_init # # (Tip: Show individual samples with: perf script) # ====== It does not seem to be out of order, or at least it is consistent with what I get with dwarf unwinders. Committer notes: Adrian Hunter pointed out that this breaks --branch-history, so don't do it for branches, see the second Link below. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: <asavkov@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230316133557.868731-2-asavkov@redhat.com Link: https://lore.kernel.org/r/54129783-2960-84e1-05e9-97ac70ffb432@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-23sh: remove sh5/sh64 last fragmentsRandy Dunlap
A previous patch removed most of the sh5 (sh64) support from the kernel tree. Now remove the last stragglers. Fixes: 37744feebc08 ("sh: remove sh5 support") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: linux-sh@vger.kernel.org Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20230306040037.20350-6-rdunlap@infradead.org Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2023-03-21perf tools: Add support for perf_event_attr::config3Rob Herring
perf_event_attr has gained a new field, config3, so add support for it extending the existing configN support. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220914-arm-perf-tool-spe1-2-v2-v5-2-2cf5210b2f77@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-21perf vendor events arm64: Add N1 metricsJames Clark
Generated from the telemetry solution repo[1] with this command: ./generate.py <linux-repo>/tools/perf/ --telemetry-files \ ../../data/pmu/cpu/neoverse/neoverse-n1.json Since this data source now includes the SPE events for N1, it has diverged from A76 which means the folder has to be split. The new data also uses more fine grained grouping, but this will be consistent for all future products. Long PublicDescriptions are now included even for common events because this can include product specific details. For non verbose mode the common BriefDescriptions remain the same. [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: renyu.zj@linux.alibaba.com Link: https://lore.kernel.org/r/20230320114601.524958-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>