summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-11-10perf header: Additional note on AMD IBS for max_precise pmu capArnaldo Carvalho de Melo
x86 core PMU exposes supported maximum precision level via max_precise PMU capability. Although, AMD core PMU does not support precise mode, certain core PMU events with precise_ip > 0 are allowed and forwarded to IBS OP PMU. Display a note about this in the 'perf report' header output and document the details in the perf-list man page. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ross Zwisler <zwisler@chromium.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09tools: Disable __packed attribute compiler warning due to -Werror=attributesArnaldo Carvalho de Melo
Noticed on several perf tools cross build test containers: [perfbuilder@five ~]$ grep FAIL ~/dm.log/summary 19 10.18 debian:experimental-x-mips : FAIL gcc version 12.3.0 (Debian 12.3.0-6) 20 11.21 debian:experimental-x-mips64 : FAIL gcc version 12.3.0 (Debian 12.3.0-6) 21 11.30 debian:experimental-x-mipsel : FAIL gcc version 12.3.0 (Debian 12.3.0-6) 37 12.07 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 42 11.91 ubuntu:18.04-x-riscv64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 44 13.17 ubuntu:18.04-x-sh4 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 45 12.09 ubuntu:18.04-x-sparc64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) [perfbuilder@five ~]$ In file included from util/intel-pt-decoder/intel-pt-pkt-decoder.c:10: /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h: In function 'get_unaligned_le16': /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h:13:29: error: packed attribute causes inefficient alignment for 'x' [-Werror=attributes] 13 | const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ | ^ /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h:27:28: note: in expansion of macro '__get_unaligned_t' 27 | return le16_to_cpu(__get_unaligned_t(__le16, p)); | ^~~~~~~~~~~~~~~~~ This comes from the kernel, where the -Wattributes and -Wpacked isn't used, -Wpacked is already disabled, do it for the attributes as well. Fixes: a91c987254651443 ("perf tools: Add get_unaligned_leNN()") Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/7c5b626c-1de9-4c12-a781-e44985b4a797@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf bpf: Don't synthesize BPF events when disabledIan Rogers
If BPF sideband events are disabled on the command line, don't synthesize BPF events too. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Song Liu <song@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231102175735.2272696-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf test: Add support for setting objdump binary via perf configJames Clark
Add a 'perf config' variable that does the same thing as "perf test --objdump <x>". Also update the man page. Committer testing: # perf config test.objdump # perf test "object code reading" 26: Object code reading : Ok # perf config test.objdump=blah # perf config test.objdump test.objdump=blah # perf test "object code reading" 26: Object code reading : FAILED! # perf test -v "object code reading" 26: Object code reading : --- start --- test child forked, pid 600599 Looking at the vmlinux_path (8 entries long) Using /proc/kcore for kernel data Using /proc/kallsyms for symbols Parsing event 'cycles' Using CPUID AuthenticAMD-25-21-0 mmap size 528384B Reading object code for memory address: 0x4d9a02 File is: /home/acme/bin/perf On file address is: 0xd9a02 Objdump command is: blah -z -d --start-address=0x4d9a02 --stop-address=0x4d9a82 /home/acme/bin/perf objdump read too few bytes: 128 Bytes read differ from those read by objdump buf1 (dso): 0x48 0x85 0xff 0x74 0x29 0xe8 0x94 0xdf 0x07 0x00 0x8b 0x73 0x1c 0x48 0x8b 0x43 0x08 0xeb 0xa5 0x0f 0x1f 0x00 0x48 0x8b 0x45 0xe8 0x64 0x48 0x2b 0x04 0x25 0x28 0x00 0x00 0x00 0x75 0x0f 0x48 0x8b 0x5d 0xf8 0xc9 0xc3 0x0f 0x1f 0x00 0x48 0x8b 0x43 0x08 0xeb 0x84 0xe8 0xc5 0x3e 0xf3 0xff 0x0f 0x1f 0x44 0x00 0x00 0x55 0x48 0x89 0xe5 0x41 0x56 0x41 0x55 0x49 0x89 0xd5 0x41 0x54 0x49 0x89 0xfc 0x53 0x48 0x89 0xf3 0x48 0x83 0xec 0x30 0x48 0x8b 0x7e 0x20 0x64 0x48 0x8b 0x04 0x25 0x28 0x00 0x00 0x00 0x48 0x89 0x45 0xd8 0x31 0xc0 0x48 0x89 0x75 0xb0 0x48 0xc7 0x45 0xb8 0x00 0x00 0x00 0x00 0x48 0xc7 0x45 0xc0 0x00 0x00 0x00 0x00 0xe8 0xad 0xfa buf2 (objdump): 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 test child finished with -1 ---- end ---- Object code reading: FAILED! # perf config test.objdump=/usr/bin/objdump # perf config test.objdump test.objdump=/usr/bin/objdump # perf test "object code reading" 26: Object code reading : Ok # Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Fangrui Song <maskray@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yonghong Song <yhs@fb.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20231106151051.129440-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf test: Add option to change objdump binaryJames Clark
All of the other Perf subcommands that use objdump have an option to specify the binary, so add the same option to 'perf test'. This is useful if you have built the kernel with a different toolchain to the system one, where the system objdump may fail to disassemble vmlinux. Now this can be fixed with something like this: $ perf test --objdump llvm-objdump "object code reading" Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Fangrui Song <maskray@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20231106151051.129440-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf tests offcpu: Adjust test case perf record offcpu profiling tests for s390Thomas Richter
On s390 using linux-next the test case: 87: perf record offcpu profiling tests fails. The root cause is this command # ./perf record --off-cpu -e dummy -- ./perf bench sched messaging -l 10 # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.231 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.077 MB perf.data (401 samples) ] # It does not generate 800+ sample entries, on s390 usually around 40[1-9], sometimes a few more, but never more than 450. The higher the number of CPUs the lower the number of samples. Looking at function chain: bench_sched_messaging() +--> group() the senders and receiver threads are created. The senders and receivers call function ready() which writes one bytes and wait for a reply using poll system() call. As context switches are counted, the function ready() will trigger a context switch when no input data is available after the write system call. The write system call does not trigger context switches when the data size is small. And writing 1000 bytes (10 iterations with 100 bytes) is not much and certainly won't block. The 400+ context switch on s390 occur when the some receiver/sender threads call ready() and wait for the response from function bench_sched_messaging() being kicked off. Lower the number of expected context switches to 400 to succeed on s390. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20231106091627.2022530-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf tools: Add the python_ext_build directory to .gitignoreYang Jihong
`python_ext_build` is the build directory for python.so, ignore it for cleaner git status. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231030111438.1357962-2-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf tests attr: Fix spelling mistake "whic" to "which"zhaimingbing
There is a spelling mistake, Please fix it. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231030075825.3701-1-zhaimingbing@cmss.chinamobile.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf annotate: Move offsets array from 'struct annotation' to 'struct ↵Namhyung Kim
annotated_source' The offsets array keeps pointers to 'struct annotation_line' entries which are available in the 'struct annotated_source'. Let's move it to there. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231103191907.54531-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf annotate: Move some source code related fields from 'struct annotation' ↵Namhyung Kim
to 'struct annotated_source' Some fields in the 'struct annotation' are only used with 'struct annotated_source' so better to be moved there in order to reduce memory consumption for other symbols. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231103191907.54531-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf annotate: Move max_coverage from 'struct annotation' to 'struct ↵Namhyung Kim
annotated_branch' The max_coverage field is only used when branch stack info is available so it'd be natural to move to 'struct annotated_branch'. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231103191907.54531-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf annotate: Split branch stack cycles info from 'struct annotation'Namhyung Kim
The cycles info is only meaningful when sample has branch stacks. To save the memory for normal cases, move those fields to a new 'struct annotated_branch' and dynamically allocate it when needed. Also move cycles_hist from annotated_source as it's related here. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231103191907.54531-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf annotate: Split branch stack cycles information out of 'struct ↵Namhyung Kim
annotation_line' The cycles info is used only when branch stack is provided. Separate them from 'struct annotation_line' into a separate struct and lazy allocate them to save some memory. Committer notes: Make annotation__compute_ipc() check if the lazy allocation works, bailing out if so, its callers already do error checking and propagation. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231103191907.54531-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf test: Simplify "object code reading" testNamhyung Kim
It tries cycles (or cpu-clock on s390) event with exclude_kernel bit to open. But other arch on a VM can fail with the hardware event and need to fallback to the software event in the same way. So let's get rid of the cpuid check and use generic fallback mechanism using an array of event candidates. Now event in the odd index excludes the kernel so use that for the return value. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20231103195541.67788-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf machine thread: Remove exited threads by defaultIan Rogers
'struct thread' values hold onto references to mmaps, DSOs, etc. When a thread exits it is necessary to clean all of this memory up by removing the thread from the machine's threads. Some tools require this doesn't happen, such as auxtrace events, 'perf report' if offcpu events exist or if a task list is being generated, so add a 'struct symbol_conf' member to make the behavior optional. When an exited thread is left in the machine's threads, mark it as exited. This change relates to commit 40826c45eb0b8856 ("perf thread: Remove notion of dead threads") . Dead threads were removed as they had a reference count of 0 and were difficult to reason about with the reference count checker. Here a thread is removed from threads when it exits, unless via symbol_conf the exited thread isn't remove and is marked as exited. Reference counting behaves as it normally does. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231102175735.2272696-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf record: Lazy load kernel symbolsIan Rogers
Commit 5b7ba82a75915e73 ("perf symbols: Load kernel maps before using") changed it so that loading a kernel DSO would cause the symbols for the DSO to be eagerly loaded. For 'perf record' this is overhead as the symbols won't be used. Add a field to 'struct symbol_conf' to control the behavior and disable it for 'perf record' and 'perf inject'. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231102175735.2272696-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf tools: Fix spelling mistake "parametrized" -> "parameterized"Colin Ian King
There are spelling mistakes in comments and a pr_debug message. Fix them. Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231003074911.220216-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09perf tools: Add branch counter knobKan Liang
Add a new branch filter, "counter", for the branch counter option. It is used to mark the events which should be logged in the branch. If it is applied with the -j option, the counters of all the events should be logged in the branch. If the legacy kernel doesn't support the new branch sample type, switching off the branch counter filter. The stored counter values in each branch are displayed right after the regular branch stack information via perf report -D. Usage examples: # perf record -e "{branch-instructions,branch-misses}:S" -j any,counter Only the first event, branch-instructions, collect the LBR. Both branch-instructions and branch-misses are marked as logged events. The occurrences information of them can be found in the branch stack extension space of each branch. # perf record -e "{cpu/branch-instructions,branch_type=any/,cpu/branch-misses,branch_type=counter/}" Only the first event, branch-instructions, collect the LBR. Only the branch-misses event is marked as a logged event. Committer notes: I noticed 'perf test "Sample parsing"' failing, reported to the list and Kan provided a patch that checks if the evsel has a leader and that evsel->evlist is set, the comment in the source code further explains it. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tinghao Zhang <tinghao.zhang@intel.com> Link: https://lore.kernel.org/r/20231025201626.3000228-8-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-06perf header: Support num and width of branch countersKan Liang
To support the branch counters feature, the information of the maximum number of supported counters and the width of the counters is exposed in the sysfs caps folder. The perf tool can use the information to parse the logged counters in each branch. Store the information in the perf_env for later usage. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tinghao Zhang <tinghao.zhang@intel.com> Link: https://lore.kernel.org/r/20231025201626.3000228-7-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-06tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernelKan Liang
Sync the new sample type for the branch counters feature. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tinghao Zhang <tinghao.zhang@intel.com> Link: https://lore.kernel.org/r/20231025201626.3000228-6-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-03perf build: Warn about missing libelf before warning about missing libbpfArnaldo Carvalho de Melo
As libelf is a requirement for libbpf if it is not available, as in some container build tests where NO_LIBELF=1 is used, then better warn about the most basic library first. Ditto for libz, check its availability before libbpf too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZUEehyDk0FkPnvMR@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-03perf tests make: Remove the last egrep call, use 'grep -E' insteadArnaldo Carvalho de Melo
One last case, caught while testing with amazonlinux:2, centos:stream, etc: 4 7.28 amazonlinux:2 : FAIL egrep: warning: egrep is obsolescent; using grep -E gcc version 7.3.1 20180712 (Red Hat 7.3.1-17) (GCC) 8 13.87 centos:stream : FAIL egrep: warning: egrep is obsolescent; using grep -E Reviewed-by: Guilherme Amadio <amadio@gentoo.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZUEdtblE8qDAQkBK@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-10-31perf beauty socket/prctl_option: Cope with extended regexp complaint by grepArnaldo Carvalho de Melo
Noticed on fedora 38, the extended regexp that so far was ok for both grep and sed now gets complaints by grep, that says '/' doesn't need to be escaped with '\'. So stop using '/' in sed, use '%' instead and remove the \ before / in the common extended regexp. Link: https://x.com/SMT_Solvers/status/1710380010098344192?s=20 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZUEddFPTJHVLhH%2F6@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-10-30Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-nextNamhyung Kim
To get the latest fixes in the perf tools including perf stat output, dlfilter and LLVM feature detection. Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update tsx_cycles_per_elision metricsIan Rogers
Update tsx_cycles_per_elision as per: https://github.com/intel/perfmon/pull/116 Prefer the el-start event rather than cycles-t for detecting whether the metric will work as HLE may be disabled. Remove the metric from sapphirerapids that has no el-start event. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update bonnell version number to v5Ian Rogers
Spelling fixes were already incorporated in the Linux perf tree, update the version number to reflect this. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update westmereex events to v4Ian Rogers
Update westmereex events from v3 to v4 fixing a spelling issue. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update meteorlake events to v1.06Ian Rogers
Update meteorlake from v1.04 to v1.06 adding the changes from: https://github.com/intel/perfmon/commit/bc84df043091ec7c98c0629f3d074d9d7a108194 https://github.com/intel/perfmon/commit/405d3ee987d756b5b5d9a64d8a8fa77559822ecf Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update knightslanding events to v16Ian Rogers
Update knightslanding from v10 to v16 adding the changes from: https://github.com/intel/perfmon/commit/6c1f169f6ed63ee1fd75ebb303d0fd06d71196f5 https://github.com/intel/perfmon/commit/b22ca587ec8b5ac20471ea2f14924f63e63afe9d https://github.com/intel/perfmon/commit/e685286f083ee81cb7dafd0cd8546c79ee433187 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Add typo fix for ivybridge FPIan Rogers
Add a missed space. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update a spelling in haswell/haswellxIan Rogers
The spelling of "in-flight" was switched to "inflight". Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update emeraldrapids to v1.01Ian Rogers
Update emeraldrapids to v1.01 from v1.00 adding the changes from: https://github.com/intel/perfmon/commit/3993b600e032a9fd443ffd828aab73de7cb167e5 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-28perf vendor events intel: Update alderlake/alderlake events to v1.23Ian Rogers
Update alderlake and alderlaken events from v1.21 to v1.23 adding the changes from: https://github.com/intel/perfmon/commit/8df4db9433a2aab59dbbac1a70281032d1af7734 https://github.com/intel/perfmon/commit/846bd247c6e04acc572ca56c992e9e65852bbe63 The tsx_cycles_per_elision metric is updated from PR: https://github.com/intel/perfmon/pull/116 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-27perf build: Disable BPF skeletons if clang version is < 12.0.1Arnaldo Carvalho de Melo
While building on a wide range of distros and clang versions it was noticed that at least version 12.0.1 (noticed on Alpine 3.15 with "Alpine clang version 12.0.1") is needed to not fail with BTF generation errors such as: Debian:10 Debian clang version 11.0.1-2~deb10u1: CLANG /tmp/build/perf/util/bpf_skel/.tmp/sample_filter.bpf.o <SNIP> GENSKEL /tmp/build/perf/util/bpf_skel/sample_filter.skel.h libbpf: failed to find BTF for extern 'bpf_cast_to_kern_ctx' [21] section: -2 Error: failed to open BPF object file: No such file or directory make[2]: *** [Makefile.perf:1121: /tmp/build/perf/util/bpf_skel/sample_filter.skel.h] Error 254 make[2]: *** Deleting file '/tmp/build/perf/util/bpf_skel/sample_filter.skel.h' Amazon Linux 2: clang version 11.1.0 (Amazon Linux 2 11.1.0-1.amzn2.0.2) GENSKEL /tmp/build/perf/util/bpf_skel/sample_filter.skel.h libbpf: elf: skipping unrecognized data section(18) .eh_frame libbpf: elf: skipping relo section(19) .rel.eh_frame for section(18) .eh_frame libbpf: failed to find BTF for extern 'bpf_cast_to_kern_ctx' [21] section: -2 Error: failed to open BPF object file: No such file or directory make[2]: *** [/tmp/build/perf/util/bpf_skel/sample_filter.skel.h] Error 254 make[2]: *** Deleting file `/tmp/build/perf/util/bpf_skel/sample_filter.skel.h' Ubuntu 20.04: clang version 10.0.0-4ubuntu1 CLANG /tmp/build/perf/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o GENSKEL /tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h GENSKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h libbpf: sec '.reluprobe': corrupted symbol #27 pointing to invalid section #65522 for relo #0 GENSKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h Error: failed to open BPF object file: BPF object format invalid make[2]: *** [Makefile.perf:1121: /tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h] Error 95 make[2]: *** Deleting file '/tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h' So check if the version is at least 12.0.1 otherwise disable building BPF skels and provide a message about it, continuing the build. The message, when running on amazonlinux:2: Makefile.config:698: Warning: Disabled BPF skeletons as reliable BTF generation needs at least clang version 12.0.1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/ZTvGx/Ou6BVnYBqi@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-27perf callchain: Fix spelling mistake "statisitcs" -> "statistics"Colin Ian King
There are a couple of spelling mistakes in perror messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231027084633.1167530-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-27perf report: Fix spelling mistake "heirachy" -> "hierarchy"Colin Ian King
There is a spelling mistake in a ui error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231027084011.1167091-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-27perf python: Fix binding linkage due to rename and move of ↵Arnaldo Carvalho de Melo
evsel__increase_rlimit() The changes in ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile") ended up breaking the python binding that now references the rlimit__increase_nofile function, add the util/rlimit.o to the tools/perf/util/python-ext-sources to cure that. This was detected by the 'perf test python' regression test: $ perf test python 14: 'import perf' in python : FAILED! $ perf test -v python Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 14: 'import perf' in python : --- start --- test child forked, pid 2912462 python usage test: "echo "import sys ; sys.path.insert(0, '/tmp/build/perf-tools-next/python'); import perf" | '/usr/bin/python3' " Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: rlimit__increase_nofile test child finished with -1 ---- end ---- 'import perf' in python: FAILED! $ Fixes: e093a222d7cba1eb ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile") Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/lkml/ZTrCS5Z3PZAmfPdV@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-26perf tests: test_arm_coresight: Simplify source iterationJames Clark
There are two reasons to do this, firstly there is a shellcheck warning in cs_etm_dev_name(), which can be completely deleted. And secondly the current iteration method doesn't support systems with both ETE and ETM because it picks one or the other. There isn't a known system with this configuration, but it could happen in the future. Iterating over all the sources for each CPU can be done by going through /sys/bus/event_source/devices/cs_etm/cpu* and following the symlink back to the Coresight device in /sys/bus/coresight/devices. This will work whether the device is ETE, ETM or any future name, and is much simpler and doesn't require any hard coded version numbers Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: tianruidong@linux.alibaba.com Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: atrajeev@linux.vnet.ibm.com Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231023131550.487760-1-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-26perf vendor events intel: Add tigerlake two metricsIan Rogers
Add tma_info_system_socket_clks and uncore_freq metrics. The associated converter script fix is in: https://github.com/intel/perfmon/pull/112 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926205948.1399594-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-26perf vendor events intel: Add broadwellde two metricsIan Rogers
Add tma_info_system_socket_clks and uncore_freq metrics that require a broadwellx style uncore event for UNC_CLOCK. The associated converter script fix is in: https://github.com/intel/perfmon/pull/112 Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926205948.1399594-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metricIan Rogers
Broadwell-de has a consumer core and server uncore. The uncore_arb PMU isn't present and the broadwellx style cbox PMU should be used instead. Fix the tma_info_system_dram_bw_use metric to use the server metric rather than client. The associated converter script fix is in: https://github.com/intel/perfmon/pull/111 Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926031034.1201145-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exitIan Rogers
Fix leak where mem_info__put wouldn't release the maps/map as used by perf mem. Add exit functions and use elsewhere that the maps and map are released. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf callchain: Minor layout changes to callchain_listIan Rogers
Avoid 6 byte hole for padding. Place more frequently used fields first in an attempt to use just 1 cacheline in the common case. Before: ``` struct callchain_list { u64 ip; /* 0 8 */ struct map_symbol ms; /* 8 24 */ struct { _Bool unfolded; /* 32 1 */ _Bool has_children; /* 33 1 */ }; /* 32 2 */ /* XXX 6 bytes hole, try to pack */ u64 branch_count; /* 40 8 */ u64 from_count; /* 48 8 */ u64 predicted_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 abort_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat * brtype_stat; /* 96 8 */ const char * srcline; /* 104 8 */ struct list_head list; /* 112 16 */ /* size: 128, cachelines: 2, members: 13 */ /* sum members: 122, holes: 1, sum holes: 6 */ }; ``` After: ``` struct callchain_list { struct list_head list; /* 0 16 */ u64 ip; /* 16 8 */ struct map_symbol ms; /* 24 24 */ const char * srcline; /* 48 8 */ u64 branch_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 from_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat * brtype_stat; /* 96 8 */ u64 predicted_count; /* 104 8 */ u64 abort_count; /* 112 8 */ struct { _Bool unfolded; /* 120 1 */ _Bool has_children; /* 121 1 */ }; /* 120 2 */ /* size: 128, cachelines: 2, members: 13 */ /* padding: 6 */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf callchain: Make brtype_stat in callchain_list optionalIan Rogers
struct callchain_list is 352bytes in size, 232 of which are brtype_stat. brtype_stat is only used for certain callchain_list items so make it optional, allocating when necessary. So that printing doesn't need to deal with an optional brtype_stat, pass an empty/zero version. Before: ``` struct callchain_list { u64 ip; /* 0 8 */ struct map_symbol ms; /* 8 24 */ struct { _Bool unfolded; /* 32 1 */ _Bool has_children; /* 33 1 */ }; /* 32 2 */ /* XXX 6 bytes hole, try to pack */ u64 branch_count; /* 40 8 */ u64 from_count; /* 48 8 */ u64 predicted_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 abort_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat brtype_stat; /* 96 232 */ /* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */ const char * srcline; /* 328 8 */ struct list_head list; /* 336 16 */ /* size: 352, cachelines: 6, members: 13 */ /* sum members: 346, holes: 1, sum holes: 6 */ /* last cacheline: 32 bytes */ }; ``` After: ``` struct callchain_list { u64 ip; /* 0 8 */ struct map_symbol ms; /* 8 24 */ struct { _Bool unfolded; /* 32 1 */ _Bool has_children; /* 33 1 */ }; /* 32 2 */ /* XXX 6 bytes hole, try to pack */ u64 branch_count; /* 40 8 */ u64 from_count; /* 48 8 */ u64 predicted_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 abort_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat * brtype_stat; /* 96 8 */ const char * srcline; /* 104 8 */ struct list_head list; /* 112 16 */ /* size: 128, cachelines: 2, members: 13 */ /* sum members: 122, holes: 1, sum holes: 6 */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf callchain: Make display use of branch_type_stat constIan Rogers
Display code doesn't modify the branch_type_stat so switch uses to const. This is done to aid refactoring struct callchain_list where current the branch_type_stat is embedded even if not used. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf offcpu: Add missed btf_freeIan Rogers
Caught by address/leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf threads: Remove unused dead thread listIan Rogers
Commit 40826c45eb0b ("perf thread: Remove notion of dead threads") removed dead threads but the list head wasn't removed. Remove it here. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25perf hist: Add missing puts to hist__account_cyclesIan Rogers
Caught using reference count checking on perf top with "--call-graph=lbr". After this no memory leaks were detected. Fixes: 57849998e2cd ("perf report: Add processing for cycle histograms") Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25libperf rc_check: Add RC_CHK_EQUALIan Rogers
Comparing pointers with reference count checking is tricky to avoid a SEGV. Add a convenience macro to simplify and use. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25libperf rc_check: Make implicit enabling work for GCCIan Rogers
Make the implicit REFCOUNT_CHECKING robust to when building with GCC. Fixes: 9be6ab181b7b ("libperf rc_check: Enable implicitly with sanitizers") Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>