summaryrefslogtreecommitdiff
path: root/tools/perf/arch/powerpc/util
AgeCommit message (Collapse)Author
2025-03-23perf tools: Fix is_compat_mode build break in ppc64Likhitha Korrapati
Commit 54f9aa1092457 ("tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf json events") introduced to select proper JSON events in case of compat mode using auxiliary vector. But this caused a compilation error in ppc64 Big Endian. arch/powerpc/util/header.c: In function 'is_compat_mode': arch/powerpc/util/header.c:20:21: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 20 | if (!strcmp((char *)platform, (char *)base_platform)) | ^ arch/powerpc/util/header.c:20:39: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 20 | if (!strcmp((char *)platform, (char *)base_platform)) | Commit saved the getauxval(AT_BASE_PLATFORM) and getauxval(AT_PLATFORM) return values in u64 which causes the compilation error. Patch fixes this issue by changing u64 to "unsigned long". Fixes: 54f9aa1092457 ("tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf json events") Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com> Reviewed-by: Athira Rajeev <atrajeev@linux.ibm.com> Link: https://lore.kernel.org/r/20250321100726.699956-1-likhitha@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-14perf tools: Fix compile error on sample->user_regsNamhyung Kim
It's recently changed to allocate dynamically but misses to update some arch-dependent codes to use perf_sample__user_regs(). Fixes: dc6d2bc2d893a878 ("perf sample: Make user_regs and intr_regs optional") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250214191641.756664-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-18perf tools arch powerpc: Add register mask for power11 PVR in extended regsAthira Rajeev
Perf tools side uses extended mask to display the platform supported register names (with -I? option) to the user and also send this mask to the kernel to capture the extended registers as part of each sample. This mask value is decided based on the processor version ( from PVR ). Add PVR value for power11 to enable capturing the extended regs as part of sample in power11. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241206135637.36166-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Pass a perf_cpu rather than a PMU to get_cpuid_strIan Rogers
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means in places a temporary PMU is created just to pass a CPU value. Just pass the CPU and fix up the callers. As there are no longer PMU users in header.h, shuffle forward declarations earlier to work around build failures. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Refactor get_cpuid to take a CPU for ARMIan Rogers
ARM BIG.little has no notion of a constant CPUID for both core types. To reflect this reality, change the get_cpuid function to also pass in a possibly unused logical cpu. If the dummy value (-1) is passed in then ARM can, as currently happens, select the first logical CPU's "CPUID". The changes to ARM getcpuid happen in a follow up change. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-09perf dwarf-regs: Move powerpc dwarf-regs out of archIan Rogers
Move arch/powerpc/util/dwarf-regs.c to util/dwarf-regs-powerpc.c and compile in unconditionally. get_arch_regstr is redundant when EM_NONE is treated as EM_HOST so remove and update dwarf-regs.c conditions. Make get_powerpc_regs unconditionally available whwn libdw is. Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241108234606.429459-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09perf dwarf-regs: Remove PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSETIan Rogers
PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET was used for BPF prologue support which was removed in Commit 3d6dfae88917 ("perf parse-events: Remove BPF event support"). The code is no longer used so remove. Remove the offset from various dwarf-regs.c tables and the dependence on ptrace.h. Rename structs starting pt_ as the ptrace derived offset is now removed. Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241108234606.429459-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18perf build: Rename CONFIG_DWARF to CONFIG_LIBDWIan Rogers
In Makefile.config for unwinding the name dwarf implies either libunwind or libdw. Make it clearer that CONFIG_DWARF is really just defined when libdw is present by renaming to CONFIG_LIBDW. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241017001354.56973-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf ↵Athira Rajeev
json events perf list picks the events supported for specific platform from pmu-events/arch/powerpc/<platform>. Example power10 events are in pmu-events/arch/powerpc/power10, power9 events are part of pmu-events/arch/powerpc/power9. The decision of which platform to pick is determined based on PVR value in powerpc. The PVR value is matched from pmu-events/arch/powerpc/mapfile.csv Example: Format: PVR,Version,JSON/file/pathname,Type 0x004[bcd][[:xdigit:]]{4},1,power8,core 0x0066[[:xdigit:]]{4},1,power8,core 0x004e[[:xdigit:]]{4},1,power9,core 0x0080[[:xdigit:]]{4},1,power10,core 0x0082[[:xdigit:]]{4},1,power10,core The code gets the PVR from system using get_cpuid_str function in arch/powerpc/util/headers.c ( from SPRN_PVR ) and compares with value from mapfile.csv In case of compat mode, say when partition is booted in a power9 mode when the system is a power10, this picks incorrectly. Because PVR will point to power10 where as it should pick events from power9 folder. To support generic events, add new folder pmu-events/arch/powerpc/compat to contain the ISA architected events which is supported in compat mode. Also return 0x00ffffff as pvr when booted in compat mode. Based on this pvr value, json will pick events from pmu-events/arch/powerpc/compat Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel<disgoel@linux.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241010145107.51211-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-31perf annotate: Update parameters for reg extract functions to use raw ↵Athira Rajeev
instruction on powerpc Use the raw instruction code and macros to identify memory instructions, extract register fields and also offset. The implementation addresses the D-form, X-form, DS-form instructions. Adds "mem_ref" field to check whether source/target has memory reference. Add function "get_powerpc_regs" which will set these fields: reg1, reg2, offset depending of where it is source or target ops. Update "parse" callback for "struct ins_ops" to also pass "struct disasm_line" as argument. This is needed in parse functions where opcode is used to determine whether to set multi_regs and other fields Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-7-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add disasm_line__parse() to parse raw instruction for powerpcAthira Rajeev
Currently, the perf tool infrastructure uses the disasm_line__parse function to parse disassembled line. Example snippet from objdump: objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux> c0000000010224b4: lwz r10,0(r9) This line "lwz r10,0(r9)" is parsed to extract instruction name, registers names and offset. In powerpc, the approach for data type profiling uses raw instruction instead of result from objdump to identify the instruction category and extract the source/target registers. Example: 38 01 81 e8 ld r4,312(r1) Here "38 01 81 e8" is the raw instruction representation. Add function "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also update "struct disasm_line" to save the binary code/ With the change, function captures: line -> "38 01 81 e8 ld r4,312(r1)" raw instruction "38 01 81 e8" Raw instruction is used later to extract the reg/offset fields. Macros are added to extract opcode and register fields. "struct disasm_line" is updated to carry union of "bytes" and "raw_insn" of 32 bit to carry raw code (raw). Function "disasm_line__parse_powerpc fills the raw instruction hex value and can use macros to get opcode. There is no changes in existing code paths, which parses the disassembled code. The size of raw instruction depends on architecture. In case of powerpc, the parsing the disasm line needs to handle cases for reading binary code directly from DSO as well as parsing the objdump result. Hence adding the logic into separate function instead of updating "disasm_line__parse". The architecture using the instruction name and present approach is not altered. Since this approach targets powerpc, the macro implementation is added for powerpc as of now. Since the disasm_line__parse is used in other cases (perf annotate) and not only data tye profiling, the powerpc callback includes changes to work with binary code as well as mnemonic representation. Also in case if the DSO read fails and libcapstone is not supported, the approach fallback to use objdump as option. Hence as option, patch has changes to ensure objdump option also works well. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-5-atrajeev@linux.vnet.ibm.com [ Add check for strndup() result ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-12perf dso: Fix address sanitizer buildIan Rogers
Various files had been missed from having accessor functions added for the sake of dso reference count checking. Add the function calls and missing dso accessor functions. Fixes: ee756ef7491e ("perf dso: Add reference count checking and accessor functions") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yunseong Kim <yskelg@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-06-26perf util: Make util its own libraryIan Rogers
Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Nick Terrell <terrelln@fb.com> Cc: Gary Guo <gary@garyguo.net> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrei Vagin <avagin@google.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Guo Ren <guoren@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: John Garry <john.g.garry@oracle.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
2024-02-15perf parse-regs: Introduce a weak function arch__sample_reg_masks()Leo Yan
Every architecture can provide a register list for sampling. If an architecture doesn't support register sampling, it won't define the data structure 'sample_reg_masks'. Consequently, any code using this structure must be protected by the macro 'HAVE_PERF_REGS_SUPPORT'. This patch defines a weak function, arch__sample_reg_masks(), which will be replaced by an architecture-defined function for returning the architecture's register list. With this refactoring, the function always exists, the condition checking for 'HAVE_PERF_REGS_SUPPORT' is not needed anymore, so remove it. Signed-off-by: Leo Yan <leo.yan@linux.dev> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214113947.240957-4-leo.yan@linux.dev
2024-02-07perf kvm powerpc: Fix buildIan Rogers
Updates to struct parse_events_error needed to be carried through to PowerPC specific event parsing. Fixes: fd7b8e8fb20f ("perf parse-events: Print all errors") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung <namhyung@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240206235902.2917395-1-irogers@google.com
2024-01-24perf mem: Clean up perf_mem_event__supported()Kan Liang
For some ARCHs, e.g., ARM and AMD, to get the availability of the mem-events, perf checks the existence of a specific PMU. For the other ARCHs, e.g., Intel and Power, perf has to check the existence of some specific events. The current perf only iterates the mem-events-supported PMUs. It's not required to check the existence of a specific PMU anymore. Rename sysfs_name to event_name, which stores the specific mem-events. Perf only needs to check those events for the availability of the mem-events. Rename perf_mem_event__supported to perf_pmu__mem_events_supported. Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: james.clark@arm.com Cc: will@kernel.org Cc: mike.leach@linaro.org Cc: renyu.zj@linux.alibaba.com Cc: yuhaixin.yhx@linux.alibaba.com Cc: tmricht@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Cc: linux-arm-kernel@lists.infradead.org Cc: john.g.garry@oracle.com Link: https://lore.kernel.org/r/20240123185036.3461837-5-kan.liang@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-24perf mem: Clean up perf_mem_events__name()Kan Liang
Introduce a generic perf_mem_events__name(). Remove the ARCH-specific one. The mem_load events may have a different format. Add ldlat and aux_event in the struct perf_mem_event to indicate the format and the extra aux event. Add perf_mem_events_intel_aux[] to support the extra mem_load_aux event. Rename perf_mem_events__name to perf_pmu__mem_events_name. Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: james.clark@arm.com Cc: will@kernel.org Cc: mike.leach@linaro.org Cc: renyu.zj@linux.alibaba.com Cc: yuhaixin.yhx@linux.alibaba.com Cc: tmricht@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Cc: linux-arm-kernel@lists.infradead.org Cc: john.g.garry@oracle.com Link: https://lore.kernel.org/r/20240123185036.3461837-4-kan.liang@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-17tools/perf/arch/powerpc: Fix the CPU ID const char* value by adding 0x prefixAthira Rajeev
Simple expression parser test fails in powerpc as below: 4: Simple expression parser test child forked, pid 170385 Using CPUID 004e2102 division by zero syntax error syntax error FAILED tests/expr.c:65 parse test failed test child finished with -1 Simple expression parser: FAILED! This is observed after commit: 'commit 9d5da30e4ae9 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()")' With this commit, a new expression builtin strcmp_cpuid_str got added. This function takes an 'ID' type value, which is a string. So expression parse for strcmp_cpuid_str expects const char * as cpuid value type. In case of powerpc, CPU IDs are numbers. Hence it doesn't get interpreted correctly by bison parser. Example in case of power9, cpuid string returns as: 004e2102 cpuid of string type is expected in two cases: 1. char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused); Testcase "tests/expr.c" uses "perf_pmu__getcpuid" which calls get_cpuid_str to get the cpuid string. 2. cpuid field in :struct pmu_events_map struct pmu_events_map { const char *arch; const char *cpuid; Here cpuid field is used in "perf_pmu__find_events_table" function as "strcmp_cpuid_str(map->cpuid, cpuid)". The value for cpuid field is picked from mapfile.csv. Fix the mapfile.csv and get_cpuid_str function to prefix cpuid with 0x so that it gets correctly interpreted by the bison parser Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel<disgoel@linux.ibm.com> Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20231009050052.64935-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-08-25perf pmu: Remove logic for PMU name being NULLIan Rogers
The PMU name could be NULL in the case of the fake_pmu. Initialize the name for the fake_pmu to "fake" so that all other logic can assume it is initialized. Add a const to the type of name so that a literal can be used to avoid additional initialization code. Propagate the cost through related routines and remove now unnecessary "(char *)" casts. Doing this located a bug in builtin-list for the pmu_glob that was missing a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Li <liwei391@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-16perf parse-regs: Move out arch specific header from util/perf_regs.hLeo Yan
util/perf_regs.h includes another perf_regs.h: #include <perf_regs.h> Here it includes architecture specific header, for example, if we build arm64 target, the header tools/perf/arch/arm64/include/perf_regs.h is included. We use this implicit way to include architecture specific header, which is not directive; furthermore, util/perf_regs.c is coupled with the architecture specific definitions. This patch moves out arch specific header from util/perf_regs.h for generalizing the 'util' folder, as a result, the source files in 'arch' folder explicitly include architecture's perf_regs.h. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eric Lin <eric.lin@sifive.com> Cc: Fangrui Song <maskray@google.com> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20230606014559.21783-7-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-16perf parse-regs: Remove PERF_REGS_{MAX|MASK} from common codeLeo Yan
The macros PERF_REGS_MAX and PERF_REGS_MASK are architecture specific, let's remove them from the common file util/perf_regs.c. As a side effect, the weak functions arch__intr_reg_mask() and arch__user_reg_mask() just return zeros, every arch defines its own functions in the 'arch' folder for returning right values. Note, we don't need to return intr/user register masks dynamically, this is because these two functions are invoked during recording phase but not decoding phase, they are always invoked on the native environment, thus we don't need to parse them dynamically. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eric Lin <eric.lin@sifive.com> Cc: Fangrui Song <maskray@google.com> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20230606014559.21783-6-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-07-24perf callchain powerpc: Fix addr location init during ↵Athira Rajeev
arch_skip_callchain_idx function 'perf record; with callchain recording fails as below in powerpc: ./perf record -a -gR sleep 10 ./perf report perf: Segmentation fault gdb trace points to thread__find_map 0 0x00000000101df314 in atomic_cmpxchg (newval=1818846826, oldval=1818846827, v=0x1001a8f3) at /home/athira/linux/tools/include/asm-generic/atomic-gcc.h:70 1 refcount_sub_and_test (i=1, r=0x1001a8f3) at /home/athira/linux/tools/include/linux/refcount.h:135 2 refcount_dec_and_test (r=0x1001a8f3) at /home/athira/linux/tools/include/linux/refcount.h:148 3 map__put (map=0x1001a8b3) at util/map.c:311 4 0x000000001016842c in __map__zput (map=0x7fffffffa368) at util/map.h:190 5 thread__find_map (thread=0x105b92f0, cpumode=<optimized out>, addr=13835058055283572736, al=al@entry=0x7fffffffa358) at util/event.c:582 6 0x000000001016882c in thread__find_symbol (thread=<optimized out>, cpumode=<optimized out>, addr=<optimized out>, al=0x7fffffffa358) at util/event.c:656 7 0x00000000102e12b4 in arch_skip_callchain_idx (thread=<optimized out>, chain=<optimized out>) at arch/powerpc/util/skip-callchain-idx.c:255 8 0x00000000101d3bf4 in thread__resolve_callchain_sample (thread=0x105b92f0, cursor=0x1053d160, evsel=<optimized out>, sample=0x7fffffffa908, parent=0x7fffffffa778, root_al=0x7fffffffa710, max_stack=<optimized out>) at util/machine.c:2940 9 0x00000000101cd210 in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=<optimized out>, evsel=<optimized out>, al=<optimized out>, max_stack=<optimized out>) at util/callchain.c:1112 10 0x000000001022a9d8 in hist_entry_iter__add (iter=0x7fffffffa750, al=0x7fffffffa710, max_stack_depth=<optimized out>, arg=0x7fffffffbbd0) at util/hist.c:1232 11 0x0000000010056d98 in process_sample_event (tool=0x7fffffffbbd0, event=0x7ffff6223c38, sample=0x7fffffffa908, evsel=<optimized out>, machine=0x10524ef8) at builtin-report.c:332 Here arch_skip_callchain_idx calls thread__find_symbol and which invokes thread__find_map with uninitialised "addr_location". Snippet: thread__find_symbol(thread, PERF_RECORD_MISC_USER, ip, &al); Recent change with commit 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions") , introduced "maps__zput" in the function thread__find_map. This could result in segfault while accessing uninitialised map from "struct addr_location". Fix this by adding addr_location__init and addr_location__exit in arch_skip_callchain_idx. Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions") Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230724165815.17810-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-30perf kvm powerpc: Add missing rename opf pmu_have_event() to ↵Ian Rogers
perf_pmus__have_event() Missed function rename from pmu_have_event to perf_pmus__have_event made the perf build fail on powerpc. Committer notes: The perf_pmus__have_event() is declared in util/pmus.h, so use it instead of by now needless util/pmu.h. Fixes: 1eaf496ed386934f ("perf pmu: Separate pmu and pmus") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230530021433.3107580-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24perf evsel: Introduce evsel__name_is() method to check if the evsel name is ↵Arnaldo Carvalho de Melo
equal to a given string This makes the logic a bit clear by avoiding the !strcmp() pattern and also a way to intercept the pointer if we need to do extra validation on it or to do lazy setting of evsel->name via evsel__name(evsel). Reviewed-by: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZEGLM8VehJbS0gP2@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-06perf map: Add helper for ->map_ip() and ->unmap_ip()Ian Rogers
Later changes will add reference count checking for struct map, add a helper function to invoke the map_ip and unmap_ip function pointers. The helper allows the reference count check to be in fewer places. Committer notes: Add missing conversions to: tools/perf/util/map.c tools/perf/util/cs-etm.c tools/perf/util/annotate.c tools/perf/arch/powerpc/util/sym-handling.c tools/perf/arch/s390/annotate/instructions.c Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf map: Add accessor for start and endIan Rogers
Later changes will add reference count checking for struct map, start and end are frequently accessed variables. Add an accessor so that the reference count check is only necessary in one place. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf map: Add accessor for dsoIan Rogers
Later changes will add reference count checking for struct map, with dso being the most frequently accessed variable. Add an accessor so that the reference count check is only necessary in one place. Additional changes: - add a dso variable to avoid repeated map__dso calls. - in builtin-mem.c dump_raw_samples, code only partially tested for dso == NULL. Make the possibility of NULL consistent. - in thread.c thread__memcpy fix use of spaces and use tabs. Committer notes: Did missing conversions on these files: tools/perf/arch/powerpc/util/skip-callchain-idx.c tools/perf/arch/powerpc/util/sym-handling.c tools/perf/ui/browsers/hists.c tools/perf/ui/gtk/annotate.c tools/perf/util/cs-etm.c tools/perf/util/thread.c tools/perf/util/unwind-libunwind-local.c tools/perf/util/unwind-libunwind.c Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15perf kvm: Use macro to replace variable 'decode_str_len'Leo Yan
The variable 'decode_str_len' defines the string length for KVM event name and every arch defines its own values. This introduces complexity that the variable definition are spreading in multiple source files under arch folder. This patch refactors code to use a macro KVM_EVENT_NAME_LEN to define event name length and thus remove the definitions in arch files. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19perf pmu-events: Change aggr_mode to be an enumIan Rogers
Rather than use a string to encode aggr_mode, use an enum value. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Link: https://lore.kernel.org/r/20230219092848.639226-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02perf pmu-events: Add separate metric from pmu_eventIan Rogers
Create a new pmu_metric for the metric related variables from pmu_event but that is initially just a clone of pmu_event. Add iterators for pmu_metric and use in places that metrics are desired rather than events. Make the event iterator skip metric only events, and the metric iterator skip event only events. Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf build: Use libtraceevent from the systemIan Rogers
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command line variables. If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the build, don't compile in libtraceevent and libtracefs support. This also disables CONFIG_TRACE that controls "perf trace". CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles, HAVE_LIBTRACEEVENT is used in C code. Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the commands kmem, kwork, lock, sched and timechart are removed. The majority of commands continue to work including "perf test". Committer notes: Fixed up a tools/perf/util/Build reject and added: #include <traceevent/event-parse.h> to tools/perf/util/scripting-engines/trace-event-perl.c. Committer testing: $ rpm -qi libtraceevent-devel Name : libtraceevent-devel Version : 1.5.3 Release : 2.fc36 Architecture: x86_64 Install Date: Mon 25 Jul 2022 03:20:19 PM -03 Group : Unspecified Size : 27728 License : LGPLv2+ and GPLv2+ Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4 Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm Build Date : Fri 15 Apr 2022 10:57:01 AM -03 Build Host : buildvm-x86-05.iad2.fedoraproject.org Packager : Fedora Project Vendor : Fedora Project URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ Bug URL : https://bugz.fedoraproject.org/libtraceevent Summary : Development headers of libtraceevent Description : Development headers of libtraceevent-libs $ Default build: $ ldd ~/bin/perf | grep tracee libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000) $ # perf trace -e sched:* --max-events 10 0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1) 0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1) 0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120) 1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120) 1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120) 0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2) 0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2) 0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120) 1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1) 1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120) # Had to tweak tools/perf/util/setup.py to make sure the python binding shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is present in CFLAGS. Building with NO_LIBTRACEEVENT=1 uncovered some more build failures: - Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y - perf-$(CONFIG_LIBTRACEEVENT) += scripts/ - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y - The python binding needed some fixups and util/trace-event.c can't be built and linked with the python binding shared object, so remove it in tools/perf/util/setup.py and exclude it from the list of dependencies in the python/perf.so Makefile.perf target. Building without libtraceevent-devel installed uncovered more build failures: - The python binding tools/perf/util/python.c was assuming that traceevent/parse-events.h was always available, which was the case when we defaulted to using the in-kernel tools/lib/traceevent/ files, now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like the other parts of it that deal with tracepoints. - We have to ifdef the rules in the Build files with CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't detect libtraceevent-devel installed in the system. Simplification here to avoid these two ways of disabling builtin-trace.c and not having CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean way. From Athira: <quote> tools/perf/arch/powerpc/util/Build -perf-y += kvm-stat.o +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o </quote> Then, ditto for arm64 and s390, detected by container cross build tests. - s/390 uses test__checkevent_tracepoint() that is now only available if HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT. Also from Athira: <quote> With this change, I could successfully compile in these environment: - Without libtraceevent-devel installed - With libtraceevent-devel installed - With “make NO_LIBTRACEEVENT=1” </quote> Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for consistency with other libraries detected in tools/perf/. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-31perf tools: Move 'struct perf_sample' to a separate header file to ↵Arnaldo Carvalho de Melo
disentangle headers Some places were including event.h just to get 'struct perf_sample', move it to a separate place so that we speed up a bit the build. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Remove arch__symbols__fixup_end()Namhyung Kim
Now the generic code can handle kallsyms fixup properly so no need to keep the arch-functions anymore. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10perf powerpc: Update global/local variants for p_stage_cycAthira Rajeev
Update the arch_support_sort_key() function in powerpc to enable presenting local and global variants of sort key 'p_stage_cyc'. Update the "se_header" strings for these in arch_perf_header_entry() along with instruction latency. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20211203022038.48240-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07perf parse-event: Add init and exit to parse_event_errorIan Rogers
parse_events() may succeed but leave string memory allocations reachable in the error. Add an init/exit that must be called to initialize and clean up the error. This fixes a leak in metricgroup parse_ids. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20211107090002.3784612-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07perf parse-events: Rename parse_events_error functionsIan Rogers
Group error functions and name after the data type they manipulate. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20211107090002.3784612-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31perf callchain: Fix compilation on powerpc with gcc11+Jiri Olsa
Got following build fail on powerpc: CC arch/powerpc/util/skip-callchain-idx.o In function ‘check_return_reg’, inlined from ‘check_return_addr’ at arch/powerpc/util/skip-callchain-idx.c:213:7, inlined from ‘arch_skip_callchain_idx’ at arch/powerpc/util/skip-callchain-idx.c:265:7: arch/powerpc/util/skip-callchain-idx.c:54:18: error: ‘dwarf_frame_register’ accessing 96 bytes \ in a region of size 64 [-Werror=stringop-overflow=] 54 | result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/util/skip-callchain-idx.c: In function ‘arch_skip_callchain_idx’: arch/powerpc/util/skip-callchain-idx.c:54:18: note: referencing argument 3 of type ‘Dwarf_Op *’ In file included from /usr/include/elfutils/libdwfl.h:32, from arch/powerpc/util/skip-callchain-idx.c:10: /usr/include/elfutils/libdw.h:1069:12: note: in a call to function ‘dwarf_frame_register’ 1069 | extern int dwarf_frame_register (Dwarf_Frame *frame, int regno, | ^~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors The dwarf_frame_register args changed with [1], Updating ops_mem accordingly. [1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=5621fe5443da23112170235dd5cac161e5c75e65 Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Mark Wieelard <mjw@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20210928195253.1267023-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-25perf powerpc: Add support to expose instruction and data address registers ↵Athira Rajeev
as part of extended regs This patch enables presenting Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers for the perf tool. Add these SPR's to sample_reg_mask in the tool side (to use with -I? option). Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20211018114948.16830-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-20perf pmu: Make pmu_event tables const.Ian Rogers
Make lookup nature of data structures clearer through their type. Reduce scope of architecture specific pmu_event tables by making them static. Suggested-by: John Garry <john.garry@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Changbin Du <changbin.du@intel.com> Cc: Denys Zagorui <dzagorui@cisco.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Felix Fietkau <nbd@nbd.name> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nicholas Fraser <nfraser@codeweavers.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: ShihCheng Tu <mrtoastcheng@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20211015172132.1162559-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01perf tools: Support pmu prefix for mem-load eventJin Yao
The perf_mem_events__name() can generate the mem-load event name. It uses a variable 'mem_loads_name__init' to avoid generating the event name every time (because perf_pmu__scan takes some time). The perf_mem_events__name() assumes the pmu is "cpu" but it's not correct for hybrid platform. For Alderlake, the pmu is "cpu_core" or "cpu_atom" Introduce a new parameter 'pmu_name' in perf_mem_events__name to let the caller specify a pmu name. Considering such event name is x86 specific, so move perf_mem_events[] to arch/x86/util/mem-events.c. We still keep the variable 'mem_loads_name__init' but it's only used when pmu_name is NULL (compatible for original behavior). When pmu_name is not NULL (e.g. "cpu_core"), this patch doesn't have optimization. That can be implemented in follow up patch. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210527001610.10553-3-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-26perf sort: Display sort dimension p_stage_cyc only on supported archsAthira Rajeev
The sort dimension "p_stage_cyc" is used to represent pipeline stage cycle information. Presently, this is used only in powerpc. For unsupported platforms, we don't want to display it in the perf report output columns. Hence add check in sort_dimension__add() and skip the sort key incase it is not applicable for the particular arch. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: https://lore.kernel.org/r/1616425047-1666-6-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-26perf tools: Support pipeline stage cycles for powerpcAthira Rajeev
The pipeline stage cycles details can be recorded on powerpc from the contents of Performance Monitor Unit (PMU) registers. On ISA v3.1 platform, sampling registers exposes the cycles spent in different pipeline stages. Patch adds perf tools support to present two of the cycle counter information along with memory latency (weight). Re-use the field 'ins_lat' for storing the first pipeline stage cycle. This is stored in 'var2_w' field of 'perf_sample_weight'. Add a new field 'p_stage_cyc' to store the second pipeline stage cycle which is stored in 'var3_w' field of perf_sample_weight. Add new sort function 'Pipeline Stage Cycle' and include this in default_mem_sort_order[]. This new sort function may be used to denote some other pipeline stage in another architecture. So add this to list of sort entries that can have dynamic header string. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: https://lore.kernel.org/r/1616425047-1666-5-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-26perf powerpc: Add support for PERF_SAMPLE_WEIGHT_STRUCTAthira Rajeev
Add arch specific arch_evsel__set_sample_weight() to set the new sample type for powerpc. Add arch specific arch_perf_parse_sample_weight() to store the sample->weight values depending on the sample type applied. if the new sample type (PERF_SAMPLE_WEIGHT_STRUCT) is applied, store only the lower 32 bits to sample->weight. If sample type is 'PERF_SAMPLE_WEIGHT', store the full 64-bit to sample->weight. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: https://lore.kernel.org/r/1616425047-1666-4-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-23perf tools: Fix various typos in commentsIngo Molnar
Fix ~124 single-word typos and a few spelling errors in the perf tooling code, accumulated over the years. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210321113734.GA248990@gmail.com Link: http://lore.kernel.org/lkml/20210323160915.GA61903@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf powerpc: Fix printf conversion specifier for IP addressesArnaldo Carvalho de Melo
We need to use "%#" PRIx64 for u64 values, not "%lx", fixing this build problem on powerpc 32-bit: 72 13.69 ubuntu:18.04-x-powerpc : FAIL powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 arch/powerpc/util/machine.c: In function 'arch__symbols__fixup_end': arch/powerpc/util/machine.c:23:12: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'u64 {aka long long unsigned int}' [-Werror=format=] pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end); ^ /git/linux/tools/perf/util/debug.h:18:21: note: in definition of macro 'pr_fmt' #define pr_fmt(fmt) fmt ^~~ /git/linux/tools/perf/util/debug.h:33:29: note: in expansion of macro 'pr_debugN' #define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~~~~ /git/linux/tools/perf/util/debug.h:33:42: note: in expansion of macro 'pr_fmt' #define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~ arch/powerpc/util/machine.c:23:2: note: in expansion of macro 'pr_debug4' pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end); ^~~~~~~~~ cc1: all warnings being treated as errors /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed make[5]: *** [util] Error 2 /git/linux/tools/build/Makefile.build:139: recipe for target 'powerpc' failed make[4]: *** [powerpc] Error 2 /git/linux/tools/build/Makefile.build:139: recipe for target 'arch' failed make[3]: *** [arch] Error 2 73 30.47 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Fixes: 557c3eadb7712741 ("perf powerpc: Fix gap between kernel end and module start") Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf powerpc: Support exposing Performance Monitor Counter SPRs as part of ↵Athira Rajeev
extended regs To enable presenting of Performance Monitor Counter Registers (PMC1 to PMC6) as part of extended regsiters, this patch adds these to sample_reg_mask in the tool side (to use with -I? option). Simplified the PERF_REG_PMU_MASK_300/31 definition. Excluded the unsupported SPRs (MMCR3, SIER2, SIER3) from extended mask value for CPU_FTR_ARCH_300. Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf powerpc: Fix gap between kernel end and module startAthira Rajeev
Running "perf mem report" in TUI mode fails with ENOMEM message in powerpc: failed to process sample Running with debug and verbose options points that issue is while allocating memory for sample histograms. The error path is: symbol__inc_addr_samples() -> __symbol__inc_addr_samples() -> annotated_source__histogram() symbol__inc_addr_samples() calls annotated_source__alloc_histograms () to allocate memory for sample histograms using calloc(). Here calloc() fails since the size of symbol is huge. The size of a symbol is calculated as difference between its start and end address. Example histogram allocation that fails is: sym->name is _end sym->start is 0xc0000000027a0000 sym->end is 0xc008000003890000 symbol__size(sym) is 0x80000010f0000 In the above case, the difference between sym->start (0xc0000000027a0000) and sym->end (0xc008000003890000) is huge. This is same problem as in s390 and arm64 which are fixed in commits: b9c0a64901d5 ("perf annotate: Fix s390 gap between kernel end and module start") 78886f3ed37e ("perf symbols: Fix arm64 gap between kernel start and module end") When this symbol was read first, its start and end address was set to address which matches with data from /proc/kallsyms. After symbol__new(): symbol__new: _end 0xc0000000027a0000-0xc0000000027a0000 From /proc/kallsyms: ... c000000002799370 b backtrace_flag c000000002799378 B radix_tree_node_cachep c000000002799380 B __bss_stop c0000000027a0000 B _end c008000003890000 t icmp_checkentry [ip_tables] c008000003890038 t ipt_alloc_initial_table [ip_tables] c008000003890468 T ipt_do_table [ip_tables] c008000003890de8 T ipt_unregister_table_pre_exit [ip_tables] ... Perf calls function symbols__fixup_end() which sets the end of symbol to 0xc008000003890000, which is the next address and this is the start address of first module (icmp_checkentry in above) which will make the huge symbol size of 0x80000010f0000. After symbols__fixup_end: symbols__fixup_end: sym->name: _end sym->start: 0xc0000000027a0000 sym->end: 0xc008000003890000 On powerpc, kernel text segment is located at 0xc000000000000000 whereas the modules are located at very high memory addresses, 0xc00800000xxxxxxx. Since the gap between end of kernel text segment and beginning of first module's address is high, histogram allocation using calloc fails. Fix this by detecting the kernel's last symbol and limiting the range of last kernel symbol to pagesize. Signed-off-by: Athira Rajeev<atrajeev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-By: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1609208054-1566-1-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10perf metricgroup: Pass pmu_event structure as a parameter for ↵Kajol Jain
arch_get_runtimeparam() This patch adds passing of pmu_event as a parameter in function 'arch_get_runtimeparam' which can be used to get details like if the event is percore/perchip. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/20200907064133.75090-5-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-10Merge tag 'perf-tools-2020-08-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools updates from Arnaldo Carvalho de Melo: "New features: - Introduce controlling how 'perf stat' and 'perf record' works via a control file descriptor, allowing starting with events configured but disabled until commands are received via the control file descriptor. This allows, for instance for tools such as Intel VTune to make further use of perf as its Linux platform driver. - Improve 'perf record' to to register in a perf.data file header the clockid used to help later correlate things like syslog files and perf events recorded. - Add basic syscall and find_next_bit benchmarks to 'perf bench'. - Allow using computed metrics in calculating other metrics. For instance: { .metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit", .metric_name = "DCache_L2_All_Hits", }, { .metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss", .metric_name = "DCache_L2_All_Miss", }, { .metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss", .metric_name = "DCache_L2_All", } - Add suport for 'd_ratio', '>' and '<' operators to the expression resolver used in calculating metrics in 'perf stat'. Support for new kernel features: - Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope with things like ftrace, trampolines, i.e. changes in the kernel text that gets in the way of properly decoding Intel PT hardware traces, for instance. Intel PT: - Add various knobs to reduce the volume of Intel PT traces by reducing the level of details such as decoding just some types of packets (e.g., FUP/TIP, PSB+), also filtering by time range. - Add new itrace options (log flags to the 'd' option, error flags to the 'e' one, etc), controlling how Intel PT is transformed into perf events, document some missing options (e.g., how to synthesize callchains). BPF: - Properly report BPF errors when parsing events. - Do not setup side-band events if LIBBPF is not linked, fixing a segfault. Libraries: - Improvements to the libtraceevent plugin mechanism. - Improve libtracevent support for KVM trace events SVM exit reasons. - Add a libtracevent plugins for decoding syscalls/sys_enter_futex and for tlb_flush. - Ensure sample_period is set libpfm4 events in 'perf test'. - Fixup libperf namespacing, to make sure what is in libperf has the perf_ namespace while what is now only in tools/perf/ doesn't use that prefix. Arch specific: - Improve the testing of vendor events and metrics in 'perf test'. - Allow no ARM CoreSight hardware tracer sink to be specified on command line. - Fix arm_spe_x recording when mixed with other perf events. - Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of idle symbols. - List kernel supplied event aliases for arm64 in 'perf list'. - Add support for extended register capability in PowerPC 9 and 10. - Added nest IMC power9 metric events. Miscellaneous: - No need to setup sample_regs_intr/sample_regs_user for dummy events. - Update various copies of kernel headers, some causing perf to handle new syscalls, MSRs, etc. - Improve usage of flex and yacc, enabling warnings and addressing the fallout. - Add missing '--output' option to 'perf kmem' so that it can pass it along to 'perf record'. - 'perf probe' fixes related to adding multiple probes on the same address for the same event. - Make 'perf probe' warn if the target function is a GNU indirect function. - Remove //anon mmap events from 'perf inject jit' to fix supporting both using ELF files for generated functions and the perf-PID.map approaches" * tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (144 commits) perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set perf tools powerpc: Add support for extended regs in power10 perf tools powerpc: Add support for extended register capability tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools arch x86: Sync asm/cpufeatures.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers UAPI: update linux/in.h copy tools headers API: Update close_range affected files perf script: Add 'tod' field to display time of day perf script: Change the 'enum perf_output_field' enumerators to be 64 bits perf data: Add support to store time of day in CTF data conversion perf tools: Move clockid_res_ns under clock struct perf header: Store clock references for -k/--clockid option perf tools: Add clockid_name function perf clockid: Move parse_clockid() to new clockid object tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API libtraceevent: Fixed description of tep_add_plugin_path() API libtraceevent: Fixed type in PRINT_FMT_STING libtraceevent: Fixed broken indentation in parse_ip4_print_args() libtraceevent: Improve error handling of tep_plugin_add_option() API ...