summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2024-10-17perf stat: Display "none" for NaN with metric only jsonIan Rogers
Return earlier for an empty unit case. If snprintf of the fmt doesn't produce digits between vals and ends, as happens with NaN, make the value "none" as happens in print_metric_end. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241017175356.783793-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf stat: Fix/add parameter names for print_metricIan Rogers
The print_metric parameter names were rearranged, fix and add comments in the stat-shadow callers to ensure they are correct. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241017175356.783793-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf color: Add printf format checking and resolve issuesIan Rogers
Add printf format checking to vararg printf routines in color.h. Resolve build errors/bugs that are found through this checking. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241017175356.783793-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf probe: Fix libdw memory leakIan Rogers
Add missing dwarf_cfi_end to free memory associated with probe_finder cfi_eh which is allocated and owned via a call to dwarf_getcfi_elf. Confusingly cfi_dbg shouldn't be freed as its memory is owned by the passed in debuginfo struct. Add comments to highlight this. This addresses leak sanitizer issues seen in: tools/perf/tests/shell/test_uprobe_from_different_cu.sh Fixes: 270bde1e76f4 ("perf probe: Search both .eh_frame and .debug_frame sections for probe location") Signed-off-by: Ian Rogers <irogers@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20241016235622.52166-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf disasm: Fix capstone memory leakIan Rogers
The insn argument passed to cs_disasm needs freeing. To support accurately having count, add an additional free_count variable. Fixes: c5d60de1813a ("perf annotate: Add support to use libcapstone in powerpc") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Cc: David S. Miller <davem@davemloft.net> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20241016235622.52166-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf ↵Athira Rajeev
json events perf list picks the events supported for specific platform from pmu-events/arch/powerpc/<platform>. Example power10 events are in pmu-events/arch/powerpc/power10, power9 events are part of pmu-events/arch/powerpc/power9. The decision of which platform to pick is determined based on PVR value in powerpc. The PVR value is matched from pmu-events/arch/powerpc/mapfile.csv Example: Format: PVR,Version,JSON/file/pathname,Type 0x004[bcd][[:xdigit:]]{4},1,power8,core 0x0066[[:xdigit:]]{4},1,power8,core 0x004e[[:xdigit:]]{4},1,power9,core 0x0080[[:xdigit:]]{4},1,power10,core 0x0082[[:xdigit:]]{4},1,power10,core The code gets the PVR from system using get_cpuid_str function in arch/powerpc/util/headers.c ( from SPRN_PVR ) and compares with value from mapfile.csv In case of compat mode, say when partition is booted in a power9 mode when the system is a power10, this picks incorrectly. Because PVR will point to power10 where as it should pick events from power9 folder. To support generic events, add new folder pmu-events/arch/powerpc/compat to contain the ISA architected events which is supported in compat mode. Also return 0x00ffffff as pvr when booted in compat mode. Based on this pvr value, json will pick events from pmu-events/arch/powerpc/compat Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel<disgoel@linux.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241010145107.51211-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17tools/perf/pmu-events/powerpc: Add support for compat events in jsonAthira Rajeev
perf list picks the events supported for specific platform from pmu-events/arch/powerpc/<platform>. Example power10 events are in pmu-events/arch/powerpc/power10, power9 events are part of pmu-events/arch/powerpc/power9. The decision of which platform to pick is determined based on PVR value in powerpc. The PVR value is matched from pmu-events/arch/powerpc/mapfile.csv Example: Format: PVR,Version,JSON/file/pathname,Type 0x004[bcd][[:xdigit:]]{4},1,power8,core 0x0066[[:xdigit:]]{4},1,power8,core 0x004e[[:xdigit:]]{4},1,power9,core 0x0080[[:xdigit:]]{4},1,power10,core 0x0082[[:xdigit:]]{4},1,power10,core The code gets the PVR from system using get_cpuid_str function in arch/powerpc/util/headers.c ( from SPRN_PVR ) and compares with value from mapfile.csv In case of compat mode, say when partition is booted in a power9 mode when the system is a power10, add an entry to pick the ISA architected events from "pmu-events/arch/powerpc/compat". Add json file generic-events.json which will contain these events which is supported in compat mode. Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241010145107.51211-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf dso: Fix symtab_type for kmod compressionVeronika Molnarova
During the rework of the dso structure in patch ee756ef7491eafd an increment was forgotten for the symtab_type in case the data for the kernel module are compressed. This affects the probing of the kernel modules, which fails if the data are not already cached. Increment the value of the symtab_type to its compressed variant so the data could be recovered successfully. Fixes: ee756ef7491eafd7 ("perf dso: Add reference count checking and accessor functions") Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Acked-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Michael Petlan <mpetlan@redhat.com> Link: https://lore.kernel.org/r/20241010144836.16424-1-vmolnaro@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf probe: Improve log for long event name failureLeo Yan
If a symbol name is longer than the maximum event length (64 bytes), the perf tool reports error: # perf probe -x test_cpp_mangle --add "this_is_a_very_very_long_print_data_abcdefghijklmnopqrstuvwxyz(int)" snprintf() failed: -7; the event name nbase='this_is_a_very_very_long_print_data_abcdefghijklmnopqrstuvwxyz(int)' is too long Error: Failed to add events. An information is missed in the log that the symbol name and the event name can be set separately. Especially, this is recommended for adding probe for a long symbol. This commit refines the log for reminding event syntax. After: # perf probe -x test_cpp_mangle --add "this_is_a_very_very_long_print_data_abcdefghijklmnopqrstuvwxyz(int)" snprintf() failed: -7; the event name 'this_is_a_very_very_long_print_data_abcdefghijklmnopqrstuvwxyz(int)' is too long Hint: Set a shorter event with syntax "EVENT=PROBEDEF" EVENT: Event name (max length: 64 bytes). Error: Failed to add events. Signed-off-by: Leo Yan <leo.yan@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20241012204725.928794-4-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf probe: Check group string lengthLeo Yan
In the kernel, the probe group string length is limited up to MAX_EVENT_NAME_LEN (including the NULL terminator). Check for this limitation and report an error if it is exceeded. Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20241012204725.928794-3-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf probe: Use the MAX_EVENT_NAME_LEN macroLeo Yan
The MAX_EVENT_NAME_LEN macro has been defined in the kernel. Use the same definition in the tool for more readable. Signed-off-by: Leo Yan <leo.yan@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20241012204725.928794-2-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf test: Speed up some tests using perf listNamhyung Kim
On my system, perf list is very slow to print the whole events. I think there's a performance issue in SDT and uprobes event listing. I noticed this issue while running perf test on x86 but it takes long to check some CoreSight event which should be skipped quickly. Anyway, some test uses perf list to check whether the required event is available before running the test. The perf list command can take an argument to specify event class or (glob) pattern. But glob pattern is only to suppress output for unmatched ones after checking all events. In this case, specifying event class is better to reduce the number of events it checks and to avoid buggy subsystems entirely. No functional changes intended. Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20241016065654.269994-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-16perf x86/topdown: Refine helper arch_is_topdown_metrics()Dapeng Mi
Leverage the existed function perf_pmu__name_from_config() to check if an event is topdown metrics event. perf_pmu__name_from_config() goes through the defined formats and figures out the config of pre-defined topdown events. This avoids to figure out the config of topdown pre-defined events with hard-coded format strings "event=" and "umask=" and provides more flexibility. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241011110207.1032235-2-dapeng1.mi@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-16perf x86/topdown: Make topdown metrics comparators be symmetricDapeng Mi
The commit "3b5edc0421e2 (perf x86/topdown: Don't move topdown metric events in group)" modifies topdown metrics comparator to move topdown metrics events which are not in same group with previous event. But it just modifies the 2nd comparator and causes the comparators become asymmetric. Thus modify the 1st topdown metrics comparator and make the two comparators be symmetric, and refine the comments as well. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241011110207.1032235-1-dapeng1.mi@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-16perf tool_pmu: Remove duplicate io.h headerIan Rogers
Remove duplicate inclusion of api/io.h. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410131417.ynhvnEJb-lkp@intel.com/ Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241016160413.51587-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Add Cortex CPUs to common data source encoding listLeo Yan
Add Cortex-A720, Cortex-A725, Cortex-X1C, Cortex-X3 and Cortex-X925 into the common data source encoding list. For everyone of these CPUs, it technical reference manual defines the data source packet as the common encoding format. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-8-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Add Neoverse-V2 to common data source encoding listBesar Wicaksono
Add Neoverse-V2 MIDR to the common data source encoding range list. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-7-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Remove the unused 'midr' fieldLeo Yan
The 'midr' field is replaced by the MIDR values stored in metadata (per CPU wise). Remove the 'midr' field as it is no longer used. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-6-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Use metadata to decide the data source featureLeo Yan
Use the info in the metadata to decide if the data source feature is supported. The CPU MIDR must be in the CPU list for the common data source encoding. For the metadata version 1, it doesn't include info for MIDR. In this case, due to absent info for making decision, print out warning to remind users to upgrade tool and returns false. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-5-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Introduce arm_spe__is_homogeneous()Leo Yan
Introduce the arm_spe__is_homogeneous() function, it uses to check if Arm SPE is homogeneous cross all CPUs. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-4-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Rename the common data source encodingLeo Yan
The Neoverse CPUs follow the common data source encoding, and other CPU variants can share the same format. Rename the CPU list and data source definitions as common data source names. This change prepares for appending more CPU variants. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-3-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf arm-spe: Rename arm_spe__synth_data_source_generic()Leo Yan
The arm_spe__synth_data_source_generic() function is invoked when the tool detects that CPUs do not support data source packets and falls back to synthesizing only the memory level. Rename it to arm_spe__synth_memory_level() for better reflecting its purpose. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241003185322.192357-2-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf test: Delete unused Intel CQM testHoward Chu
As Ian Rogers <irogers@google.com> pointed out, intel-cqm.c is neither used nor built. It was deleted in the following commit: commit b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license") However, it resurfaced soon after in the following commit: commit 5c9295bfe6f5 ("perf tests: Remove Intel CQM perf test") It should be deleted once and for all. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Matt Fleming <mfleming@cloudflare.com> Link: https://lore.kernel.org/r/20241011055700.4142694-1-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf evsel: Fix missing inherit + sample read checkNamhyung Kim
It should not clear the inherit bit simply because the kernel doesn't support the sample read with it. IOW the inherit bit should be kept when the sample read is not requested for the event. Fixes: 90035d3cd876cb71 ("tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events") Acked-by: Ben Gainey <ben.gainey@arm.com> Link: https://lore.kernel.org/r/20241009062250.730192-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf sched timehist: Add pre-migration wait time optionMadadi Vineeth Reddy
pre-migration wait time is the time that a task unnecessarily spends on the runqueue of a CPU but doesn't get switched-in there. In terms of tracepoints, it is the time between sched:sched_wakeup and sched:sched_migrate_task. Let's say a task woke up on CPU2, then it got migrated to CPU4 and then it's switched-in to CPU4. So, here pre-migration wait time is time that it was waiting on runqueue of CPU2 after it is woken up. The general pattern for pre-migration to occur is: sched:sched_wakeup sched:sched_migrate_task sched:sched_switch The sched:sched_waking event is used to capture the wakeup time, as it aligns with the existing code and only introduces a negligible time difference. pre-migrations are generally not useful and it increases migrations. This metric would be helpful in testing patches mainly related to wakeup and load-balancer code paths as better wakeup logic would choose an optimal CPU where task would be switched-in and thereby reducing pre- migrations. The sample output(s) when -P or --pre-migrations is used: ================= time cpu task name wait time sch delay run time pre-mig time [tid/pid] (msec) (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- --------- 38456.720806 [0001] schbench[28634/28574] 4.917 4.768 1.004 0.000 38456.720810 [0001] rcu_preempt[18] 3.919 0.003 0.004 0.000 38456.721800 [0006] schbench[28779/28574] 23.465 23.465 1.999 0.000 38456.722800 [0002] schbench[28773/28574] 60.371 60.237 3.955 60.197 38456.722806 [0001] schbench[28634/28574] 0.004 0.004 1.996 0.000 38456.722811 [0001] rcu_preempt[18] 1.996 0.005 0.005 0.000 38456.723800 [0000] schbench[28833/28574] 4.000 4.000 3.999 0.000 38456.723800 [0004] schbench[28762/28574] 42.951 42.839 3.999 39.867 38456.723802 [0007] schbench[28812/28574] 43.947 43.817 3.999 40.866 38456.723804 [0001] schbench[28587/28574] 7.935 7.822 0.993 0.000 Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lore.kernel.org/r/20241004170756.18064-1-vineethr@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf tools: Remove unnecessary parenthesesNamhyung Kim
The hashmap API used to require parentheses for the hashmap argument if it's not a pointer type. It's now fixed so let's drop the parentheses. Link: https://lore.kernel.org/r/20241009202009.884884-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf tools: Fix possible compiler warnings in hashmapNamhyung Kim
The hashmap__for_each_entry[_safe] is accessing 'map' as if it's a pointer. But it does without parentheses so passing a static hash map with an ampersand (like &slab_hash below) caused compiler warnings due to unmatched types. In file included from util/bpf_lock_contention.c:5: util/bpf_lock_contention.c: In function ‘exit_slab_cache_iter’: linux/tools/perf/util/hashmap.h:169:32: error: invalid type argument of ‘->’ (have ‘struct hashmap’) 169 | for (bkt = 0; bkt < map->cap; bkt++) \ | ^~ util/bpf_lock_contention.c:105:9: note: in expansion of macro ‘hashmap__for_each_entry’ 105 | hashmap__for_each_entry(&slab_hash, cur, bkt) | ^~~~~~~~~~~~~~~~~~~~~~~ /home/namhyung/project/linux/tools/perf/util/hashmap.h:170:31: error: invalid type argument of ‘->’ (have ‘struct hashmap’) 170 | for (cur = map->buckets[bkt]; cur; cur = cur->next) | ^~ util/bpf_lock_contention.c:105:9: note: in expansion of macro ‘hashmap__for_each_entry’ 105 | hashmap__for_each_entry(&slab_hash, cur, bkt) | ^~~~~~~~~~~~~~~~~~~~~~~ Cc: bpf@vger.kernel.org Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241009202009.884884-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14Merge tag 'v6.12-rc3' into perf-tools-nextNamhyung Kim
To get the fixes in the current perf-tools tree. Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14perf tools: Fix compiler error in util/tool_pmu.cNamhyung Kim
util/tool_pmu.c: In function 'evsel__tool_pmu_read': util/tool_pmu.c:419:55: error: passing argument 2 of 'tool_pmu__read_event' from incompatible pointer type [-Werror=incompatible-pointer-types] 419 | if (!tool_pmu__read_event(ev, &val)) { | ^~~~ | | | long unsigned int * util/tool_pmu.c:335:56: note: expected 'u64 *' {aka 'long long unsigned int *'} but argument is of type 'long unsigned int *' 335 | bool tool_pmu__read_event(enum tool_pmu_event ev, u64 *result) | ~~~~~^~~~~~ Link: https://lore.kernel.org/r/Zw1XIGML32VaxE0t@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14tools/perf/tests: Remove duplicate evlist__delete in tests/tool_pmu.cAthira Rajeev
The testcase for tool_pmu failed in powerpc as below: ./perf test -v "Parsing without PMU name" 8: Tool PMU : 8.1: Parsing without PMU name : FAILED! This happens when parse_events results in either skip or fail of an event. Because the code invokes evlist__delete(evlist) and "goto out". ret = parse_events(evlist, str, &err); if (ret) { evlist__delete(evlist); But in the "out" section also evlist__delete happens. out: evlist__delete(evlist); return ret; Hence remove the duplicate evlist__delete from the first path in the testcase With the change: # ./perf test -v "Parsing without PMU name" 8: Tool PMU : 8.1: Parsing without PMU name : Ok Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241013170732.71339-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14tools/perf/tests: Fix compilation error with strncpy in tests/tool_pmuAthira Rajeev
perf fails to compile on systems with GCC version11 as below: In file included from /usr/include/string.h:519, from /home/athir/perf-tools-next/tools/include/linux/bitmap.h:5, from /home/athir/perf-tools-next/tools/perf/util/pmu.h:5, from /home/athir/perf-tools-next/tools/perf/util/evsel.h:14, from /home/athir/perf-tools-next/tools/perf/util/evlist.h:14, from tests/tool_pmu.c:3: In function ‘strncpy’, inlined from ‘do_test’ at tests/tool_pmu.c:25:3: /usr/include/bits/string_fortified.h:95:10: error: ‘__builtin_strncpy’ specified bound 128 equals destination size [-Werror=stringop-truncation] 95 | return __builtin___strncpy_chk (__dest, __src, __len, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 96 | __glibc_objsize (__dest)); | ~~~~~~~~~~~~~~~~~~~~~~~~~ The compile error is from strncpy refernce in do_test: strncpy(str, tool_pmu__event_to_str(ev), sizeof(str)); This behaviour is not observed with GCC version 8, but observed with GCC version 11 . This is message from gcc for detecting truncation while using strncpu. Use snprintf instead of strncpy here to be safe. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241013173742.71882-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf report: Display columns Predicted/Abort/Cycles in --branch-historyThomas Falcon
The original commit message: " Use current sort mechanism but the real .se_cmp() just returns 0 so that new columns "Predicted", "Abort" and "Cycles" are created in display but actually these keys are not the sort keys. For example: Overhead Source:Line Symbol Shared Object Predicted Abort Cycles ........ ............ ........ ............. ......... ..... ...... 38.25% div.c:45 [.] main div 97.6% 0 3 " Update missed commit from series "perf report: Show branch flags/cycles in --branch-history callgraph view" to apply to current repository so that new columns described above are visible. Link to original series: https://lore.kernel.org/lkml/1477876794-30749-1-git-send-email-yao.jin@linux.intel.com/ Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Suggested-by: Kan Liang <kan.liang@linux.intel.com> Co-developed-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20241010184046.203822-1-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tests: Add tool PMU testIan Rogers
Ensure parsing with and without PMU creates events with the expected config values. This ensures the tool.json doesn't get out of sync with tool_pmu_event enum. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tool_pmu: Switch to standard pmu functions and json descriptionsIan Rogers
Use the regular PMU approaches with tool json events to reduce the amount of special tool_pmu code - tool_pmu__config_terms and tool_pmu__for_each_event_cb are removed. Some functions remain, like tool_pmu__str_to_event, as conveniences to metricgroups. Add tool_pmu__skip_event/tool_pmu__num_skip_events to handle the case that tool json events shouldn't appear on certain architectures. This isn't done in jevents.py due to complexity in the empty-pmu-events.c and when all vendor json is built into the tool. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf jevents: Add tool event json under a common architectureIan Rogers
Introduce the notion of a common architecture/model that can be used to find event tables for common PMUs like the tool PMU. By having tool events be json standard PMU attribute configuration, descriptions, etc. can be used and these routines are already optimized for things like binary searching. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tool_pmu: Move expr literals to tool_pmuIan Rogers
Add the expr literals like "#smt_on" as tool events, this allows stat events to give the values. On my laptop with hyperthreading enabled: ``` $ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true Performance counter stats for 'true': 0 has_pmem 8 num_cores 16 num_cpus 16 num_cpus_online 1 num_dies 1 num_packages 1 smt_on 2,496,000,000 system_tsc_freq 0.001113637 seconds time elapsed 0.001218000 seconds user 0.000000000 seconds sys ``` And with hyperthreading disabled: ``` $ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true Performance counter stats for 'true': 0 has_pmem 8 num_cores 16 num_cpus 8 num_cpus_online 1 num_dies 1 num_packages 0 smt_on 2,496,000,000 system_tsc_freq 0.000802115 seconds time elapsed 0.000000000 seconds user 0.000806000 seconds sys ``` As zero matters for these values, in stat-display should_skip_zero_counter only skip the zero value if it is not the first aggregation index. The tool event implementations are used in expr but not evaluated as events for simplicity. Also core_wide isn't made a tool event as it requires command line parameters. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tool_pmu: Rename perf_tool_event__* to tool_pmu__*Ian Rogers
Now the events are associated with the tool PMU, rename the functions to reflect this. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tool_pmu: Rename enum perf_tool_event to tool_pmu_eventIan Rogers
To better reflect the events listed are from the tool PMU. Rename the enum values from PERF_TOOL_* to TOOL_PMU__EVENT_*. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tool_pmu: Factor tool events into their own PMUIan Rogers
Rather than treat tool events as a special kind of event, create a tool only PMU where the events/aliases match the existing duration_time, user_time and system_time events. Remove special parsing and printing support for the tool events, but add function calls for when PMU functions are called on a tool_pmu. Move the tool PMU code in evsel into tool_pmu.c to better encapsulate the tool event behavior in that file. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf parse-events: Expose/rename config_term_nameIan Rogers
Expose config_term_name as parse_events__term_type_str so that PMUs not in pmu.c may access it. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf pmu: Allow hardcoded terms to be applied to attributesIan Rogers
Hard coded terms like "config=10" are skipped by perf_pmu__config assuming they were already applied to a perf_event_attr by parse event's config_attr function. When doing a reverse number to name lookup in perf_pmu__name_from_config, as the hardcoded terms aren't applied the config value is incorrect leading to misses or false matches. Fix this by adding a parameter to have perf_pmu__config apply hardcoded terms too (not just in parse event's config_term_common). Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf pmu: Simplify an asprintf error messageIan Rogers
Use ifs rather than ?: to avoid a large compound statement. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tools: Remove unused color_fwrite_linesDr. David Alan Gilbert
color_fwrite_lines() was added by 2009's commit 8fc0321f1ad0 ("perf_counter tools: Add color terminal output support") but has never been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241009003938.254936-1-linux@treblig.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-09perf test x86: Fix typo in intel-pt-testThomas Falcon
Change function name "is_hydrid" to "is_hybrid". Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20241007194758.78659-1-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-09perf probe: Remove unused add_perf_probe_eventsDr. David Alan Gilbert
add_perf_probe_events has been unused since 2015's commit b02137cc6550 ("perf probe: Move print logic into cmd_probe()") which confusingly now uses perf_add_probe_events. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20240929010659.430208-1-linux@treblig.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-08Merge tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix an assert() to handle captured and unprocessed ARM CoreSight CPU traces - Fix static build compilation error when libdw isn't installed or is too old - Add missing include when building with !HAVE_DWARF_GETLOCATIONS_SUPPORT - Add missing refcount put on 32-bit DSOs - Fix disassembly of user space binaries by setting the binary_type of DSO when loading - Update headers with the kernel sources, including asound.h, sched.h, fcntl, msr-index.h, irq_vectors.h, socket.h, list_sort.c and arm64's cputype.h * tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cs-etm: Fix the assert() to handle captured and unprocessed cpu trace perf build: Fix build feature-dwarf_getlocations fail for old libdw perf build: Fix static compilation error when libdw is not installed perf dwarf-aux: Fix build with !HAVE_DWARF_GETLOCATIONS_SUPPORT tools headers arm64: Sync arm64's cputype.h with the kernel sources perf tools: Cope with differences for lib/list_sort.c copy from the kernel tools check_headers.sh: Add check variant that excludes some hunks perf beauty: Update copy of linux/socket.h with the kernel sources tools headers UAPI: Sync the linux/in.h with the kernel sources perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools include UAPI: Sync linux/fcntl.h copy with the kernel sources tools include UAPI: Sync linux/sched.h copy with the kernel sources tools include UAPI: Sync sound/asound.h copy with the kernel sources perf vdso: Missed put on 32-bit dsos perf symbol: Set binary_type of dso when loading
2024-10-03perf test attr: Add back missing topdown eventsVeronika Molnarova
With the patch 0b6c5371c03c "Add missing topdown metrics events" eight topdown metric events with numbers ranging from 0x8000 to 0x8700 were added to the test since they were added as 'perf stat' default events. Later the patch 951efb9976ce "Update no event/metric expectations" kept only 4 of those events(0x8000-0x8300). Currently, the topdown events with numbers 0x8400 to 0x8700 are missing from the list of expected events resulting in a failure. Add back the missing topdown events. Fixes: 951efb9976ce ("perf test attr: Update no event/metric expectations") Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Tested-by: Ian Rogers <irogers@google.com> Cc: mpetlan@redhat.com Link: https://lore.kernel.org/r/20240311081611.7835-1-vmolnaro@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03perf arm-spe: Dump metadata with version 2Leo Yan
This commit dumps metadata with version 2. It dumps metadata for header and per CPU data respectively in the arm_spe_print_info() function to support metadata version 2 format. After: 0 0 0x3c0 [0x1b0]: PERF_RECORD_AUXTRACE_INFO type: 4 Header version :2 Header size :4 PMU type v2 :13 CPU number :8 Magic :0x1010101010101010 CPU # :0 Num of params :3 MIDR :0x410fd801 PMU Type :-1 Min Interval :0 Magic :0x1010101010101010 CPU # :1 Num of params :3 MIDR :0x410fd801 PMU Type :-1 Min Interval :0 Magic :0x1010101010101010 CPU # :2 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :3 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :4 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :5 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :6 Num of params :3 MIDR :0x410fd850 PMU Type :-1 Min Interval :0 Magic :0x1010101010101010 CPU # :7 Num of params :3 MIDR :0x410fd850 PMU Type :-1 Min Interval :0 Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono <bwicaksono@nvidia.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241003184302.190806-6-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03perf arm-spe: Support metadata version 2Leo Yan
This commit is to support metadata version 2 and at the meantime it is backward compatible for version 1's format. The metadata version 1 doesn't include the ARM_SPE_HEADER_VERSION field. As version 1 is fixed with two u64 fields, by checking the metadata size, it distinguishes the metadata is version 1 or version 2 (and any new versions if later will have). For version 2, it reads out CPU number and retrieves the metadata info for every CPU. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono <bwicaksono@nvidia.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241003184302.190806-5-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03perf arm-spe: Save per CPU information in metadataLeo Yan
Save the Arm SPE information on a per-CPU basis. This approach is easier in the decoding phase for retrieving metadata based on the CPU number of every Arm SPE record. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono <bwicaksono@nvidia.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241003184302.190806-4-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>