git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2022-10-06	perf vendor events: Update Intel broadwellx	Ian Rogers
	Events remain at v19, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Uncore event updates by Zhengjun Xing <zhengjun.xing@linux.intel.com>. - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. - Latest metrics from: https://github.com/intel/perfmon-metrics Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf vendor events: Update Intel broadwell	Ian Rogers
	Events remain at v26, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf vendor events: Update Intel alderlake	Ian Rogers
	Events are updated to v1.15, the core metrics are based on TMA 4.4 full and the atom metrics on E-core TMA 2.2. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - Addition of all 6 levels of TMA metrics. Previously metrics involving topdown events were dropped. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. - Update mapfile.csv CPUIDs to match 01.org. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf vendor events: Update Intel skylakex	Ian Rogers
	Events remain at v1.28, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Removal of ScaleUnit from uncore events by Zhengjun Xing <zhengjun.xing@linux.intel.com>. - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. - Latest metrics from: https://github.com/intel/perfmon-metrics Tested on a skylakex manually and with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 93: perf all metricgroups test : Ok 94: perf all metrics test : Skip 95: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf metrics: Don't scale counts going into metrics	Ian Rogers
	Counts are scaled prior to going into saved_value, reverse the scaling so that metrics don't double scale values. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf expr: Remove jevents case workaround	Ian Rogers
	jevents.py no longer lowercases metrics and altering the case can cause hashmap lookups to fail, so remove. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf test: Adjust case of test metrics	Ian Rogers
	Icelake and later architectures have slots events and SLOTS metrics meaning case sensitivity is important. Make the test metrics case agree with the name of the metrics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf expr: Allow a double if expression	Ian Rogers
	Some TMA metrics have double if expressions like: ( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 + CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK ) ) if #core_wide < 1 else ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else CPU_CLK_UNHALTED.THREAD This currently fails to parse as the left hand side if expression needs to be in parentheses. By allowing the if expression to have a right hand side that is an if expression we can parse the expression above, with left to right evaluation order that matches languages like Python. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221004021612.325521-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf parse-events: Remove unused macros __PERF_EVENT_FIELD()	Chen Zhongjin
	Unused macros reported by [-Wunused-macros]. This macros were introduced as __PERF_COUNTER_FIELD and used for reading the bit in config. cdd6c482c9ff9c55 ("perf: Do the big rename: Performance Counters -> Performance Events") Changes it to __PERF_EVENT_FIELD but at this commit there is already nowhere else using these macros, also no macros called PERF_EVENT_##name##_MASK/SHIFT. Now we are not reading type or id from config. These macros are useless and incomplete. So removing them for code cleaning. Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220926031440.28275-5-chenzhongjin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf tools: Fix empty version number when building outside of a git repo	Will Chandler
	When perf is built in a full source tree that is not a git repository, e.g. from a kernel source tarball, `perf version` will print empty tag and commit strings: $ perf version perf version Currently the tag version is only generated from the root Makefile when building in a git repository. If PERF-VERSION-FILE has not been generated and the source tree is not in a git repository, then PERF-VERSION-GEN will return an empty version. The problem can be reproduced with the following steps: $ wget https://git.kernel.org/torvalds/t/linux-6.0-rc7.tar.gz $ tar -xf linux-6.0-rc7.tar.gz && cd linux-6.0-rc7 $ make -C tools/perf $ tools/perf/perf -v perf version Builds from tarballs generated with `make perf-tar-src-pkg` are not impacted by this issue as PERF-VERSION-FILE is included in the archive. The perf RPM provided by Fedora for 5.18+ is experiencing this problem. Package build logs[0] show that the build is attempting to fall back on PERF-VERSION-FILE, but it is not present. To resolve this, revert back to the previous logic of using the kernel Makefile version if not in a git repository and PERF-VERSION-FILE does not exist. [0] https://kojipkgs.fedoraproject.org/packages/kernel-tools/5.19.4/200.fc36/data/logs/x86_64/build.log Fixes: 7572733b84997d23 ("perf tools: Fix version kernel tag") Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Will Chandler <wfc@wfchandler.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220930151157.529674-1-wfc@wfchandler.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf lock: Remove unused struct lock_contention_key	Yuan Can
	The struct lock_contention_key is never used, remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20220927013931.110475-6-yuancan@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf jit: Remove unused struct debug_line_info	Yuan Can
	The struct debug_line_info is never used, remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20220927013931.110475-5-yuancan@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf metric: Remove unused struct metric_ref_node	Yuan Can
	After commit 46bdc0bf8d21 ("perf metric: Simplify metric_refs calculation"), no one use struct metric_ref_node, so remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20220927013931.110475-4-yuancan@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf annotate: Remove unused struct disasm_line_samples	Yuan Can
	After commit 3ab6db8d0f3b ("perf annotate browser: Use samples data from struct annotation_line"), no one use struct disasm_line_samples, so remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20220927013931.110475-3-yuancan@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf machine: Remove unused struct process_args	Yuan Can
	After commit a93f0e551af9 ("perf symbols: Get kernel start address by symbol name"), no one uses struct process_args any more, so remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20220927013931.110475-2-yuancan@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf lock contention: Fix a build error on 32-bit	Namhyung Kim
	It was reported that it failed to build the BPF lock contention skeleton on 32 bit arch due to the size of long. The lost count is used only for reporting errors due to lack of stackmap space through bad_hist which type is 'int'. Let's use int type then. Fixes: 6d499a6b3d90277d ("perf lock: Print the number of lost entries for BPF") Reported-by: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: http://lore.kernel.org/lkml/20220926215638.3931222-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06	perf stat: Clean redundant if in process_evlist	Shang XiaoJing
	Since the first if statment is covered by the following one, clean up the first if statment. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220922141438.22487-5-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: Introduce script for java symbol testing	Leo Yan
	This commit introduces a script for testing java symbols. The test records java program, inject samples with JIT samples, check specific JIT symbols in the report, the test will pass only when these two symbols are detected. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220925025835.70364-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf trace: Fix incorrectly parsed hexadecimal value for flags in filter	Chen Zhongjin
	When parsing flags in filter, the strtoul function uses wrong parsing condition (tok[1] = 'x'), which can make the flags be corrupted and treat all numbers start with 0 as hex. In fact strtoul() will auto test hex format when base == 0 (See _parse_integer_fixup_radix). So there is no need to test this again. Remove the unnessesary is_hexa test. Fixes: 154c978d484c6104 ("libbeauty: Introduce strarray__strtoul_flags()") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220926031440.28275-3-chenzhongjin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf trace: Fix show_arg_names not working for tp arg names	Chen Zhongjin
	trace__fprintf_tp_fields() will always print arg names because when implemented it is forced to print arg_names with: (1 \|\| trace->show_arg_names) So the printing looks like: > cat ~/.perfconfig [trace] show_arg_names = no > perf trace -e syscalls:mmap sleep 1 0.000 sleep/1119 syscalls:sys_enter_mmap(NULL, 8192, READ\|WRITE, PRIVATE\|ANONYMOUS) 0.179 sleep/1119 syscalls:sys_exit_mmap(__syscall_nr: 9, ret: 140535426170880) ... Although the comment said that perhaps we need a show_tp_arg_names. I don't think it's necessary to control them separately because it's not so clean that part of the log shows arg names but other not. Also when we are tracing functions it's rare to especially distinguish syscalls and tp trace. Only use one option to control arg names printing is more resonable and simple. So remove the force condition and commit. After fix: > perf trace -e syscalls:mmap sleep 1 0.000 sleep/1121 syscalls:sys_enter_mmap(NULL, 8192, READ\|WRITE, PRIVATE\|ANONYMOUS) 0.163 sleep/1121 syscalls:sys_exit_mmap(9, 140454467661824) ... Fixes: f11b2803bb88655d ("perf trace: Allow choosing how to augment the tracepoint arguments") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220926031440.28275-2-chenzhongjin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf string: Remove unused macro K()	Chen Zhongjin
	Unused macro reported by [-Wunused-macros]. This macro is introduced to calculate the 'unit' size, in: d2fb8b4151a92223 ("perf tools: Add new perf_atoll() function to parse string representing size in bytes") 8ba7f6c2faada3ad ("saner perf_atoll()") This commit has simplified the perf_atoll() function and remove the 'unit' variable. This macro is not deleted, but nowhere else is using it. A single letter macro is confusing and easy to be misused. So remove it for code cleaning. Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220926031440.28275-6-chenzhongjin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: Add kernel lock contention test	Namhyung Kim
	Add a new shell test to check if both normal 'perf lock record' + contention and BPF (with -b) option are working. Use 'perf bench sched messaging' as a workload since it creates some contention for sending and receiving messages. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220924004221.841024-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf lock: Add -q/--quiet option to suppress header and debug messages	Namhyung Kim
	Like in 'perf report', this option is to suppress header and debug messages. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220924004221.841024-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf lock: Add -E/--entries option	Namhyung Kim
	Like in 'perf top', the -E option can limit number of entries to print. It can be useful when users want to see top N contended locks only. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220924004221.841024-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: waiting.sh: Parameterize timeouts	Adrian Hunter
	Let helper functions accept a parameter to specify time out values in tenths of a second. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220914080150.5888-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Move helper functions for waiting	Adrian Hunter
	Move helper functions for waiting to a separate file so they can be shared. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220914080150.5888-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Add per-thread test	Adrian Hunter
	When tracing the kernel with Intel PT, text_poke events are recorded per-cpu. In per-thread mode that results in a mixture of per-thread and per-cpu events and mmaps. Check that happens correctly. The debug output from perf record -vvv is recorded and then awk used to process the debug messages that indicate what file descriptors were opened and whether they were mmapped or set-output. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220912083412.7058-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf tools: Add debug messages and comments for testing	Adrian Hunter
	Add debug messages to enable scripts to track aspects of 'perf record' behaviour. The messages will be consumed after 'perf record' has run, with the exception of "perf record has started" which is consequently flushed. Put comments so developers know which messages are also being used by test scripts. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Add more output in preparation for more tests	Adrian Hunter
	When there are more tests it won't be obvious which test failed. Add more output so that it is. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Fix return checking	Adrian Hunter
	The use of set -e will cause a function that returns non-zero to terminate the script unless the result is consumed by \|\| for example. That is OK if there is only 1 test function, but not if there are more. Prepare for more by using \|\|. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Use quotes around variable expansion	Adrian Hunter
	As suggested by shellcheck, use quotes around variable expansion. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Use grep -c instead of grep plus wc -l	Adrian Hunter
	As suggested by shellcheck, use grep -c instead of grep plus wc -l Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Stop using backticks	Adrian Hunter
	As suggested by shellcheck, stop using backticks. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Stop using expr	Adrian Hunter
	As suggested by shellcheck, stop using expr. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Fix redirection	Adrian Hunter
	As reported by shellcheck, 2>&1 must come after >/dev/null Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Use a temp directory	Adrian Hunter
	Create a directory for temporary files so that mktemp needs to be used only once. It also enables more temp files to be added without having to add them also to the cleanup. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: test_intel_pt.sh: Add cleanup function	Adrian Hunter
	Add a cleanup function that will still clean up if the script is terminated prematurely. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf tests: Fix 'perf probe' error log check in skip_if_no_debuginfo	Athira Rajeev
	The perf probe related tests like probe_vfs_getname.sh which is in "tools/perf/tests/shell" directory have dependency on debuginfo information in the kernel. Currently debuginfo check is handled by skip_if_no_debuginfo function in the file "lib/probe_vfs_getname.sh". skip_if_no_debuginfo function looks for this specific error log from perf probe to skip the testcase: <<>> Failed to find the path for the kernel\|Debuginfo-analysis is not supported <>> But in some case, like this one in powerpc, while running this test, observed error logs is: <<>> The /lib/modules/<version>/build/vmlinux file has no debug information. Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package. Error: Failed to add events. <<>> Update the skip_if_no_debuginfo function to include the above error, to skip the test in these scenarios too. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220916104904.99798-1-atrajeev@linux.vnet.ibm.com Reviewed-By: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf annotate: Toggle full address <-> offset display	Namhyung Kim
	Handle 'f' key to toggle the display offset and full address. Obviously it only works when users set to see disassembler output ('o' key). It'd be useful when users want to see the full virtual address in the TUI annotate browser. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf tools: Add 'addr' sort key	Namhyung Kim
	Sometimes users want to see actual (virtual) address of sampled instructions. Add a new 'addr' sort key to display the raw addresses. $ perf record -o- true \| perf report -i- -s addr # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] # # Total Lost Samples: 0 # # Samples: 12 of event 'cycles:u' # Event count (approx.): 252512 # # Overhead Address # ........ .................. # 42.96% 0x7f96f08443d7 29.55% 0x7f96f0859b50 14.76% 0x7f96f0852e02 8.30% 0x7f96f0855028 4.43% 0xffffffff8de01087 Note that it just compares and displays the sample ip. Each process can have a different memory layout and the ip will be different even if they run the same binary. So this sort key is mostly meaningful for per-process profile data. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf inject: Clarify build-id options a little bit	Namhyung Kim
	Update the documentation of --build-id and --buildid-all options to clarify the difference between them. The former requires full sample processing to find which DSOs are actually used. While the latter simply injects every DSO's build-id from MMAP{,2} records, skipping SAMPLEs. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf record: Fix a segfault in record__read_lost_samples()	Namhyung Kim
	When it fails to open events record__open() returns without setting the session->evlist. Then it gets a segfault in the function trying to read lost sample counts. You can easily reproduce it as a normal user like: $ perf record -p 1 true ... perf: Segmentation fault ... Skip the function if it has no evlist. And add more protection for evsels which are not properly initialized. Fixes: a49aa8a54e861af1 ("perf record: Read and inject LOST_SAMPLES events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220909235024.278281-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf top: Fix error code in cmd_top()	Shang XiaoJing
	There are three error paths which return success: 1. Propagate the errno from evlist__create_maps() if it failed. 2. Return -EINVAL if top.sb_evlist is NULL. 3. Return -EINVAL if evlist__add_bpf_sb_event() failed. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220922141438.22487-4-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf stat: Merge cases in process_evlist	Shang XiaoJing
	As two cases in process_evlist has same behavior, make the first fall through to the second. Commiter notes: Added __fallthrough, the kernel has "fallthrough", we need to make tools/ use it. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220922141438.22487-3-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf genelf: Fix error code in jit_write_elf()	Shang XiaoJing
	The error code is set to -1 at the beginning of jit_write_elf(), but it is assigned by jit_add_eh_frame_info() in the middle, hence the following error can only return the error code of jit_add_eh_frame_info(). Reset the error code to the default value after being assigned by jit_add_eh_frame_info(). Fixes: 086f9f3d7897d808 ("perf jit: Generate .eh_frame/.eh_frame_hdr in DSO") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stefano Sanfilippo <ssanfilippo@chromium.org> Link: https://lore.kernel.org/r/20220922141438.22487-2-shangxiaojing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf lock contention: Skip stack trace from BPF	Namhyung Kim
	Currently it collects stack traces to max size then skip entries. Because we don't have control how to skip perf callchains. But BPF can do it with bpf_get_stackid() with a flag. Say we have max-stack=4 and stack-skip=2, we get these stack traces. Before: After: .---> +---+ <--. .---> +---+ <--. \| \| \| \| \| \| \| \| \| +---+ usable \| +---+ \| max \| \| \| max \| \| \| stack +---+ <--' stack +---+ usable \| \| X \| \| \| \| \| \| +---+ skip \| +---+ \| \| \| X \| \| \| \| \| `---> +---+ `---> +---+ <--' <=== collection \| X \| +---+ skip \| X \| +---+ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf lock contention: Allow to change stack depth and skip	Namhyung Kim
	It needs stack traces to find callers of locks. To minimize the performance overhead it only collects up to 8 entries for each stack trace. And it skips first 3 entries as they came from BPF, tracepoint and lock functions which are not interested for most users. But it turned out that those numbers are different in some configuration. Using fixed number can result in non meaningful caller names. Let's make them adjustable with --stack-depth and --skip-stack options. On my setup, the default output is like below: # /perf lock con -ab -F contended,wait_total sleep 3 contended total wait type caller 28 4.55 ms rwlock:W __bpf_trace_contention_begin+0xb 33 1.67 ms rwlock:W __bpf_trace_contention_begin+0xb 12 580.28 us spinlock __bpf_trace_contention_begin+0xb 60 240.54 us rwsem:R __bpf_trace_contention_begin+0xb 27 64.45 us spinlock __bpf_trace_contention_begin+0xb If I change the stack skip to 5, the result will be like: # perf lock con -ab -F contended,wait_total --stack-skip 5 sleep 3 contended total wait type caller 32 715.45 us spinlock folio_lruvec_lock_irqsave+0x61 26 550.22 us spinlock folio_lruvec_lock_irqsave+0x61 15 486.93 us rwsem:R mmap_read_lock+0x13 12 139.66 us rwsem:W vm_mmap_pgoff+0x93 1 7.04 us spinlock tick_do_update_jiffies64+0x25 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf lock contention: Show full callstack with -v option	Namhyung Kim
	Currently it shows a caller function for each entry, but users need to see the full call stacks sometimes. Use -v/--verbose option to do that. # perf lock con -a -b -v sleep 3 Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols contended total wait max wait avg wait type caller 1 10.74 us 10.74 us 10.74 us spinlock __bpf_trace_contention_begin+0xb 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffbb8b8e75 bpf_trace_run2+0x35 0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb 0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5 0xffffffffbc1c26ff _raw_spin_lock+0x1f 0xffffffffbb841015 tick_do_update_jiffies64+0x25 0xffffffffbb8409ee tick_irq_enter+0x9e 1 7.70 us 7.70 us 7.70 us spinlock __bpf_trace_contention_begin+0xb 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffbb8b8e75 bpf_trace_run2+0x35 0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb 0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5 0xffffffffbc1c26ff _raw_spin_lock+0x1f 0xffffffffbb7bc27e raw_spin_rq_lock_nested+0xe 0xffffffffbb7cef9c load_balance+0x66c Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf lock contention: Factor out get_symbol_name_offset()	Namhyung Kim
	It's to convert addr to symbol+offset. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220912055314.744552-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04	perf test: Add basic core_wide expression test	Ian Rogers
	Add basic test for coverage, similar to #smt_on. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220831174926.579643-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>