Rather than copying the path and appending the directory entry into a
fresh path buffer, append to the existing path at the point reached by
the current recursion level. This saves a PATH_MAX buffer per recursion
level and some unnecessary copying.
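For illustration, a minimal sketch of the append-and-truncate pattern
(plain opendir()/readdir() and made-up names, not the perf code):

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Recurse into 'path' (a buffer of 'size' bytes), appending each entry
 * name in place and truncating back before the next iteration, so no
 * extra PATH_MAX buffer is needed per recursion level. */
static void walk(char *path, size_t size)
{
	size_t len = strlen(path);
	DIR *dir = opendir(path);
	struct dirent *ent;

	if (!dir)
		return;
	while ((ent = readdir(dir)) != NULL) {
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;
		/* Append at the current end of the shared buffer... */
		snprintf(path + len, size - len, "/%s", ent->d_name);
		if (ent->d_type == DT_DIR)
			walk(path, size);
		path[len] = '\0';	/* ...and truncate back for the next entry. */
	}
	closedir(dir);
}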
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, which allocates about 1kb on the stack.
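For illustration, a minimal sketch of the io_dir pattern used by this
and the other conversions in the series, assuming the
io_dir__init()/io_dir__readdir() helpers from tools/lib/api/io_dir.h;
the scanned path is just an example:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <api/io_dir.h>	/* io_dir, io_dir__init(), io_dir__readdir() */

static void list_event_sources(void)
{
	int fd = open("/sys/bus/event_source/devices", O_RDONLY | O_DIRECTORY);
	struct io_dir iod;
	struct io_dirent64 *ent;

	if (fd < 0)
		return;
	/* The directory state lives in a ~1kb buffer inside 'iod' on the
	 * stack, no opendir()/DIR heap allocation. */
	io_dir__init(&iod, fd);
	while ((ent = io_dir__readdir(&iod)) != NULL) {
		if (ent->d_name[0] == '.')
			continue;
		printf("%s\n", ent->d_name);
	}
	close(fd);
}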
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, which allocates about 1kb on the stack.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
This avoids scandir reading the directory into heap-allocated memory
and instead allocates on the stack.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250222061015.303622-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Switch memory_node__read and build_mem_topology from opendir/readdir
to io_dir__readdir, with smaller stack allocations. Reduces peak
memory consumption of perf record by 10kb.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, which allocates about 1kb on the stack.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Compared to glibc's opendir/readdir this lowers the max RSS of perf
record by 1.8MB on a Debian machine.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
before parsing"), names (generally event names) excluded hyphen (minus)
symbols, as the formation of legacy names with hyphens was handled in
the yacc code. That commit allowed hyphens, supposedly making
name_minus unnecessary. However, changing name_minus to name has
issues in the term config tokens, as name then ends up having priority
over numbers, and name has allowed matching numbers since commit
5ceb57990bf4 ("perf parse: Allow tracepoint names to start with
digits"). It is also permissible for a name to match with a colon (':')
in it when it's in a config term list. To address this, rename
name_minus to term_name, make its pattern match name's except for the
colon, and add number matching to the config term region with a higher
priority than name matching. This addresses an inconsistency and allows
greater matching for names inside of term lists, for example, they may
start with a number.
Rename name_tag to quoted_name and update comments and helper
functions to avoid str detecting quoted strings, which was already
done by the lexer.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250109175401.161340-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When testing perf trace on NixOS, I noticed significant startup delays:
- `ls`: ~2ms
- `strace ls`: ~10ms
- `perf trace ls`: ~550ms
Profiling showed that 51% of the time is spent reading files,
26% in loading BPF programs, and 11% in `newfstatat`.
This patch optimizes module path exploration by avoiding `stat()` calls
unless necessary. For filesystems that do not implement `d_type`
(DT_UNKNOWN), it falls back to the old behavior.
See `readdir(3)` for details.
This reduces `perf trace ls` time to ~500ms.
A more thorough startup optimization based on command parameters would
be ideal, but that is a larger effort.
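For illustration, the d_type fallback pattern with standard dirent(3)
fields (made-up helper name, not the exact perf code):

#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/* Use d_type when the filesystem provides it and fall back to stat()
 * only for DT_UNKNOWN, saving one syscall per directory entry. */
static int entry_is_dir(const char *parent, const struct dirent *ent)
{
	char path[PATH_MAX];
	struct stat st;

	if (ent->d_type != DT_UNKNOWN)
		return ent->d_type == DT_DIR;

	snprintf(path, sizeof(path), "%s/%s", parent, ent->d_name);
	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}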
Signed-off-by: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Acked-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In sysfs, the perf events are all located in
/sys/bus/event_source/devices/, but some places ended up hard-coding
the location to be at the root of /sys/devices/, which could be very
risky as you do not know exactly what type of device you are accessing
in sysfs at that location.
So fix this all up by properly pointing everything at the bus device
list instead of the root of the sysfs devices/ tree.
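For illustration only (not the exact perf code), PMU attributes should
be resolved through the event_source bus rather than the sysfs devices
root:

#include <stdio.h>

/* Build the path of a PMU's 'type' attribute through the event_source
 * bus, e.g. /sys/bus/event_source/devices/cpu/type, instead of
 * guessing at /sys/devices/cpu/type. */
static int pmu_type_path(char *buf, size_t size, const char *pmu_name)
{
	return snprintf(buf, size, "/sys/bus/event_source/devices/%s/type",
			pmu_name);
}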
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/2025021955-implant-excavator-179d@gregkh
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Reorder the struct fields by size to reduce padding and shrink
struct simd_flags from 8 bytes to 1 byte.
This reduces struct hist_entry size by 8 bytes (592->584),
and leaves a single, more usable 6-byte padding hole.
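A hypothetical example of the size-based reordering technique (not the
actual simd_flags/hist_entry layout):

#include <stdint.h>
#include <stdio.h>

/* Descending-size ordering packs the small members into what would
 * otherwise be padding after the 64-bit fields. */
struct before {
	uint8_t  flag_a;	/* 7 bytes of padding follow */
	uint64_t count;
	uint8_t  flag_b;	/* 7 bytes of padding follow */
	uint64_t period;
};				/* sizeof == 32 on typical LP64 targets */

struct after {
	uint64_t count;
	uint64_t period;
	uint8_t  flag_a;
	uint8_t  flag_b;	/* 6 bytes of tail padding */
};				/* sizeof == 24 on typical LP64 targets */

int main(void)
{
	printf("%zu -> %zu\n", sizeof(struct before), sizeof(struct after));
	return 0;
}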
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/7c1cb1c8f9901e945162701ba7269d0f9c70be89.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add a record/report --latency flag that allows capturing and showing
latency-centric profiles rather than the default CPU-consumption-centric
profiles. For latency profiles, record captures context switch events
and report shows Latency as the first column.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/e9640464bcbc47dde2cb557003f421052ebc9eec.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The latency output field is similar to overhead, but represents overhead
for latency rather than CPU consumption. It's re-scaled from overhead by
dividing the weight by the current parallelism level at the time of the
sample. It effectively models profiling with one sample taken per unit
of wall-clock time rather than per unit of CPU time.
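A sketch of the rescaling, with illustrative names rather than the
exact report code:

/* A sample taken while N threads of the workload were on CPU counts
 * as 1/N of a wall-clock sample: latency weight = weight / parallelism. */
static double latency_weight(double weight, unsigned int parallelism)
{
	return parallelism ? weight / parallelism : weight;
}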
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/b6269518758c2166e6ffdc2f0e24cfdecc8ef9c1.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add parallelism filter that can be used to look at specific parallelism
levels only. The format is the same as cpu lists. For example:
Only single-threaded samples: --parallelism=1
Low parallelism only: --parallelism=1-4
High parallelism only: --parallelism=64-128
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/e61348985ff0a6a14b07c39e880edbd60a8f8635.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
All u8 bits are already taken, so adding one more filter leads to an
unpleasant failure mode where the code compiles without warnings but the
last filters silently don't work. Add a typedef and switch to u16.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/32b4ce1731126c88a2d9e191dc87e39ae4651cb7.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Show parallelism level in profiles if requested by user.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/7f7bb87cbaa51bf1fb008a0d68b687423ce4bad4.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add calculation of the current parallelism level (the number of threads
actively running on CPUs). The parallelism level can be shown in reports
on its own, and is used to calculate latency overheads.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/0f8c1b8eb12619029e31b3d5c0346f4616a5aeda.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The struct dump_regs contains 512 bytes of cache_regs, meaning the two
values in perf_sample contribute 1088 bytes of its total 1384-byte
size. Initializing this much memory has a cost, reported by Tavian
Barnes <tavianator@tavianator.com> as about 2.5% when running `perf
script --itrace=i0`:
https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/
Adrian Hunter <adrian.hunter@intel.com> replied that the zero
initialization was necessary and couldn't simply be removed.
This patch aims to strike a middle ground of still zeroing the
perf_sample, but removing 79% of its size by making user_regs and
intr_regs optional pointers to zalloc-ed memory. To support the
allocation, accessors are created for user_regs and intr_regs. To
support correct cleanup, perf_sample__init and perf_sample__exit
functions are created and added throughout the code base.
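An illustrative shape of the change (simplified stand-in types; the
accessor name below is an assumption based on the commit text, not
necessarily the real one):

#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the large register dump. */
struct dump_regs { unsigned long cache_regs[64]; };

struct perf_sample {
	unsigned long ip;
	struct dump_regs *user_regs;	/* was an inline struct */
	struct dump_regs *intr_regs;	/* was an inline struct */
};

static void perf_sample__init(struct perf_sample *sample)
{
	/* Zero a much smaller struct now that the dumps are pointers. */
	memset(sample, 0, sizeof(*sample));
}

/* Lazily allocate zeroed register state only when it is needed. */
static struct dump_regs *perf_sample__user_regs(struct perf_sample *sample)
{
	if (!sample->user_regs)
		sample->user_regs = calloc(1, sizeof(*sample->user_regs));
	return sample->user_regs;
}

static void perf_sample__exit(struct perf_sample *sample)
{
	free(sample->user_regs);
	free(sample->intr_regs);
}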
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250113194345.1537821-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
I found that it failed to load a binary when using the --symfs option.
Say I have a binary in /home/user/prog/xxx and a perf data file with it.
If I move them to a different machine and use --symfs, it tries to find
the binary in several locations using dso__read_binary_type_filename(),
but the last one is not under symfs.
${symfs}/usr/lib/debug/home/user/prog/xxx.debug
${symfs}/usr/lib/debug/home/user/prog/xxx
${symfs}/home/user/prog/.debug/xxx
/home/user/prog/xxx
It should check ${symfs}/home/user/prog/xxx. Let's fix it.
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250212221445.437481-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
It was only used in perf trace, which has switched to using a hashmap
instead. Let's delete the code.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250205205443.1986408-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Some topdown related metrics may fail on hybrid machines.
$ perf stat -M tma_frontend_bound
Cannot resolve IDs for tma_frontend_bound:
cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 * cpu_atom@CPU_CLK_UNHALTED.CORE@)
In find_tool_events(), tool_pmu__event_to_str() is used to compare the
tool_events. It only checks the event name, not the PMU or arch. So
tool_events[TOOL_PMU__EVENT_SLOTS] is set to true, because the p-core
topdown metrics have a "slots" event.
The tool_events array is shared, so when parsing the e-core metrics
"slots" is automatically added.
The "slots" event as a tool event should only be available on arm64; it
has a different meaning on X86. tool_pmu__skip_event() is intended to
handle this case. Apply it in tool_pmu__event_to_str() as well.
There is also a missing sanity check in expr__get_id(). Add the check.
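A sketch of the intended check; the declarations below are assumed
shapes of the helpers named above, not their real signatures:

#include <stdbool.h>
#include <stddef.h>

const char *tool_pmu__event_to_str(int tool_event);	/* assumed shape */
bool tool_pmu__skip_event(const char *name);		/* assumed shape */

/* Refuse tool events that are skipped on this architecture, so x86 no
 * longer treats "slots" as a tool event when parsing metrics. */
static const char *tool_event_name_for_this_arch(int tool_event)
{
	const char *name = tool_pmu__event_to_str(tool_event);

	if (name && tool_pmu__skip_event(name))
		return NULL;	/* e.g. "slots" is arm64-only as a tool event */
	return name;
}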
Closes: https://lore.kernel.org/lkml/608077bc-4139-4a97-8dc4-7997177d95c4@linux.intel.com/
Fixes: 069057239a67 ("perf tool_pmu: Move expr literals to tool_pmu")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: thomas.falcon@intel.com
Link: https://lore.kernel.org/r/20250207152844.302167-1-kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The last use of machine__fprintf_vmlinux_path() was removed in 2011 by
commit ab81f3fd350c ("perf top: Reuse the 'report' hist_entry/hists
classes")
mmap_cpu_mask__duplicate() was added in 2021 by
commit 6bd006c6eb7f ("perf mmap: Introduce mmap_cpu_mask__duplicate()")
but hasn't been used since.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250204220545.456435-1-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
To get the various fixes in the current master.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The existing logic would disable uniquification on an evlist or enable
it per evsel. This is unfortunate as uniquification is most needed
when events have the same name, and so the whole evlist must be
considered. Change the initial "disable uniquification" pass over the
evlist to also set a needs_uniquify flag, for cases like matching event
names. This must be done as an initial pass as uniquification of an
event name will change the behavior of the check. Keep the per-counter
uniquification but now only uniquify event names when the
needs_uniquify flag is set.
Before this change a hwmon event like temp1 wouldn't be uniquified and
afterwards it will be (i.e. the PMU is added to the temp1 event's name).
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Counter merging was added in commit 942c5593393d ("perf stat: Add
perf_stat_merge_counters()"). However, it merges events with the same
name on different PMUs regardless of whether the different PMUs are
actually of the same type (i.e. they differ only in the suffix on the
PMU name). For hwmon events there may be a temp1 event on every PMU,
but the PMU names are all unique and don't differ just by a suffix. The
merging is overeager and will merge all the hwmon counters together,
meaning an aggregated and very large temp1 value is shown. The same
would be true for, say, cache events and memory controller events where
the PMUs differ but the event names are the same.
Fix the problem by correctly saying two PMUs alias only when they
differ just by a suffix.
Note, there is an overlap between the evsel's merged_stat used for
aggregation and the evsel's metric_leader, where aggregation happens
for metrics.
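As an illustration of "differ only by suffix" (not the perf
implementation):

#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Strip a trailing run of digits (and an optional '_' before it) to
 * get the PMU name stem. */
static size_t pmu_name_stem_len(const char *name)
{
	size_t len = strlen(name);

	while (len && isdigit((unsigned char)name[len - 1]))
		len--;
	if (len && name[len - 1] == '_')
		len--;
	return len;
}

/* "uncore_cha_0" and "uncore_cha_11" alias; "hwmon_foo" and
 * "hwmon_bar" do not, so their temp1 counters won't be merged. */
static bool pmu_names_alias(const char *a, const char *b)
{
	size_t la = pmu_name_stem_len(a), lb = pmu_name_stem_len(b);

	return la == lb && !strncmp(a, b, la);
}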
Fixes: 942c5593393d ("perf stat: Add perf_stat_merge_counters()")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Wildcard PMU naming will match a name like pmu_1 to a PMU name like
pmu_10 but not to a PMU name like pmu_2 as the suffix forms part of
the match. No-suffix matching will match pmu_10 to either pmu_1 or
pmu_2. Add or rename matching functions on PMU to make it clearer what
kind of matching is being performed.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Rather than scanning core or all PMUs, allow pmu_read_sysfs to read
some combination of core, other, hwmon and tool PMUs. The PMUs that
should be read, and those that have already been read, are held as
bitmaps. It is known that a "hwmon_" prefix is necessary for a hwmon
PMU's name, and similarly with "tool", so only scan those PMUs in
situations where the PMU name or the PMU's type number makes that
sensible.
The number of openat system calls reduces from 276 to 98 for a hwmon
event. The number of openats for regular perf events isn't changed.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
evsel__is_hybrid returns true if there are multiple core PMUs and the
evsel is for a core PMU. Determining the number of core PMUs can
require loading/scanning PMUs. There's no point doing the scanning if
the evsel for the is_hybrid test isn't core, so reorder the tests to
reduce PMU scanning.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
perf test 11 hwmon fails on s390 with this error
# ./perf test -Fv 11
--- start ---
---- end ----
11.1: Basic parsing test : Ok
--- start ---
Testing 'temp_test_hwmon_event1'
Using CPUID IBM,3931,704,A01,3.7,002f
temp_test_hwmon_event1 -> hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
FAILED tests/hwmon_pmu.c:189 Unexpected config for
'temp_test_hwmon_event1', 292470092988416 != 655361
---- end ----
11.2: Parsing without PMU name : FAILED!
--- start ---
Testing 'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/'
FAILED tests/hwmon_pmu.c:189 Unexpected config for
'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/',
292470092988416 != 655361
---- end ----
11.3: Parsing with PMU name : FAILED!
#
The root cause is in member test_event::config, which is initialized
to 0xA0001 or 655361. During event parsing a long list of event parsing
functions is called, ending up with this gdb call stack:
#0 hwmon_pmu__config_term (hwm=0x168dfd0, attr=0x3ffffff5ee8,
term=0x168db60, err=0x3ffffff81c8) at util/hwmon_pmu.c:623
#1 hwmon_pmu__config_terms (pmu=0x168dfd0, attr=0x3ffffff5ee8,
terms=0x3ffffff5ea8, err=0x3ffffff81c8) at util/hwmon_pmu.c:662
#2 0x00000000012f870c in perf_pmu__config_terms (pmu=0x168dfd0,
attr=0x3ffffff5ee8, terms=0x3ffffff5ea8, zero=false,
apply_hardcoded=false, err=0x3ffffff81c8) at util/pmu.c:1519
#3 0x00000000012f88a4 in perf_pmu__config (pmu=0x168dfd0, attr=0x3ffffff5ee8,
head_terms=0x3ffffff5ea8, apply_hardcoded=false, err=0x3ffffff81c8)
at util/pmu.c:1545
#4 0x00000000012680c4 in parse_events_add_pmu (parse_state=0x3ffffff7fb8,
list=0x168dc00, pmu=0x168dfd0, const_parsed_terms=0x3ffffff6090,
auto_merge_stats=true, alternate_hw_config=10)
at util/parse-events.c:1508
#5 0x00000000012684c6 in parse_events_multi_pmu_add (parse_state=0x3ffffff7fb8,
event_name=0x168ec10 "temp_test_hwmon_event1", hw_config=10,
const_parsed_terms=0x0, listp=0x3ffffff6230, loc_=0x3ffffff70e0)
at util/parse-events.c:1592
#6 0x00000000012f0e4e in parse_events_parse (_parse_state=0x3ffffff7fb8,
scanner=0x16878c0) at util/parse-events.y:293
#7 0x00000000012695a0 in parse_events__scanner (str=0x3ffffff81d8
"temp_test_hwmon_event1", input=0x0, parse_state=0x3ffffff7fb8)
at util/parse-events.c:1867
#8 0x000000000126a1e8 in __parse_events (evlist=0x168b580,
str=0x3ffffff81d8 "temp_test_hwmon_event1", pmu_filter=0x0,
err=0x3ffffff81c8, fake_pmu=false, warn_if_reordered=true,
fake_tp=false) at util/parse-events.c:2136
#9 0x00000000011e36aa in parse_events (evlist=0x168b580,
str=0x3ffffff81d8 "temp_test_hwmon_event1", err=0x3ffffff81c8)
at /root/linux/tools/perf/util/parse-events.h:41
#10 0x00000000011e3e64 in do_test (i=0, with_pmu=false, with_alias=false)
at tests/hwmon_pmu.c:164
#11 0x00000000011e422c in test__hwmon_pmu (with_pmu=false)
at tests/hwmon_pmu.c:219
#12 0x00000000011e431c in test__hwmon_pmu_without_pmu (test=0x1610368
<suite.hwmon_pmu>, subtest=1) at tests/hwmon_pmu.c:23
where the attr::config is set to value 292470092988416 or 0x10a0000000000
in line 625 of file ./util/hwmon_pmu.c:
attr->config = key.type_and_num;
However, member key::type_and_num is defined as a union with bit fields:
union hwmon_pmu_event_key {
long type_and_num;
struct {
int num :16;
enum hwmon_type type :8;
};
};
s390 is a big-endian and Intel a little-endian architecture.
The events for the hwmon dummy pmu have num = 1 or num = 2 and
type is set to HWMON_TYPE_TEMP (which is 10).
On s390 this assigns member key::type_and_num the value
0x10a0000000000 (which is 292470092988416), as shown in the above
trace output.
Fix this and export the structure/union hwmon_pmu_event_key
so the test shares the same implementation of the union and bit
fields as the event parsing functions. This should avoid
endianness issues on all platforms.
Output after:
# ./perf test -F 11
11.1: Basic parsing test : Ok
11.2: Parsing without PMU name : Ok
11.3: Parsing with PMU name : Ok
#
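For reference, a self-contained program showing why reading the bit
fields back through the long differs between endiannesses; the union
mirrors hwmon_pmu_event_key and the values match the trace output
above:

#include <stdio.h>

/* For num = 1, type = 10 on an LP64 system, the long reads back as
 * 0xa0001 (655361) on little endian but 0x10a0000000000
 * (292470092988416) on big endian. */
union key {
	long type_and_num;
	struct {
		int num :16;
		int type :8;	/* enum hwmon_type in perf */
	};
};

int main(void)
{
	union key k = { .type_and_num = 0 };

	k.num = 1;
	k.type = 10;	/* HWMON_TYPE_TEMP */
	printf("0x%lx\n", (unsigned long)k.type_and_num);
	return 0;
}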
Fixes: 531ee0fd4836 ("perf test: Add hwmon "PMU" test")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250131112400.568975-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
This is also used in util/comm.c now, so instead of selectively doing
the feature test, always do it. If it's ever used anywhere else it's
less likely to cause another build failure.
This doesn't remove the need to manually include libc_compat.h, and
missing that will still cause an error for glibc < 2.26. There isn't a
way to fix that without poisoning reallocarray like libbpf did, but that
has other downsides like making memory debugging tools less useful. So
for Perf keep it like this and we'll have to fix up any missed includes.
Fixes the following build error:
util/comm.c:152:31: error: implicit declaration of function
'reallocarray' [-Wimplicit-function-declaration]
152 | tmp = reallocarray(comm_strs->strs,
| ^~~~~~~~~~~~
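For context, reallocarray() is an overflow-checked realloc(); a
fallback along these lines is what libc_compat.h provides (illustrative
sketch, not the exact code):

#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* realloc(ptr, nmemb * size) that fails with ENOMEM instead of
 * silently wrapping the multiplication. */
static void *compat_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	if (size && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}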
Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Ali Utku Selen <ali.utku.selen@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250129154405.777533-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Legacy events typically don't have a PMU when added, leading to
mismatched legacy/non-legacy cases in find_stat. Use evsel__find_pmu
to make sure the evsel PMU is looked up. Update the evsel__find_pmu
code to look for the PMU using the extended config type or, for legacy
hardware/hw_cache events on non-hybrid systems, just use the core PMU.
Before:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
Performance counter stats for 'system wide':
215,309,764 cycles
44,326,491 cpu/instructions/
1.002555314 seconds time elapsed
```
After:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
Performance counter stats for 'system wide':
990,676,332 cycles
1,235,762,487 cpu/instructions/ # 1.25 insn per cycle
1.002667198 seconds time elapsed
```
Fixes: 3612ca8e2935 ("perf stat: Fix the hard-coded metrics calculation on the hybrid")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add a helper to get the name of the evsel's PMU. This handles the case
where there's no sysfs PMU via the parse_events event_type helper.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Now that filename__read_int() returns -errno instead of -1, these
statements need to be updated, otherwise error values will be used as
die IDs.
This appears as a -2 die ID when the platform doesn't export one:
$ perf stat --per-core -a -- true
S36-D-2-C0 1 9.45 msec cpu-clock
And the session topology test fails:
$ perf test -vvv topology
CPU 0, core 0, socket 36
CPU 1, core 1, socket 36
CPU 2, core 2, socket 36
CPU 3, core 3, socket 36
FAILED tests/topology.c:137 Cpu map - Die ID doesn't match
---- end(-1) ----
38: Session topology : FAILED!
Fixes: 05be17eed774 ("tool api fs: Correctly encode errno for read/write open failures")
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241218115552.912517-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Prior to this change a string was used, which could cause issues with
an unrecognized disassembler in symbol__disassembler. Change to
initializing an array of perf_disassembler enum values. If a value
already exists then adding it a second time is ignored, avoiding the
array out-of-bounds problems present in the previous code; this also
allows a statically sized array and removes the need for memory
allocation. Errors in the disassembler string are reported when the
config is parsed during perf annotate or perf top start-up. If the
array is uninitialized after processing the config file, the default
llvm, capstone then objdump values are added, but without a need to
parse a string.
Fixes: a6e8a58de629 ("perf disasm: Allow configuring what disassemblers to use")
Closes: https://lore.kernel.org/lkml/CAP-5=fUdfCyxmEiTpzS2uumUp3-SyQOseX2xZo81-dQtWXj6vA@mail.gmail.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250124043856.1177264-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
As reported by Namhyung Kim and acknowledged by Qiao Zhao (link:
https://lore.kernel.org/linux-perf-users/20241206001436.1947528-1-namhyung@kernel.org/),
on certain machines, perf trace failed to load the BPF program into the
kernel. The verifier runs perf trace's BPF program for up to 1 million
instructions, returning an E2BIG error, whereas the perf trace BPF
program should be much less complex than that. This patch aims to fix
the issue described above.
The E2BIG problem from clang-15 to clang-16 is caused by this line:
} else if (size < 0 && size >= -6) { /* buffer */
Specifically this check: size < 0. It seems like clang generates a cool
optimization of this sign check that breaks things.
Making 'size' s64 and using
} else if ((int)size < 0 && size >= -6) { /* buffer */
solves the problem. This is some Hogwarts magic.
And the unbounded access of clang-12 and clang-14 (clang-13 works this
time) is fixed by making variable 'aug_size' s64.
As for this:
-if (aug_size > TRACE_AUG_MAX_BUF)
- aug_size = TRACE_AUG_MAX_BUF;
+aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index];
This makes the BPF skel generated by clang-18 work. Yes, new clangs
introduce problems too.
Sorry, I only know that it works, but I don't know how it works. I'm not
an expert in the BPF verifier. I really hope this is not a kernel
version issue, as that would make the test case (kernel_nr) *
(clang_nr), a true horror story. I will test it on more kernel versions
in the future.
Fixes: 395d38419f18 ("perf trace augmented_raw_syscalls: Add more checks to pass the verifier")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241213023047.541218-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
An evsel idx may not be stable due to sorting, evlist removal,
etc. Try to reduce its use in APIs by explicitly passing the evsel in
the annotate code. Internally the code just reads evsel->core.idx, so
behavior is unchanged.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Chen Ni <nichen@iscas.ac.cn>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20250117181848.690474-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When a filtered column is not present in the sort order, profiles become
arbitrarily broken. Filtered and non-filtered entries are collapsed
together, and the filtered-by field ends up with a random value (either
from a filtered or non-filtered entry). If we end up with a filtered
entry/value, then the whole collapsed entry will be filtered out and will
be missing from the profile. If we end up with a non-filtered entry/value,
then the overhead value will be wrongly larger (including some subset
of the filtered-out samples).
This leads to very confusing profiles. The problem is hard to notice,
and, if noticed, hard to understand. If the filter is for a single value,
then it can be fixed by adding the corresponding field to the sort order
(provided the user understood the problem). But if the filter is for
multiple values, it's impossible to fix because there is no concept of
binary sorting based on a filter predicate (we want to group all
non-filtered values in one bucket, and all filtered values in another).
Examples of affected commands:
perf report --tid=123
perf report --sort overhead,symbol --comm=foo,bar
Fix this by considering filtered status as the highest priority
sort/collapse predicate.
As a side effect this effectively adds a new feature of showing profile
where several lines are combined based on arbitrary filtering predicate.
For example, showing symbols from binaries foo and bar combined together,
but not from other binaries; or showing combined overhead of several
particular threads.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/359dc444ce94d20e59d3a9e360c36fbeac833a04.1736927981.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Application of cmp/sort/collapse fmt callbacks is duplicated 6 times.
Factor it into a common helper function. NFC.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/84c4b55614e24a344f86ae0db62e8fa8f251f874.1736927981.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
To allow for setting a variable from some other tool, like with the
"wallclock" patchset, we need to allow the user to opt in to having
that key in the sort order for 'perf report'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/lkml/Z4akewi7UPXpagce@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename err to out to avoid confusion because buf is still supposed to be
freed in non-error cases.
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241211085525.519458-3-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the kernel is built without debuginfo, running 'perf record' with
--off-cpu results in a segfault as below:
./perf record --off-cpu -e dummy sleep 1
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux
libbpf: failed to find valid kernel BTF
Segmentation fault (core dumped)
The backtrace pointed to:
#0 0x00000000100fb17c in btf.type_cnt ()
#1 0x00000000100fc1a8 in btf_find_by_name_kind ()
#2 0x00000000100fc38c in btf.find_by_name_kind ()
#3 0x00000000102ee3ac in off_cpu_prepare ()
#4 0x000000001002f78c in cmd_record ()
#5 0x00000000100aee78 in run_builtin ()
#6 0x00000000100af3e4 in handle_internal_command ()
#7 0x000000001001004c in main ()
Code sequence is:
static void check_sched_switch_args(void)
{
struct btf *btf = btf__load_vmlinux_btf();
const struct btf_type *t1, *t2, *t3;
u32 type_id;
type_id = btf__find_by_name_kind(btf, "btf_trace_sched_switch",
BTF_KIND_TYPEDEF);
btf__load_vmlinux_btf() fails when CONFIG_DEBUG_INFO_BTF is not enabled.
Here btf__find_by_name_kind() calls btf__type_cnt() with a NULL btf
value and results in a segfault.
To fix this, add a check that btf is not NULL before invoking
btf__find_by_name_kind().
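A sketch of the described fix (not the exact hunk):

#include <bpf/btf.h>	/* btf__load_vmlinux_btf(), btf__free() */
#include <stdio.h>

static void check_sched_switch_args(void)
{
	struct btf *btf = btf__load_vmlinux_btf();

	/* Without CONFIG_DEBUG_INFO_BTF there is no vmlinux BTF; bail
	 * out instead of passing a NULL btf to btf__find_by_name_kind(). */
	if (!btf) {
		fprintf(stderr, "Missing vmlinux BTF, skipping argument check\n");
		return;
	}
	/* ... btf__find_by_name_kind(btf, "btf_trace_sched_switch",
	 *     BTF_KIND_TYPEDEF) etc. as in the snippet above ... */
	btf__free(btf);
}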
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://lore.kernel.org/r/20241223135813.8175-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In machine__create_module(), it reads /proc/modules to get a list of
modules in the system. The file shows the start address (of text) and
the size of the module so it uses the info to reconstruct system memory
maps for symbol resolution.
But module memory consists of multiple segments and they can be
scattered. Currently perf tools assume they are contiguous and end up
seeing some overlaps. This can confuse the tool when it finds a map
containing a given address.
As we mostly care about the function symbols in the text segment, it can
fix up the size or end address of modules when there's an overlap. We
can use maps__fixup_end(), which updates the end address using the start
address of the next map.
Ideally it should be able to track other segments (like data/rodata),
but that would require some changes in /proc/modules IMHO.
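A sketch of the end-address fixup on a simplified, sorted array of
maps; perf's maps__fixup_end() works on its own maps structure:

#include <stddef.h>

struct mod_map {
	unsigned long start;
	unsigned long end;
};

/* For maps sorted by start address, clamp each map's end to the start
 * of the next map so module text regions no longer overlap. */
static void fixup_end(struct mod_map *maps, size_t nr)
{
	for (size_t i = 0; i + 1 < nr; i++) {
		if (maps[i].end > maps[i + 1].start)
			maps[i].end = maps[i + 1].start;
	}
}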
Reported-by: Blake Jones <blakejones@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20241218220453.203069-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When there is more than one symbol at the same address, perf needs to
choose which one is better. choose_best_symbol() didn't check the type
of the symbols. It's possible to have labels at the same address as
other symbols and, in that case, it would be better to pick the actual
symbol over the label.
To minimize the possible impact on other symbols, I only check NOTYPE
symbols specifically.
$ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
105089: ffffffff82000000 814 FUNC GLOBAL DEFAULT 1 __do_softirq
111954: ffffffff82000000 0 NOTYPE GLOBAL DEFAULT 1 __softirqentry_text_start
Commit 77b004f4c5c3c90b tried to do the same by not giving a size to
the label symbols, but it seems there are some label-only symbols in asm
code. Let's restore the original code and choose the right symbol using
the type of the symbols.
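A sketch of the tie-break using ELF symbol types; the real
choose_best_symbol() works on perf's own symbol representation:

#include <elf.h>	/* STT_NOTYPE */

/* If exactly one of the two symbols at the same address is NOTYPE
 * (a label such as __softirqentry_text_start), prefer the typed one
 * (e.g. the FUNC __do_softirq): return <0 to pick a, >0 to pick b,
 * 0 to fall through to the other tie-break rules. */
static int prefer_typed_symbol(unsigned char type_a, unsigned char type_b)
{
	if ((type_a == STT_NOTYPE) != (type_b == STT_NOTYPE))
		return type_a == STT_NOTYPE ? 1 : -1;
	return 0;
}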
Fixes: 77b004f4c5c3c90b ("perf symbol: Do not fixup end address of labels")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/Z3b-DqBMnNb4ucEm@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
cxx_demangle_sym is weak in case demangle-cxx.c replaces the
definition in symbol-elf.c. When demangle-cxx.c is built,
HAVE_CXA_DEMANGLE_SUPPORT is defined, so that define can be used
to avoid a weak symbol.
As weak symbols are outside of the C standard, their use can lead to
strange behaviors, in particular with LTO, as well as causing issues to
be hidden at link time.
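A sketch of the #ifdef approach (the fallback body and signature are
assumed from context, not copied from the patch):

#include <stdbool.h>
#include <stddef.h>

/* symbol-elf.c: only provide a fallback definition when the real C++
 * demangler is not built, instead of relying on a weak symbol being
 * overridden at link time. */
#ifndef HAVE_CXA_DEMANGLE_SUPPORT
char *cxx_demangle_sym(const char *str, bool params, bool modifiers)
{
	(void)str; (void)params; (void)modifiers;
	return NULL;
}
#endif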
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241119031754.1021858-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is
no longer needed. With the removal of the flag, the related
GENERIC_SYSCALL_TABLE can also be removed.
libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT
was not defined, so libaudit is also no longer needed for any
architecture.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use the generic scripts to generate headers from the syscall table
instead of the custom ones for s390.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-15-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use the generic scripts to generate headers from the syscall table
instead of the custom ones for powerpc.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-14-7543b5293098@rivosinc.com
Link: https://lore.kernel.org/lkml/20250110100505.78d81450@canb.auug.org.au
[ Stephen Rothwell noticed on linux-next that the powerpc build for perf was broken and ...]
Link: https://lore.kernel.org/lkml/20250109-perf_powerpc_spu-v1-1-c097fc43737e@rivosinc.com
[ ... Charlie fixed it up and asked for it to be squashed to avoid breaking bisection. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use the generic scripts to generate headers from the syscall table for
mips.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-13-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
loongarch uses a syscall table; use that in perf instead of unistd.h.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-12-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
arm64 uses a syscall table; use that in perf instead of unistd.h.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-11-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|