summaryrefslogtreecommitdiff
path: root/tools/perf/util/machine.c
AgeCommit message (Collapse)Author
2019-08-29perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_onlineKyle Meyer
nr_cpus, the number of CPUs online during a record session bound by MAX_NR_CPUS, can be used as a dynamic alternative for MAX_NR_CPUS in __machine__synthesize_threads and machine__set_current_tid. Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Link: http://lore.kernel.org/lkml/20190827214352.94272-6-meyerk@stormcage.eag.rdlabs.hpecorp.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26perf tool: Rename perf_tool::bpf_event to bpfArnaldo Carvalho de Melo
No need for that _event suffix, do just like all the other meta event handlers and suppress that suffix. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lkml.kernel.org/n/tip-03spzxtqafbabbbmnm7y4xfx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26perf tools: Rename perf_event::ksymbol_event to perf_event::ksymbolArnaldo Carvalho de Melo
Just like all the other meta events, that extra _event suffix is just redundant, ditch it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lkml.kernel.org/n/tip-0q8b2xnfs17q0g523oej75s0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26libperf: Add PERF_RECORD_LOST_SAMPLES 'struct lost_samples_event' to ↵Jiri Olsa
perf/event.h Move the PERF_RECORD_LOST_SAMPLES event definition into libperf's event.h header include. In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used events to their generic '__u*' versions. Perf added 'u*' types mainly to ease up printing __u64 values as stated in the linux/types.h comment: /* * We define u64 as uint64_t for every architecture * so that we can print it with "%"PRIx64 without getting warnings. * * typedef __u64 u64; * typedef __s64 s64; */ Add and use new PRI_lu64 and PRI_lx64 macros for that. Use extra '_' to ease up the reading and differentiate them from standard PRI*64 macros. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190825181752.722-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26libperf: Add PERF_RECORD_LOST 'struct lost_event' to perf/event.hJiri Olsa
Move the lost_event event definition to libperf's event.h header include. In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used events to their generic '__u*' versions. Perf added 'u*' types mainly to ease up printing __u64 values as stated in the linux/types.h comment: /* * We define u64 as uint64_t for every architecture * so that we can print it with "%"PRIx64 without getting warnings. * * typedef __u64 u64; * typedef __s64 s64; */ Add and use new PRI_lu64 and PRI_lx64 macros for that. Use extra '_' to ease up the reading and differentiate them from standard PRI*64 macros. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190825181752.722-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26perf srcline: Add missing srcline.h header to files needing its defsArnaldo Carvalho de Melo
When srcline was introduced it wrongly added the include to util/sort.h, even with that header not needing the definitions it provides, fix it by adding it to the places that need it as a pre patch to remove srcline.h from sort.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-shuebppedtye8hrgxk15qe3x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26perf record: Move record_opts and other record decls out of perf.hArnaldo Carvalho de Melo
And into a separate util/record.h, to better isolate things and make sure that those who use record_opts and the other moved declarations are explicitly including the necessary header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-12Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To get closer to upstream and check if we need to sync more UAPI headers, pick up fixes for libbpf that prevent perf's container tests from completing successfuly, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-08perf record: Fix module size on s390Thomas Richter
On s390 the modules loaded in memory have the text segment located after the GOT and Relocation table. This can be seen with this output: [root@m35lp76 perf]# fgrep qeth /proc/modules qeth 151552 1 qeth_l2, Live 0x000003ff800b2000 ... [root@m35lp76 perf]# cat /sys/module/qeth/sections/.text 0x000003ff800b3990 [root@m35lp76 perf]# There is an offset of 0x1990 bytes. The size of the qeth module is 151552 bytes (0x25000 in hex). The location of the GOT/relocation table at the beginning of a module is unique to s390. commit 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map") adjusts the start address of a module in the map structures, but does not adjust the size of the modules. This leads to overlapping of module maps as this example shows: [root@m35lp76 perf] # ./perf report -D 0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000) @ 0]: x /lib/modules/.../qeth.ko.xz 0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000) @ 0]: x /lib/modules/.../ip6_tables.ko.xz The module qeth.ko has an adjusted start address modified to b3990, but its size is unchanged and the module ends at 0x3ff800d8990. This end address overlaps with the next modules start address of 0x3ff800d85a0. When the size of the leading GOT/Relocation table stored in the beginning of the text segment (0x1990 bytes) is subtracted from module qeth end address, there are no overlaps anymore: 0x3ff800d8990 - 0x1990 = 0x0x3ff800d7000 which is the same as 0x3ff800b2000 + 0x25000 = 0x0x3ff800d7000. To fix this issue, also adjust the modules size in function arch__fix_module_text_start(). Add another function parameter named size and reduce the size of the module when the text segment start address is changed. Output after: 0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670) @ 0]: x /lib/modules/.../qeth.ko.xz 0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60) @ 0]: x /lib/modules/.../ip6_tables.ko.xz Reported-by: Stefan Liebler <stli@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: stable@vger.kernel.org Fixes: 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map") Link: http://lkml.kernel.org/r/20190724122703.3996-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evselJiri Olsa
Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'. Committer notes: Fixed up these: tools/perf/arch/arm/util/auxtrace.c tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/arm-spe.c tools/perf/arch/s390/util/auxtrace.c tools/perf/util/cs-etm.c Also cc1: warnings being treated as errors tests/sample-parsing.c: In function 'do_test': tests/sample-parsing.c:162: error: missing initializer tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus') struct evsel evsel = { .needs_swap = false, - .core.attr = { - .sample_type = sample_type, - .read_format = read_format, + .core = { + . attr = { + .sample_type = sample_type, + .read_format = read_format, + }, [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1 gcc (GCC) 4.4.7 Also we don't need to include perf_event.h in tools/perf/lib/include/perf/evsel.h, forward declaring 'struct perf_event_attr' is enough. And this even fixes the build in some systems where things are used somewhere down the include path from perf_event.h without defining __always_inline. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf evsel: Rename struct perf_evsel to struct evselJiri Olsa
Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf tools: Rename struct thread_map to struct perf_thread_mapJiri Olsa
Rename struct thread_map to struct perf_thread_map, so it could be part of libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09perf tools: Use zfree() where applicableArnaldo Carvalho de Melo
In places where the equivalent was already being done, i.e.: free(a); a = NULL; And in placs where struct members are being freed so that if we have some erroneous reference to its struct, then accesses to freed members will result in segfaults, which we can detect faster than use after free to areas that may still have something seemingly valid. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09tools lib: Adopt zalloc()/zfree() from tools/perfArnaldo Carvalho de Melo
Eroding a bit more the tools/perf/util/util.h hodpodge header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-08Merge remote-tracking branch 'tip/perf/core' into perf/urgentArnaldo Carvalho de Melo
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-06perf thread: Allow references to thread objects after machine__exit()Arnaldo Carvalho de Melo
Threads are created when we either synthesize PERF_RECORD_FORK events for pre-existing threads or when we receive PERF_RECORD_FORK events from the kernel as new threads get created. We then keep them in machine->threads[].entries rb trees till when we receive a PERF_RECORD_EXIT, i.e. that thread terminated. The thread object has a reference count that is grabbed when, for instance, we keep that thread referenced in struct hist_entry, in 'perf report' and 'perf top'. When we receive a PERF_RECORD_EXIT we remove the thread object from the rb tree and move it to the corresponding machine->threads[].dead list, then we do a thread__put(), dropping the reference we had for keeping it in the rb tree. In thread__put() we were assuming that when the reference count hit zero we should remove it from the dead list by simply doing a list_del_init(&thread->node). That works well when all the thread lifetime is during the machine that has the list heads lifetime, since we know that we can do the list_del_init() and it will update the 'dead' list_head. But in 'perf sched lat' we were doing: machine__new() (via perf_session__new) process events, grabbing refcounts to keep those thread objects in 'perf sched' local data structures. machine__exit() (via perf_session__delete) which would delete the 'dead' list heads. And then doing the final thread__put() for the refcounts 'perf sched' rightfully obtained for keeping those thread object references. b00m, since thread__put() would do the list_del_init() touching a dead dead list head. Fix it by removing all the dead threads from machine->threads[].dead at machine__exit(), since whatever is there should have refcounts taken by things like 'perf sched lat', and make thread__put() check if the thread is in a linked list before removing it from that list. Reported-by: Wei Li <liwei391@huawei.com> Link: https://lkml.kernel.org/r/20190508143648.8153-1-liwei391@huawei.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhipeng Xie <xiezhipeng1@huawei.com> Link: https://lkml.kernel.org/r/20190704194355.GI10740@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25tools perf: Move from sane_ctype.h obtained from git to the Linux's originalArnaldo Carvalho de Melo
We got the sane_ctype.h headers from git and kept using it so far, but since that code originally came from the kernel sources to the git sources, perhaps its better to just use the one in the kernel, so that we can leverage tools/perf/check_headers.sh to be notified when our copy gets out of sync, i.e. when fixes or goodies are added to the code we've copied. This will help with things like tools/lib/string.c where we want to have more things in common with the kernel, such as strim(), skip_spaces(), etc so as to go on removing the things that we have in tools/perf/util/ and instead using the code in the kernel, indirectly and removing things like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements are made to the original code. Hopefully this also should help with reducing the difference of code hosted in tools/ to the one in the kernel proper. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25perf tools: Add missing util.h to pick up 'page_size' variableArnaldo Carvalho de Melo
Not to depend of getting it indirectly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-tirjsmvu4ektw0k7lm8k9lhu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28perf machine: Return NULL instead of null-terminating /proc/version arrayDonald Yandt
Return NULL instead of null-terminating version char array when fgets fails due to end-of-file or error. Signed-off-by: Donald Yandt <donald.yandt@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Fixes: 30ba5b0e66c8 ("perf machine: Null-terminate version char array upon fgets(/proc/version) error") Link: http://lkml.kernel.org/r/20190528134128.30841-1-donald.yandt@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28perf machine: Keep zero in pgoff BPF mapJiri Olsa
With pgoff set to zero, the map__map_ip function will return BPF addresses based from 0, which is what we need when we read the data from a BPF DSO. Adding BPF symbols with mapped IP addresses as well. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190508132010.14512-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28perf machine: Read also the end of the kernelJiri Olsa
We mark the end of kernel based on the first module, but that could cover some bpf program maps. Reading _etext symbol if it's present to get precise kernel map end. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20190508132010.14512-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-15perf machine: Null-terminate version char array upon fgets(/proc/version) errorDonald Yandt
If fgets() fails due to any other error besides end-of-file, the version char array may not even be null-terminated. Signed-off-by: Donald Yandt <donald.yandt@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Avi Kivity <avi@scylladb.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Fixes: a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance from host") Link: http://lkml.kernel.org/r/20190514110100.22019-1-donald.yandt@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28perf machine: Update kernel map address and re-order properlyWei Li
Since commit 1fb87b8e9599 ("perf machine: Don't search for active kernel start in __machine__create_kernel_maps"), the __machine__create_kernel_maps() just create a map what start and end are both zero. Though the address will be updated later, the order of map in the rbtree may be incorrect. The commit ee05d21791db ("perf machine: Set main kernel end address properly") fixed the logic in machine__create_kernel_maps(), but it's still wrong in function machine__process_kernel_mmap_event(). To reproduce this issue, we need an environment which the module address is before the kernel text segment. I tested it on an aarch64 machine with kernel 4.19.25: [root@localhost hulk]# grep _stext /proc/kallsyms ffff000008081000 T _stext [root@localhost hulk]# grep _etext /proc/kallsyms ffff000009780000 R _etext [root@localhost hulk]# tail /proc/modules hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000 nvme_core 126976 7 nvme, Live 0xffff0000018b6000 mdio 20480 1 ixgbe, Live 0xffff0000018ab000 hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000 hns_mdio 20480 2 - Live 0xffff000001822000 hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000 dm_mirror 40960 0 - Live 0xffff000001804000 dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000 dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000 dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000 [root@localhost hulk]# Before fix: [root@localhost bin]# perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost bin]# perf buildid-list -i perf.data 4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso] [root@localhost bin]# perf buildid-list -i perf.data -H 0000000000000000000000000000000000000000 /proc/kcore [root@localhost bin]# After fix: [root@localhost tools]# ./perf/perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost tools]# ./perf/perf buildid-list -i perf.data 28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms] 106c14ce6e4acea3453e484dc604d66666f08a2f [vdso] [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H 28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06perf tools: Add missing include for symbols.hArnaldo Carvalho de Melo
Several places were using definitions found in symbols.h but not including it, getting it by sheer luck from some other headers that now are in the process of removing that include because they don't need it or because simply having struct forward declarations is enough, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-25perf machine: Use cached rbtreesDavidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something required for nearly every operation dealing with machine->guests and threads->entries. The conversion is straightforward, however, it's worth noticing that the rb_erase_init() calls have been replaced by rb_erase_cached() which has no _init() flavor, however, the node is explicitly cleared next anyway, which was redundant until now. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20181206191819.30182-3-dave@stgolabs.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21perf tools: Handle PERF_RECORD_BPF_EVENTSong Liu
This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to turn it on. Committer notes: Add dummy machine__process_bpf_event() variant that returns zero for systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking the build in such systems. Remove the needless include <machine.h> from bpf->event.h, provide just forward declarations for the structs and unions in the parameters, to reduce compilation time and needless rebuilds when machine.h gets changed. Committer testing: When running with: # perf record --bpf-event On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL is not present, we fallback to removing those two bits from perf_event_attr, making the tool to continue to work on older kernels: perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off bpf_event ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off ksymbol ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ------------------------------------------------------------ And then proceeds to work without those two features. As passing --bpf-event is an explicit action performed by the user, perhaps we should emit a warning telling that the kernel has no such feature, but this can be done on top of this patch. Now with a kernel that supports these events, start the 'record --bpf-event -a' and then run 'perf trace sleep 10000' that will use the BPF augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus should generate PERF_RECORD_BPF_EVENT events: [root@quaco ~]# perf record -e dummy -a --bpf-event ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.713 MB perf.data ] [root@quaco ~]# bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 16: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 17: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 18: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 21: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 22: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 31: tracepoint name sys_enter tag 12504ba9402f952f gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 512B jited 374B memlock 4096B map_ids 30,29,28 32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 256B jited 191B memlock 4096B map_ids 30,29 # perf report -D | grep PERF_RECORD_BPF_EVENT | nl 1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13 2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14 3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15 4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16 5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17 6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18 7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21 8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22 9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29 10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29 11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30 12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30 13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31 14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32 # There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21perf tools: Handle PERF_RECORD_KSYMBOLSong Liu
This patch handles PERF_RECORD_KSYMBOL in perf record/report. Specifically, map and symbol are created for ksymbol register, and removed for ksymbol unregister. This patch also sets perf_event_attr.ksymbol properly. The flag is ON by default. Committer notes: Use proper inttypes.h for u64, fixing the build in some environments like in the android NDK r15c targetting ARM 32-bit. I.e. fixing this build error: util/event.c: In function 'perf_event__fprintf_ksymbol': util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] event->ksymbol_event.flags, event->ksymbol_event.name); ^ cc1: all warnings being treated as errors Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-6-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04perf report: Fix wrong iteration count in --branch-historyJin Yao
By calculating the removed loops, we can get the iteration count. But the iteration count could be reported incorrectly, reporting impossibly high counts. That's because previous code uses the number of removed LBR entries for the iteration count. That's not good. Fix this by increasing the iteration count when a loop is detected. When matching the chain, the iteration count would be added up, finally we need to compute the average value when printing out. For example, $ perf report --branch-history --stdio --no-children Before: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:2968 avg_cycles:3) | f1 +0 | main +22 (cycles:1 iter:2968 avg_cycles:3) | main +17 | main +38 (cycles:1 iter:2968 avg_cycles:3) 2968 is an impossible high iteration count and avg_cycles is too small. After: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:1 avg_cycles:23) | f1 +0 | main +22 (cycles:1 iter:1 avg_cycles:23) | main +17 | main +38 (cycles:1 iter:1 avg_cycles:23) avg_cycles:23 is the average cycles of this iteration. Fixes: c4ee06251d42 ("perf report: Calculate the average cycles of iterations") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17perf tools: Allow specifying proc-map-timeout in config fileMark Drayton
The default timeout of 500ms for parsing /proc/<pid>/maps files is too short for profiling many of our services. This can be overridden by passing --proc-map-timeout to the relevant command but it'd be nice to globally increase our default value. This patch permits setting a different default with the core.proc-map-timeout config file parameter. Signed-off-by: Mark Drayton <mbd@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17perf tools: Fix diverse comment typosIngo Molnar
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry-picking and/or backporting, split this into multiple patches. Just typos in comments, no need to backport, reducing the possibility of possible backporting artifacts. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17perf thread: Add fallback functions for cases where cpumode is insufficientAdrian Hunter
For branch stacks or branch samples, the sample cpumode might not be correct because it applies only to the sample 'ip' and not necessary to 'addr' or branch stack addresses. Add fallback functions that can be used to deal with those cases Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org # 4.19 Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31perf tools: Don't clone maps from parent when synthesizing forksDavid Miller
When synthesizing FORK events, we are trying to create thread objects for the already running tasks on the machine. Normally, for a kernel FORK event, we want to clone the parent's maps because that is what the kernel just did. But when synthesizing, this should not be done. If we do, we end up with overlapping maps as we process the sythesized MMAP2 events that get delivered shortly thereafter. Use the FORK event misc flags in an internal way to signal this situation, so we can elide the map clone when appropriate. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Don Zickus <dzickus@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Mario <jmario@redhat.com> Link: http://lkml.kernel.org/r/20181030.222404.2085088822877051075.davem@davemloft.net [ Added comment about flag use in machine__process_fork_event(), use ternary op in thread__clone_map_groups() as suggested by Jiri ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}David S. Miller
When processing using 'perf report -g caller', which is the default, we ended up reverting the callchain entries received from the kernel, but simply reverting throws away the information that tells that from a point onwards the addresses are for userspace, kernel, guest kernel, guest user, hypervisor. The idea is that if we are walking backwards, for each cluster of non-cpumode entries we have to first scan backwards for the next one and use that for the cluster. This seems silly and more expensive than it needs to be but it is enough for a initial fix. The code here is really complicated because it is intimately intertwined with the lbr and branch handling, as well as this callchain order, further fixes will be needed to properly take into account the cpumode in those cases. Another problem with ORDER_CALLER is that the NULL "0" IP that is at the end of most callchains shows up at the top of the histogram because every callchain contains it and with ORDER_CALLER it is the first entry. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Souvik Banerjee <souvik1997@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org # 4.19 Link: https://lkml.kernel.org/n/tip-2wt3ayp6j2y2f2xowixa8y6y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-05perf record: Use unmapped IP for inline callchain cursorsMilian Wolff
Only use the mapped IP to find inline frames, but keep using the unmapped IP for the callchain cursor. This ensures we properly show the unmapped IP when displaying a frame we received via the dso__parse_addr_inlines API for a module which does not contain sufficient debug symbols to show the srcline. This is another follow-up to commit 19610184693c ("perf script: Show virtual addresses instead of offsets"). Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Fixes: 19610184693c ("perf script: Show virtual addresses instead of offsets") Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wolff@kdab.com Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wolff@kdab.com [ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-27perf report: Don't try to map ip to invalid mapMilian Wolff
Fixes a crash when the report encounters an address that could not be associated with an mmaped region: #0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329 #1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329 #2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586 #3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703 #4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725 #5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351 #6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378 #7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750, max_stack=<optimized out>) at util/callchain.c:1085 Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding") Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-20perf tools: Store compression id into struct dsoJiri Olsa
Add comp to 'struct dso' to hold the compression index. It will be used in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf machine: Use last_match threads cache only in single thread modeJiri Olsa
There's an issue with using threads::last_match in multithread mode which is enabled during the perf top synthesize. It might crash with following assertion: perf: ...include/linux/refcount.h:109: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. The gdb backtrace looks like this: 0x00007ffff50839fb in raise () from /lib64/libc.so.6 (gdb) #0 0x00007ffff50839fb in raise () from /lib64/libc.so.6 #1 0x00007ffff5085800 in abort () from /lib64/libc.so.6 #2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70) at ...include/linux/refcount.h:109 #5 0x0000000000536771 in thread__get (thread=0x7fffe8009a40) at util/thread.c:115 #6 0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38, threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432 #7 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38, pid=2, tid=2) at util/machine.c:489 #8 0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38, pid=2, tid=2) at util/machine.c:499 #9 0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38, ... The failing assertion is this one: REFCOUNT_WARN(!refcount_inc_not_zero(r), ... the problem is that we don't serialize access to threads::last_match. We serialize the access to the threads tree, but we don't care how's threads::last_match being accessed. Both locked/unlocked paths use that data and can set it. In multithreaded mode we can end up with invalid object in thread__get call, like in following paths race: thread 1 ... machine__findnew_thread down_write(&threads->lock); __machine__findnew_thread ____machine__findnew_thread th = threads->last_match; if (th->tid == tid) { thread__get thread 2 ... machine__find_thread down_read(&threads->lock); __machine__findnew_thread ____machine__findnew_thread th = threads->last_match; if (th->tid == tid) { thread__get thread 3 ... machine__process_fork_event machine__remove_thread __machine__remove_thread threads->last_match = NULL thread__put thread__put Thread 1 and 2 might got stale last_match, before thread 3 clears it. Thread 1 and 2 then race with thread 3's thread__put and they might trigger the refcnt == 0 assertion above. The patch is disabling the last_match cache for multiple thread mode. It was originally meant for single thread scenarios, where it's common to have multiple sequential searches of the same thread. In multithread mode this does not make sense, because top's threads processes different /proc entries and so the 'struct threads' object is queried for various threads. Moreover we'd need to add more locks to make it work. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf machine: Add threads__set_last_match functionJiri Olsa
Separating threads::last_match cache set into separate threads__set_last_match function. This will be useful in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf machine: Add threads__get_last_match functionJiri Olsa
Separating threads::last_match cache read/check into separate threads__get_last_match function. This will be useful in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24perf script: Show correct offsets for DWARF-based unwindingSandipan Das
When perf/data is recorded with the dwarf call-graph option, the callchain shown by 'perf script' still shows the binary offsets of the userspace symbols instead of their virtual addresses. Since the symbol offset calculation is based on using virtual address as the ip, we see incorrect offsets as well. The use of virtual addresses affects the ability to find out the line number in the corresponding source file to which an address maps to as described in commit 67540759151a ("perf unwind: Use addr_location::addr instead of ip for entries"). This has also been addressed by temporarily converting the virtual address to the correponding binary offset so that it can be mapped to the source line number correctly. This is a follow-up for commit 19610184693c ("perf script: Show virtual addresses instead of offsets"). This can be verified on a powerpc64le system running Fedora 27 as shown below: # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton # perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1 Before: # perf report --stdio --no-children -s sym,srcline -g address # Samples: 1 of event 'probe_libc:inet_pton' # Event count (approx.): 1 # # Overhead Symbol Source:Line # ........ .................... ........... # 100.00% [.] __GI___inet_pton inet_pton.c | ---gaih_inet getaddrinfo.c:537 (inlined) __GI_getaddrinfo getaddrinfo.c:2304 (inlined) main ping.c:519 generic_start_main libc-start.c:308 (inlined) __libc_start_main libc-start.c:102 ... # perf script -F comm,ip,sym,symoff,srcline,dso ping 15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so) libc-2.26.so[ffff80004ca0af28] 10fa53 gaih_inet+0xffff000099160f43 libc-2.26.so[ffff80004c9bfa53] (inlined) 1105b3 __GI_getaddrinfo+0xffff000099160163 libc-2.26.so[ffff80004c9c05b3] (inlined) 2d6f main+0xfffffffd9f1003df (/usr/bin/ping) ping[fffffffecf882d6f] 2369f generic_start_main+0xffff00009916013f libc-2.26.so[ffff80004c8d369f] (inlined) 23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so) libc-2.26.so[ffff80004c8d3897] After: # perf report --stdio --no-children -s sym,srcline -g address # Samples: 1 of event 'probe_libc:inet_pton' # Event count (approx.): 1 # # Overhead Symbol Source:Line # ........ .................... ........... # 100.00% [.] __GI___inet_pton inet_pton.c | ---gaih_inet.constprop.7 getaddrinfo.c:537 getaddrinfo getaddrinfo.c:2304 main ping.c:519 generic_start_main.isra.0 libc-start.c:308 __libc_start_main libc-start.c:102 ... # perf script -F comm,ip,sym,symoff,srcline,dso ping 7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) inet_pton.c:68 7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so) getaddrinfo.c:537 7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so) getaddrinfo.c:2304 130782d6f main+0x3df (/usr/bin/ping) ping.c:519 7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so) libc-start.c:308 7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so) libc-start.c:102 Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: 67540759151a ("perf unwind: Use addr_location::addr instead of ip for entries") Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Synthesize and process mmap events for x86 PTI entry trampolinesAdrian Hunter
Like the kernel text, the location of x86 PTI entry trampolines must be recorded in the perf.data file. Like the kernel, synthesize a mmap event for that, and add processing for it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Create maps for x86 PTI entry trampolinesAdrian Hunter
Create maps for x86 PTI entry trampolines, based on symbols found in kallsyms. It is also necessary to keep track of whether the trampolines have been mapped particularly when the kernel dso is kcore. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com [ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Allow for extra kernel mapsAdrian Hunter
Identify extra kernel maps by name so that they can be distinguished from the kernel map and module maps. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Workaround missing maps for x86 PTI entry trampolinesAdrian Hunter
On x86_64 the PTI entry trampolines are not in the kernel map created by perf tools. That results in the addresses having no symbols and prevents annotation. It also causes Intel PT to have decoding errors at the trampoline addresses. Workaround that by creating maps for the trampolines. At present the kernel does not export information revealing where the trampolines are. Until that happens, the addresses are hardcoded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Add nr_cpus_avail()Adrian Hunter
Add a function to return the number of the machine's available CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf tools: Fix kernel_start for PTI on x86Adrian Hunter
Opickn x86_64, PTI entry trampolines are less than the start of kernel text, but still above 2^63. So leave kernel_start = 1ULL << 63 for x86_64. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526548928-20790-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf machine: Add machine__is() to identify machine archAdrian Hunter
Add a function to identify the machine architecture. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526548928-20790-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf script: Show virtual addresses instead of offsetsSandipan Das
When perf data is recorded with the call-graph option enabled, the callchain shown by perf script shows the binary offsets of the symbols as the ip. This is incorrect for kernel symbols as the ip values are always off by a fixed offset depending on the architecture. If the offsets from the start of the symbols are printed, they are also incorrect for both kernel and userspace symbols. Without the call-graph option, the callchain shows the virtual addresses of the symbols rather than their binary offsets. The offsets printed in this case are also correct. This fixes the inconsistency in perf script's output. This can be verified on a powerpc64le system running Fedora 27 as follows: # cat /proc/kallsyms | grep sys_write ... c0000000004025a0 T sys_write c0000000004025a0 T __se_sys_write ... # perf probe -a sys_write Before applying this patch: # perf record -e probe:sys_write -g ~/test # perf script -F ip,sym,symoff 4125b0 sys_write+0x8000000000008010 1b9e0 system_call+0x8000000000008058 118234 __GI___libc_write+0xffff0000f52c0024 92c74 _IO_file_write@@GLIBC_2.17+0xffff0000f52c0044 5afbfd8a [unknown] 91a60 new_do_write+0xffff0000f52c0090 94638 _IO_do_write@@GLIBC_2.17+0xffff0000f52c0038 94bbc _IO_file_overflow@@GLIBC_2.17+0xffff0000f52c014c 95a24 __overflow+0xffff0000f52c0064 84548 _IO_puts+0xffff0000f52c0218 440 main+0xffffffffe0000020 236a0 generic_start_main.isra.0+0xffff0000f52c0140 23898 __libc_start_main+0xffff0000f52c00b8 0 [unknown] ... # perf record -e probe:sys_write ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 ... After applying this patch: # perf record -e probe:sys_write -g ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 c00000000000b9e0 system_call+0x58 7fffb70d8234 __GI___libc_write+0x24 7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44 5afc1818 [unknown] 7fffb7051a60 new_do_write+0x90 7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38 7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c 7fffb7055a24 __overflow+0x64 7fffb7044548 _IO_puts+0x218 10000440 main+0x20 7fffb6fe36a0 generic_start_main.isra.0+0x140 7fffb6fe3898 __libc_start_main+0xb8 0 [unknown] ... # perf record -e probe:sys_write ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 ... Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20180517063326.6319-1-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-30perf machine: Ditch find_kernel_function variantsArnaldo Carvalho de Melo
Since we do not have split symtabs anymore, no need to have explicit find_kernel_function variants, use the find_kernel_symbol ones. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hiw2ryflju000f6wl62128it@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27perf symbols: Unify symbol mapsArnaldo Carvalho de Melo
Remove the split of symbol tables for data (MAP__VARIABLE) and for functions (MAP__FUNCTION), its unneeded and there were various places doing two lookups to find a symbol, so simplify this. We still will consider only the symbols that matched the filters in place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in the patch, just so that we consider only the same symbols as before, to reduce the possibility of regressions. All the tests on 50-something build environments, in varios versions of lots of distros and cross build environments were performed without build regressions, as usual with all pull requests the other tests were also performed: 'perf test' and 'make -C tools/perf build-test'. Also this was done at a great granularity so that regressions can be bisected more easily. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>