summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2017-07-25perf tools: Add EXCLUDE_EXTLIBS and EXTRA_PERFLIBS to makefileDavid Carrillo-Cisneros
The goal is to allow users to override linking of libraries that were automatically added to PERFLIBS. EXCLUDE_EXTLIBS contains linker flags to be removed from LIBS while EXTRA_PERFLIBS contains linker flags to be added. My use case is to force certain library to be build statically, e.g. for libelf: EXCLUDE_EXTLIBS=-lelf EXTRA_PERFLIBS=path/libelf.a Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Kees Kook <keescook@chromium.org> Cc: Paul Turner <pjt@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170719011839.99399-3-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25perf cgroup: Fix refcount usageArnaldo Carvalho de Melo
When converting from atomic_t to refcount_t we didn't follow the usual step of initializing it to one before taking any new reference, which trips over checking if taking a reference for a freed refcount_t, fix it. Brendan's report: --- It's 4.12-rc7, with node v4.4.1. I'm building 4.13-rc1 now, as I hit what I think is another unrelated perf bug and I'm starting to wonder what else is broken on that version: (root) /mnt/src/linux-4.12-rc7/tools/perf # ./perf record -F 99 -a -e cpu-clock --cgroup=docker/f9e9d5df065b14646e8a11edc837a13877fd90c171137b2ba3feb67a0201cb65 -g perf: /mnt/src/linux-4.12-rc7/tools/include/linux/refcount.h:108: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. Aborted that used to work... --- Testing it: Before: # perf stat -e cycles -C 0 --cgroup / perf: /home/acme/git/linux/tools/include/linux/refcount.h:108: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. Aborted (core dumped) # After: # perf stat -e cycles -C 0 --cgroup / ^C Performance counter stats for 'CPU(s) 0': 132,081,393 cycles / 2.492942763 seconds time elapsed # Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com> Acked-by: Elena Reshetova <elena.reshetova@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Kees Kook <keescook@chromium.org> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sudeep Holla <Sudeep.Holla@arm.com> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 79c5fe6db8c7 ("perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t") Link: http://lkml.kernel.org/n/tip-l7ovfblq14ip2i08m1g0fkhv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25perf report: Fix kernel symbol adjustment for s390xThomas Richter
On s390x the kernel text segment starts at address 0x0. When perf report reads kernel symbols from vmlinux file it adds an offset of 0x1000. For example see symbol set_reset_devices: [root@s8360047 linux-devel]# nm -A vmlinux| fgrep set_reset_devices vmlinux:0000000001379000 t set_reset_devices [root@s8360047 linux-devel]# [root@s8360047 linux-devel]# fgrep set_reset_devices /proc/kallsyms 0000000001379000 t set_reset_devices [root@s8360047 linux-devel]# The kernel symbol table and the vmlinux file have the same address for symbol set_reset_devices namely 1379000. When perf report reads this symbols it displays it with address symbol__new: set_reset_devices 0x137a000-0x137a018 There is a difference between perf report and vmlinux of 0x1000. The reason for the difference is at kernel symbol load time in function dso__load_sym(). The vmlinux file is investigated with its ELF header. Command readelf shows this: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .text PROGBITS 0000000000000000 00001000 0000000000b0e0c2 0000000000000000 AX 0 0 128 This leads to an invalid calculation of the symbol start address, see file utit/symbol-elf.c line 974: /* Adjust symbol to map to file offset */ if (adjust_kernel_syms) sym.st_value -= shdr.sh_addr - shdr.sh_offset; With shdr.sh_addr set to 0x0 and shdr.sh_offset set to 0x1000 as read from the ELF .text section 0x1000 is added to the symbol address. I would like to fix this by introducing an archticture specific function named elf__needs_adjust_symbols(). This is the same approach as done by PowerPC. The function currently does not exist for s390x and the default weak one is used. The s390x specific one returns false when symsrc_init() is invoked for kernel symbols and results in variable adjust_kernel_syms being false. This omits the adjustment and the correct address is displayed (when symbol resolvement does not work). The s390x specific function returns false for kernel symbol adjustment and returns true for kernel modules, processes and shared libraries. Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> LPU-Reference: 20170713130252.6167-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25perf annotate stdio: Fix --show-total-periodTaeung Song
We were showing the total number of samples, not the total period as asked by the user, fix it. Reported-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin Liška <mliska@suse.cz> Cc: Milian Wolff <milian.wolff@kdab.com> Link: http://lkml.kernel.org/n/tip-lh2nh89rtqn5x5vbfthw6qml@git.kernel.org Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two hw-enablement patches, two race fixes, three fixes for regressions of semantics, plus a number of tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Add proper condition to run sched_task callbacks perf/core: Fix locking for children siblings group read perf/core: Fix scheduling regression of pinned groups perf/x86/intel: Fix debug_store reset field for freq events perf/x86/intel: Add Goldmont Plus CPU PMU support perf/x86/intel: Enable C-state residency events for Apollo Lake perf symbols: Accept zero as the kernel base address Revert "perf/core: Drop kernel samples even though :u is specified" perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target perf evsel: State in the default event name if attr.exclude_kernel is set perf evsel: Fix attr.exclude_kernel setting for default cycles:p
2017-07-21perf annotate: Do not overwrite sample->periodTaeung Song
In fixing the --show-total-period option it was noticed that the value of sample->period was being overwritten, fix it. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: fd36f3dd7933 ("perf hist: Pass struct sample to __hists__add_entry()") Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com [ split from a larger patch, added the Fixes tag ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf annotate: Store the sample period in each histogram bucketTaeung Song
We'll use it soon, when fixing --show-total-period. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com [ split from a larger patch, do the math in __symbol__inc_addr_samples() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf hists: Pass perf_sample to __symbol__inc_addr_samples()Taeung Song
To pave the way to use perf_sample fields in the annotate code, storing sample->period in sym_hist->addr->period and its sum in sym_hist->period. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com [ split and adjusted from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf annotate: Rename 'sum' to 'nr_samples' in struct sym_histTaeung Song
To make it more clear that it is the sum of all the nr_samples fields in the addr[] entries, i.e.: sym_hist->nr_samples = sum(sym_hist->addr[0 .. symbol__size(sym)]->nr_samples) Committer notes: Taeung had renamed it to total_samples, but using nr_samples, as in the added explanation above, looks clearer and establishes the direct connection, making clear it is about the _number_ of samples. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500211-16599-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21perf annotate: Introduce struct sym_hist_entryTaeung Song
struct sym_hist has addr[] but it should have not only number of samples but also the sample period. So use new struct symhist_entry to pave the way to have that. Committer notes: This initial patch will only introduce the struct sym_hist_entry and use only the nr_samples member, which makes the code clearer and paves the way to save the period as well. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500205-16553-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20tools include: Adopt strstarts() from the kernelArnaldo Carvalho de Melo
Replacing prefixcmp(), same purpose, inverted result, so standardize on the kernel variant, to reduce silly differences among tools/ and the kernel sources, making it easier for people to work in both codebases. And then doing: if (strstarts(option, "no-")) Looks clearer than doing: if (!prefixcmp(option, "no-")) To figure out if option starts witn "no-". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kaei42gi7lpa8subwtv7eug8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Filter out 'sshd' in the tracer ancestry in syswide tracingArnaldo Carvalho de Melo
Avoiding a loop, so now its quite convenient to ssh to a machine and then simply do: # perf trace To trace all syscalls without causing a loop. This was possible using --filter-pids, i.e. once you noticed the loop, get the sshd pid and add it to --filter-pids, restarting the 'perf trace'. Now to figure out how to do that in a X terminal, the other common scenario, which is way more involved, as there are multiple processes communicating to process terminal activity... Using --filter-pids + '-e \!syscall,names,you,dont,need' may be a good approximation when having to do syswide tracing on your workstation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-68rjeao9wnpylla41htk7xps@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Introduce filter_loop_pids()Arnaldo Carvalho de Melo
No change in functionality, just to make clearer that what we want when filtering the tracer pid in a system wide tracing session is to avoid a feedback loop. This also paves the way for a more interesting loop avoidance algorithm, one that tries to figure out if we are in a ssh session, xterm, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5fcttc5kdjkcyp9404ezkuy9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace beauty clone: Suppress unused args according to 'flags' argArnaldo Carvalho de Melo
The 'parent_tidptr', 'child_tidptr' and 'tls' arguments to the 'clone' syscall are only used when certain flags are set in 'flags', suppress them when those aren't there. E.g: 9886.919 (0.236 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19608 (fetchmail) 12876.052 (0.249 ms): qemu-system-x8/21238 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f48117fc770, parent_tidptr: 0x7f48117ff9d0, child_tidptr: 0x7f48117ff9d0, tls: 0x7f48117ff700) = 19611 (qemu-system-x86) 12876.555 (0.048 ms): worker/19611 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f480f7f8770, parent_tidptr: 0x7f480f7fb9d0, child_tidptr: 0x7f480f7fb9d0, tls: 0x7f480f7fb700) = 19612 (worker) 16575.240 (0.469 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19613 (fetchmail) 20797.270 (0.335 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19614 (fetchmail) 21228.585 (0.501 ms): vim/19519 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fbad6ac27d0) = 19615 (vim) 21232.193 (0.137 ms): bash/19615 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fad8bff49d0) = 19616 (bash) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-0um93djul9knf239gwa5mpcb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace beauty clone: Beautify syscall argumentsArnaldo Carvalho de Melo
Now, syswide tracing, selected entries: # trace -e clone 24417.203 ( 0.158 ms): bash/11323 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11325 (bash) ? ( ? ): bash/11325 ... [continued]: clone()) = 0 24419.355 ( 0.093 ms): bash/10586 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11326 (bash) ? ( ? ): bash/11326 ... [continued]: clone()) = 0 24419.744 ( 0.102 ms): bash/11326 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11327 (bash) ? ( ? ): bash/11327 ... [continued]: clone()) = 0 24420.138 ( 0.105 ms): bash/11327 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11328 (bash) ? ( ? ): bash/11328 ... [continued]: clone()) = 0 35747.722 ( 0.044 ms): gpg-agent/18087 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ff0755f6ff0, parent_tidptr: 0x7ff0755f79d0, child_tidptr: 0x7ff0755f79d0, tls: 0x7ff0755f7700) = 11329 (gpg-agent) ? ( ? ): gpg-agent/11329 ... [continued]: clone()) = 0 35748.359 ( 0.022 ms): gpg-agent/18087 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ff075df7ff0, parent_tidptr: 0x7ff075df89d0, child_tidptr: 0x7ff075df89d0, tls: 0x7ff075df8700) = 11330 (gpg-agent) ? ( ? ): gpg-agent/11330 ... [continued]: clone()) = 0 35781.422 ( 0.452 ms): NetworkManager/1112 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f2f1fffedb0, parent_tidptr: 0x7f2f1ffff9d0, child_tidptr: 0x7f2f1ffff9d0, tls: 0x7f2f1ffff700) = 11331 (NetworkManager) ? ( ? ): NetworkManager/11331 ... [continued]: clone()) = 0 Need to improve the formatting of the second return, to the child, this cset only focused on the argument formatting. If we trace just one pid: # trace -e clone -p 19863 0.349 ( 0.025 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb84eaac70, parent_tidptr: 0x7ffb84eab9d0, child_tidptr: 0x7ffb84eab9d0, tls: 0x7ffb84eab700) = 11637 (Chrome_IOThread) 0.392 ( 0.013 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb664b8c70, parent_tidptr: 0x7ffb664b99d0, child_tidptr: 0x7ffb664b99d0, tls: 0x7ffb664b9700) = 11638 (Chrome_IOThread) 0.573 ( 0.015 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb6046cc70, parent_tidptr: 0x7ffb6046d9d0, child_tidptr: 0x7ffb6046d9d0, tls: 0x7ffb6046d700) = 11639 (Chrome_IOThread) 0.617 ( 0.014 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb730dcc70, parent_tidptr: 0x7ffb730dd9d0, child_tidptr: 0x7ffb730dd9d0, tls: 0x7ffb730dd700) = 11640 (Chrome_IOThread) 4.350 ( 0.065 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb720d9c70, parent_tidptr: 0x7ffb720da9d0, child_tidptr: 0x7ffb720da9d0, tls: 0x7ffb720da700) = 11642 (Chrome_IOThread) 5.642 ( 0.079 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb718d8c70, parent_tidptr: 0x7ffb718d99d0, child_tidptr: 0x7ffb718d99d0, tls: 0x7ffb718d9700) = 11643 (Chrome_IOThread) ^C# We'll also have to fix the argument ordering in different arches, probably having multiple syscall_fmt entries with each possible order and then use perf_evsel__env_arch() (if dealing with a perf.data file) or the current system info, for live sessions. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-am068uyubgj83snepolwhbfe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20tools include uapi: Grab a copy of linux/sched.hArnaldo Carvalho de Melo
So that we make sure we have recent enough defines for things such as 'perf trace' system call argument beautifiers. For instance, the 'clone' syscall argument 'flag' needs to use CLONE_NEWCGROUP, and that is not available in RHEL7. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-81sln0ng4a2lcxrth14vcov4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Allow specifying names to syscall arguments formattersArnaldo Carvalho de Melo
For tracepointless syscalls, like clone, otherwise get them from the tracepoint's /format file. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ml5qvv1w5k96ghwhxpzzsmm3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Allow specifying number of syscall args for tracepointless syscallsArnaldo Carvalho de Melo
When we don't have syscalls:sys_{enter,exit}_NAME, we had to resort to dumping all the 6 syscall arguments, fix it by providing that info for such syscalls, like 'clone'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-dfq1jtrxj8dqvqoeqqpr3slu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Ditch __syscall__arg_val() variant, not needed anymoreArnaldo Carvalho de Melo
All callers now can use syscall__arg_val(arg, idx), be it to iterate thru the syscall arguments while taking into account alignment, or to get values for other arguments that affect how the current argument should be formatted (think of fcntl's 'cmd' and 'arg' arguments). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wm5b156d8kro1r4y3b33eyta@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Use the syscall_fmt formatters without a tracepointArnaldo Carvalho de Melo
Previously we only used the syscall_fmt when we had sc->tp_format set, i.e. when we found the (enter, exit) pair in tracefs/events/syscalls/. But we really only need to use what is in sc->arg_fmt to apply the arg beautifiers to the syscall argument values, so do it. With this we will be able to provide formatters to the "clone" syscall, which doesn't have entries in tracefs/events/syscalls/. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-y41nl41jrayjo5ucnde2peix@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Allow allocating sc->arg_fmt even without the syscall tracepointArnaldo Carvalho de Melo
At least "clone" doesn't have (enter, exit) entries tracefs/events/syscalls/, but we can provide a syscall_fmt and use it instead, as will be done for "clone" in the next cset. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-o12kejgcxddyovn2hlg4gbim@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace beauty mmap: Ignore 'fd' and 'offset' args for MAP_ANONYMOUSArnaldo Carvalho de Melo
Just suppress them, not used by the kernel. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-atpt07y2x9a8ttlwja94ow3j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf trace: Add missing ' = ' in the default formatting of syscall returnsArnaldo Carvalho de Melo
We lost it recently, put it back. Before: 789.499 ( 0.001 ms): libvirtd/1175 lseek(fd: 22, whence: CUR) 4328 After: 789.499 ( 0.001 ms): libvirtd/1175 lseek(fd: 22, whence: CUR) = 4328 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 1f63139c3f8a ("perf trace beauty: Simplify syscall return formatting") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf intel-pt: Always set no branch for dummy eventKan Liang
An earlier kernel patch allowed enabling PT and LBR at the same time on Goldmont. commit ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it") However, users still cannot use Intel PT and LBRs simultaneously. $ sudo perf record -e cycles,intel_pt//u -b -- sleep 1 Error: PMU Hardware doesn't support sampling/overflow-interrupts. PT implicitly adds dummy event in perf tool. dummy event is software event which doesn't support LBR. Always setting no branch for dummy event in Intel PT. Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170630141656.1626-2-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-20perf intel-pt: Set no_aux_samples for the tracking eventKan Liang
The reason of introducing the tracking event (a dummy software event) is to collect side-band information. Additional sampling is wasteful. no_aux_samples should be set for tracking event. Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170630141656.1626-1-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf report: Show branch type in callchain entryJin Yao
Show branch type in callchain entry. The branch type is printed with other LBR information (such as cycles/abort/...). For example: perf record -g -j any,save_type perf report --branch-history --stdio --no-children 38.50% div.c:45 [.] main div | ---main div.c:42 (RET CROSS_2M cycles:2) compute_flag div.c:28 (cycles:2) compute_flag div.c:27 (RET CROSS_2M cycles:1) rand rand.c:28 (cycles:1) rand rand.c:28 (RET CROSS_2M cycles:1) __random random.c:298 (cycles:1) __random random.c:297 (COND_BWD CROSS_2M cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (COND_BWD CROSS_2M cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (RET CROSS_2M cycles:9) Change log v6: Remove the branch_type_str() since it's moved to branch.c. v5: Rewrite the branch info print code in util/callchain.c. v4: Comparing to previous version, the major changes are: Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-8-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf report: Show branch type statistics for stdio modeJin Yao
Show the branch type statistics at the end of perf report --stdio. For example: perf report --stdio COND_FWD: 28.5% COND_BWD: 9.4% CROSS_4K: 0.7% CROSS_2M: 14.1% COND: 37.9% UNCOND: 0.2% IND: 6.7% CALL: 26.5% RET: 28.7% SYSRET: 0.0% The branch types are: COND_FWD: conditional forward COND_BWD: conditional backward COND: conditional branch UNCOND: unconditional branch IND: indirect CALL: function call IND_CALL: indirect function call RET: function return SYSCALL: syscall SYSRET: syscall return COND_CALL: conditional function call COND_RET: conditional function return CROSS_4K and CROSS_2M: They are the metrics checking for branches cross 4K or 2MB pages. It's an approximate computing. We don't know if the area is 4K or 2MB, so always compute both. To make the output simple, if a branch crosses 2M area, CROSS_4K will not be incremented. Change log v7: Since the common branch type definitions are changed, some tags/strings are updated accordingly. v6: Remove branch_type_stat_display() since it's moved to branch.c. v5: Remove the unnecessary sort__mode checking in hist_iter__branch_callback(). v4: Comparing to previous version, the major changes are: Add the computing of JCC forward/JCC backward and cross page checking by using the from and to addresses. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-7-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf util: Create branch.c/.h for common branch functionsJin Yao
Create new util/branch.c and util/branch.h to contain the common branch functions. Such as: branch_type_count(): Count the numbers of branch types branch_type_name() : Return the name of branch type branch_type_stat_display(): Display branch type statistics info branch_type_str(): Construct the branch type string. The branch type is saved in branch_flags. Change log: v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN. v7: Since the common branch type name is changed (e.g. JCC->COND), this patch is performed the modification accordingly. v6: Move that multiline conditional code inside {} brackets. Move branch_type_stat_display() from builtin-report.c to branch.c. Move branch_type_str() from callchain.c to branch.c. v5: It's a new patch in v5 patch series. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com [ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf report: Refactor the branch info printing codeJin Yao
The branch info such as predicted/cycles/... are printed at the callchain entries. For example: perf report --branch-history --no-children --stdio --1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17) main div.c:44 (predicted:52.4% cycles:1) main div.c:42 (cycles:2) compute_flag div.c:28 (cycles:2) compute_flag div.c:27 (cycles:1) rand rand.c:28 (cycles:1) rand rand.c:28 (cycles:1) __random random.c:298 (cycles:1) __random random.c:297 (cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (cycles:1) __random random.c:295 (cycles:1) But the current code is difficult to maintain and extend. This patch refactors the code for easy maintenance. Change log: v6: 1. Put the multiline condition code into {} brackets in counts_str_build() 2. Keep the original display order, that is: predicted, abort, cycles, iterations v5: It's a new patch in v5 patch series. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-5-git-send-email-yao.jin@linux.intel.com [ Don't use 'index' as a name for a variable, it shadows a globa decl in older distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf record: Create a new option save_type in --branch-filterJin Yao
The option indicates the kernel to save branch type during sampling. One example: perf record -g --branch-filter any,save_type <command> Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Add event desc to pipe-mode headerDavid Carrillo-Cisneros
Add event descriptor to perf header output in pipe-mode. After this patch: $ perf record -e cycles sleep 1 | perf report --header # ======== # captured on: Mon Jun 5 22:52:13 2017 # ======== # # hostname : lphh20 # os release : 4.3.5-smp-801.43.0.0 # perf version : 4.12.rc2.g439987 # arch : x86_64 # nrcpus online : 72 # nrcpus avail : 72 # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz # cpuid : GenuineIntel,6,63,2 # total memory : 264134144 kB # cmdline : /root/perf record -e cycles sleep 1 # event : name = cycles, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1 # CPU_TOPOLOGY info available, use -I to display # NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_bts = 6, cpu = 4, msr = 49, uncore_cbox_10 = 36, uncore_cbox_11 = 37, uncore_cbox_12 = 38, uncore_cbox_13 = 39, uncore_cbox_14 = 40, uncore_cbox_15 = 41, uncore_cbox_16 = 42, uncore_cbox_17 = 43, software = 1, power = 7, uncore_irp = 24, uncore_pcu = 48, tracepoint = 2, uncore_imc_0 = 16, uncore_imc_1 = 17, uncore_imc_2 = 18, uncore_imc_3 = 19, uncore_imc_4 = 20, uncore_imc_5 = 21, uncore_imc_6 = 22, uncore_imc_7 = 23, uncore_qpi_0 = 8, uncore_qpi_1 = 9, uncore_cbox_0 = 26, uncore_cbox_1 = 27, uncore_cbox_2 = 28, uncore_cbox_3 = 29, uncore_cbox_4 = 30, uncore_cbox_5 = 31, uncore_cbox_6 = 32, uncore_cbox_7 = 33, uncore_cbox_8 = 34, uncore_cbox_9 = 35, uncore_r2pcie = 13, uncore_r3qpi_0 = 10, uncore_r3qpi_1 = 11, uncore_r3qpi_2 = 12, uncore_sbox_0 = 44, uncore_sbox_1 = 45, uncore_sbox_2 = 46, uncore_sbox_3 = 47, breakpoint = 5, uncore_ha_0 = 14, uncore_ha_1 = 15, uncore_ubox = 25 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB (null) ] Prior to this patch, event was not printed. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-17-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf tools: Add feature header record to pipe-modeDavid Carrillo-Cisneros
Add header record types to pipe-mode, reusing the functions used in file-mode and leveraging the new struct feat_fd. For alignment, check that synthesized events don't exceed pagesize. Add the perf_event__synthesize_feature event call back to process the new header records. Before this patch: $ perf record -o - -e cycles sleep 1 | perf report --stdio --header [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] ... After this patch: $ perf record -o - -e cycles sleep 1 | perf report --stdio --header # ======== # captured on: Mon May 22 16:33:43 2017 # ======== # # hostname : my_hostname # os release : 4.11.0-dbx-up_perf # perf version : 4.11.rc6.g6277c80 # arch : x86_64 # nrcpus online : 72 # nrcpus avail : 72 # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz # cpuid : GenuineIntel,6,63,2 # total memory : 263457192 kB # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1 # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] ... Support added for the subcommands: report, inject, annotate and script. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf tool: Add show_feature_header to perf_toolDavid Carrillo-Cisneros
Add show_feat_hdr to control level of printed information of feature headers. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-15-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Change FEAT_OP* macrosDavid Carrillo-Cisneros
There are three FEAT_OP* macros: - FEAT_OPA: for features without process record. - FEAT_OPP: for features with process record. - FEAT_OPF: like FEAT_OPP but to show only if show_full_info flags is set. To add pipe-mode headers we need yet another variation of the macros (one to specify whether a feature generates an auxiliar record). Instead, we redefine macros so that: - show_full_info is specified as an argument (to remove the FEAT_OPF variation) and, - it always sets "process" handler (to remove the FEAT_OPA variation). Individual process handlers can be NULLed individually. This allows to define two variations only: - FEAT_OPR: synthesizes auxiliar event record. - FEAT_OPN: doesn't synthesize an auxiliar event record. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-14-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Add a buffer to struct feat_fdDavid Carrillo-Cisneros
Extend struct feat_fd to use a temporal buffer in pipe-mode, instead of perf.data's file descriptor. The header features build_id and aux_trace already have logic to print in file-mode that heavily rely on lseek the file. For now, leave such features inactive in pipe-mode and print a warning if their functions are called in pipe-mode. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-13-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Make write_pmu_mappings pipe-mode friendlyDavid Carrillo-Cisneros
In pipe-mode, we will operate over a buffer instead of a file descriptor but write_pmu_mappings uses lseek to move over the perf.data file. Refactor write_pmu_mappings to avoid the usage of lseek and allow reusing the same logic in pipe-mode (next patch). Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-12-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Use struct feat_fd in read header recordsDavid Carrillo-Cisneros
As preparation for using header records in-pipe mode, replace int fd with struct feat_fd ff in read functions for all header record types. This patch does not change behavior. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-11-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Don't pass struct perf_file_section to process_##_featDavid Carrillo-Cisneros
struct perf_file_section is used in process_##_feat as container for size and offset in the file descriptor. These attributes are meaninful in pipe-mode but struct perf_file_section is not. Add offset and size variables to struct feat_fd to store perf_file_section's values in file-mode. Later on, the same variables can be reused for pipe-mode. This patch does not change behavior. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-10-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Use struct feat_fd to process header recordsDavid Carrillo-Cisneros
As preparation for using header records in pipe-mode, replace int fd with struct feat_fd ff in process functions for all header record types. This patch does not change behavior. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-9-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Use struct feat_fd for printDavid Carrillo-Cisneros
As preparation for using header records in pipe mode, replace int fd with struct feat_fd ff in print functions for all header record types. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-8-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Add struct feat_fd for writeDavid Carrillo-Cisneros
Introduce struct feat_fd. This patch uses it as a wrapper around fd in write_* functions for feature headers. Next patches will extend its functionality to other feature header functions. This patch does not change behavior. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-7-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Revamp do_write()David Carrillo-Cisneros
Now that writen takes a const buffer, use it in do_write instead of duplicating its functionality. Export do_write to use it consistently in header.c and build_id.c . Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-6-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf util: Add const modifier to buf in "writen" functionDavid Carrillo-Cisneros
Make buf in helper function "writen" constant to simplify the life of its callers. This requires to hack a cast of buf prior to passing it to "ion" which is simpler than the alternative of reworking the "ion" function to provide a read and a write paths, the latter with constant buf argument. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-5-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Fail on write_padded errorDavid Carrillo-Cisneros
Do not proceed if write_padded() error failed. Also, add comments to remind that the return value of write_* functions in util/header.c is an errno code and not the number of bytes written. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-4-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Add PROCESS_STR_FUN macroDavid Carrillo-Cisneros
Simplify code by adding a macro to handle the common case of processing header features that are a simple string. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-3-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf header: Encapsulate read and swapDavid Carrillo-Cisneros
Most callers of readn() in perf header read either a 32 or a 64 bits number, error check it and swap it, if necessary. Create do_read_u32 and do_read_u64 to simplify these use cases. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-2-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf report: Enable finding kernel inline functionsJin Yao
Currently perf supports a mode to query inline stack. It works well for finding user space inline functions but it doesn't work for kernel ones, due to some unnecessary check. This patch removes these unnecessary checks. Now kernel inline functions can be reported. For example: perf report --inline -g func --stdio |--46.19%--do_huge_pmd_anonymous_page | do_huge_pmd_anonymous_page (inline) | __do_huge_pmd_anonymous_page (inline) | __SetPageUptodate (inline) | __set_bit (inline) The result is compared with the output of addr2line. They match. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500409892-15904-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf trace beauty: Simplify syscall return formattingArnaldo Carvalho de Melo
Removing syscall_fmt::err_msg and instead always formatting negative returns as errno values. With this we can remove a lot of entries that have no special handling besides the ones we can do by looking at the tracefs format files, i.e. the types for the fields (e.g. pid_t), well known names (e.g. fd). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-rg9u7a3qqdnzo37d212vnz2o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf trace beauty fcntl: Beautify the 'arg' for DUPFDArnaldo Carvalho de Melo
Before: 77059.513 ( 0.005 ms): bash/6649 fcntl(fd: 1</dev/pts/12>, cmd: DUPFD, arg: 10) = 10</dev/pts/12> After: 77059.513 ( 0.005 ms): bash/6649 fcntl(fd: 1</dev/pts/12>, cmd: DUPFD, arg: 10</dev/pts/12>) = 10</dev/pts/12> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-0k8iszng0slcuw0rc6xq1x5l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-18perf trace beauty fcntl: Do not suppress 'cmd' when zero, should be DUPFDArnaldo Carvalho de Melo
Before: 77059.513 ( 0.005 ms): bash/6649 fcntl(fd: 1</dev/pts/12>, arg: 10) = 10</dev/pts/12> After: 77059.513 ( 0.005 ms): bash/6649 fcntl(fd: 1</dev/pts/12>, cmd: DUPFD, arg: 10) = 10</dev/pts/12> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-woois88uwcr4xu38xx1ihiwo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>