summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2016-04-08perf symbols: Record text offset in dso to calculate objdump addressWang Nan
In this patch, the offset of '.text' section is stored into dso and used here to re-calculate address to objdump. In most of the cases, executable code is in '.text' section, so the adjustment made to a symbol in dso__load_sym (using sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to 'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back get objdump address from symbol address (rip). However, it is not true for kernel and kernel module since there could be multiple executable sections with different offset. Exclude kernel for this reason. After this patch, even dso->adjust_symbols is set to true for shared objects, map__rip_2objdump() and map__objdump_2mem() would return correct result, so perf behavior of annotate won't be changed. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Build syscall table .c header from kernel's syscall_64.tblArnaldo Carvalho de Melo
We used libaudit to map ids to syscall names and vice-versa, but that imposes a delay in supporting new syscalls, having to wait for libaudit to get those new syscalls on its tables. To remove that delay, for x86_64 initially, grab a copy of arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those tables. Syscalls currently not available in audit-libs: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd Hint: try 'perf list syscalls:sys_enter_*' Hint: and: 'man syscalls' # With this patch: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd 8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36 8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40 30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096 31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096 31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org [ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Allow generating per-arch syscall table arraysArnaldo Carvalho de Melo
Tools should use a mechanism similar to arch/x86/entry/syscalls/ to generate a header file with the definitions for two variables: static const char *syscalltbl_x86_64[] = { [0] = "read", [1] = "write", <SNIP> [324] = "membarrier", [325] = "mlock2", [326] = "copy_file_range", }; static const int syscalltbl_x86_64_max_id = 326; In a per arch file that should then be included in tools/perf/util/syscalltbl.c. First one will be for x86_64. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-02uuamkxgccczdth8komspgp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf trace: Move syscall table id <-> name routines to separate classArnaldo Carvalho de Melo
We're using libaudit for doing name to id and id to syscall name translations, but that makes 'perf trace' to have to wait for newer libaudit versions supporting recently added syscalls, such as "userfaultfd" at the time of this changeset. We have all the information right there, in the kernel sources, so move this code to a separate place, wrapped behind functions that will progressively use the kernel source files to extract the syscall table for use in 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf trace: Beautify mode_t argumentsArnaldo Carvalho de Melo
When reading the syscall tracepoint /format file, look for arguments of type "mode_t" and attach a beautifier: [root@jouet ~]# cat ~/bin/tp_with_fields_of_type #!/bin/bash grep -w $1 /sys/kernel/tracing/events/syscalls/*/format | sed -r 's%.*sys_enter_(.*)/format.*%\1%g' | paste -d, -s # tp_with_fields_of_type umode_t chmod,creat,fchmodat,fchmod,mkdirat,mkdir,mknodat,mknod,mq_open,openat,open # Testing it: #define S_ISUID 0004000 #define S_ISGID 0002000 #define S_ISVTX 0001000 #define S_IRWXU 0000700 #define S_IRUSR 0000400 #define S_IWUSR 0000200 #define S_IXUSR 0000100 #define S_IRWXG 0000070 #define S_IRGRP 0000040 #define S_IWGRP 0000020 #define S_IXGRP 0000010 #define S_IRWXO 0000007 #define S_IROTH 0000004 #define S_IWOTH 0000002 #define S_IXOTH 0000001 # for mode in 4000 2000 1000 700 400 200 100 70 40 20 10 7 4 2 1 ; do \ echo -n $mode '->' ; trace --no-inherit -e chmod,fchmodat,fchmod chmod $mode x; \ done 4000 -> 0.338 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: ISUID) = 0 2000 -> 0.438 ( 0.015 ms): fchmodat(dfd: CWD, filename: x, mode: ISGID) = 0 1000 -> 0.677 ( 0.040 ms): fchmodat(dfd: CWD, filename: x, mode: ISVTX) = 0 700 -> 0.394 ( 0.013 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXU) = 0 400 -> 0.337 ( 0.010 ms): fchmodat(dfd: CWD, filename: x, mode: IRUSR) = 0 200 -> 0.259 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IWUSR) = 0 100 -> 0.249 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXUSR) = 0 70 -> 0.266 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXG) = 0 40 -> 0.329 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IRGRP) = 0 20 -> 0.250 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IWGRP) = 0 10 -> 0.259 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXGRP) = 0 7 -> 0.249 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXO) = 0 4 -> 0.278 ( 0.011 ms): fchmodat(dfd: CWD, filename: x, mode: IROTH) = 0 2 -> 0.276 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IWOTH) = 0 1 -> 0.250 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXOTH) = 0 # # trace --no-inherit -e chmod,fchmodat,fchmod chmod 7777 x 0.258 ( 0.011 ms): fchmodat(dfd: CWD, filename: x, mode: IALLUGO) = 0 # trace --no-inherit -e chmod,fchmodat,fchmod chmod 7770 x 0.258 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: ISUID|ISGID|ISVTX|IRWXU|IRWXG) = 0 # trace --no-inherit -e chmod,fchmodat,fchmod chmod 777 x 0.293 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXUGO # Now lets see if check by using the tracepoint for that specific syscall, instead of raw_syscalls:sys_enter as 'trace' does for its strace fu: # trace --no-inherit --ev syscalls:sys_enter_fchmodat -e fchmodat chmod 666 x 0.255 ( ): syscalls:sys_enter_fchmodat:dfd: 0xffffffffffffff9c, filename: 0x55db32a3f0f0, mode: 0x000001b6) 0.268 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: IRUGO|IWUGO ) = 0 # Perfect, 0x1bc == 0666. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-18e8zfgbkj83xo87yoom43kd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf script: Process event update eventsJiri Olsa
Andreas reported following command produces no output: # cat test.py #!/usr/bin/env python def stat__krava(cpu, thread, time, val, ena, run): print "event %s cpu %d, thread %d, time %d, val %d, ena %d, run %d" % \ ("krava", cpu, thread, time, val, ena, run) # perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py ^C # The reason is that 'perf script' does not process event update events and will never get the event name update thus the python callback is never called. The fix is just to add already existing callback we use in 'perf stat report'. Committer note: After the patch: # perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py event krava cpu -1, thread -1, time 1000239179, val 1789051, ena 4000690920, run 4000690920 event krava cpu -1, thread -1, time 2000479061, val 2391338, ena 4000879596, run 4000879596 event krava cpu -1, thread -1, time 3000740802, val 1939121, ena 4000977209, run 4000977209 event krava cpu -1, thread -1, time 4001006730, val 2356115, ena 4001000489, run 4001000489 ^C # Reported-by: Andreas Hollmann <hollmann@in.tum.de> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Add dedicated unwind addr_space member into thread structJiri Olsa
Milian reported issue with thread::priv, which was double booked by perf trace and DWARF unwind code. So using those together is impossible at the moment. Moving DWARF unwind private data into separate variable so perf trace can keep using thread::priv. Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andreas Hollmann <hollmann@in.tum.de> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-07tools/power turbostat: work around RC6 counter wrapLen Brown
Sometimes the rc6 sysfs counter spontaneously resets, causing turbostat prints a very large number as it tries to calcuate % = 100 * (old - new) / interval When we see (old > new), print ***.**% instead of a bogus huge number. Note that this detection is not fool-proof, as the counter could reset several times and still result in new > old. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07tools/power turbostat: initial KBL supportLen Brown
KBL is similar to SKL Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07tools/power turbostat: initial SKX supportLen Brown
SKX has a lot in common with HSX Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07tools/power turbostat: decode BXT TSC frequency via CPUIDLen Brown
Hard-code BXT ART to 19200MHz, so turbostat --debug can fully enumerate TSC: CPUID(0x15): eax_crystal: 3 ebx_tsc: 186 ecx_crystal_hz: 0 TSC: 1190 MHz (19200000 Hz * 186 / 3 / 1000000) Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07tools/power turbostat: initial BXT supportLen Brown
Broxton has a lot in common with SKL Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07tools/power turbostat: print IRTL MSRsLen Brown
Some processors use the Interrupt Response Time Limit (IRTL) MSR value to describe the maximum IRQ response time latency for deep package C-states. (Though others have the register, but do not use it) Lets print it out to give insight into the cases where it is used. IRTL begain in SNB, with PC3/PC6/PC7, and HSW added PC8/PC9/PC10. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07tools/power turbostat: SGX state should print only if --debugLen Brown
The CPUID.SGX bit was printed, even if --debug was used Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07perf tools: Introduce trim functionJiri Olsa
To be used in cases for both sides trim. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andreas Hollmann <hollmann@in.tum.de> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-1-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify pid_t argumentsArnaldo Carvalho de Melo
When reading the syscall tracepoint /format file, look for arguments of type "pid_t" and attach the PID beautifier, that will do a lookup on the threads it knows, i.e. the ones that came from PERF_RECORD_COMM events and add the COMM after the pid in such args: Excerpt of a system wide trace for syscalls with pid_t args: 55602.977 ( 0.006 ms): bash/12122 setpgid(pid: 24347 (bash), pgid: 24347 (bash)) = 0 55603.024 ( 0.004 ms): bash/24347 setpgid(pid: 24347 (bash), pgid: 24347 (bash)) = 0 55691.527 (88.397 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1720, options: UNTRACED|CONTINUED) ... 55692.479 ( 0.952 ms): git/24347 wait4(upid: 24368, stat_addr: 0x7ffe030d5724) ... 55694.549 ( 2.070 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4fc10) = 24369 (pre-commit) 55694.575 ( 0.002 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f650, options: NOHANG) = -1 ECHILD No child processes 55695.934 ( 0.010 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f2d0, options: NOHANG) = 24370 (git) 55695.937 ( 0.001 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f2d0, options: NOHANG) = -1 ECHILD No child processes 55717.963 ( 0.000 ms): pre-commit/24371 ... [continued]: wait4()) = 24372 55717.978 (21.468 ms): :24371/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) ... 55718.087 ( 0.109 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) = 24373 (tr) 55718.187 ( 0.096 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) = 24374 (wc) 55718.218 ( 0.002 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4eed0, options: NOHANG) = -1 ECHILD No child processes 55718.367 ( 0.005 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f1d0, options: NOHANG) = 24371 (pre-commit) 55718.369 ( 0.001 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f1d0, options: NOHANG) = -1 ECHILD No child processes 55741.021 (49.494 ms): git/24347 ... [continued]: wait4()) = 24368 (pre-commit) 74146.427 (18319.601 ms): git/24347 wait4(upid: 24375 (git), stat_addr: 0x7ffe030d6824) ... 74149.036 ( 0.891 ms): bash/24391 wait4(upid: -1, stat_addr: 0x7ffe0cee0560) = 24393 (sed) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-75yl9hzjhb020iadc81gdj8t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify set_tid_address, getpid, getppid return valuesArnaldo Carvalho de Melo
Showing the COMM for that return, if available. # trace -e getpid,getppid,set_tid_address 490.007 ( 0.005 ms): sh/8250 getpid(...) = 8250 (sh) 490.014 ( 0.001 ms): sh/8250 getppid(...) = 7886 (make) 491.156 ( 0.004 ms): install/8251 set_tid_address(tidptr: 0x7f204a9d4ad0) = 8251 (install) ^C Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-psbpplqupatom9x4uohbxid5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Infrastructure to show COMM strings for syscalls returning PIDsArnaldo Carvalho de Melo
Starting with clone, waitid and wait4: # trace -e waitid,wait4 1.385 ( 1.385 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1720, options: UNTRACED|CONTINUED) = 1210 (ls) 1.426 ( 0.002 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1150, options: NOHANG|UNTRACED|CONTINUED) = 0 3.293 ( 0.604 ms): bash/1211 wait4(upid: -1, stat_addr: 0x7ffe0cee0560 ) = 1214 (sed) 3.342 ( 0.002 ms): bash/1211 wait4(upid: -1, stat_addr: 0x7ffe0cee01d0, options: NOHANG ) = -1 ECHILD No child processes 3.576 ( 0.016 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee0550, options: NOHANG|UNTRACED|CONTINUED) = 1211 (bash) ^C# trace -e clone 0.027 ( 0.000 ms): systemd/1 ... [continued]: clone()) = 1227 (systemd) 0.050 ( 0.000 ms): systemd/1227 ... [continued]: clone()) = 0 ^C[root@jouet ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lyf5d3y5j15wikjb6pe6ukoi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify wait4/waitid 'options' argumentArnaldo Carvalho de Melo
# trace -e waitid,wait4 0.557 ( 0.557 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f449f0) = 27336 1.250 ( 0.685 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f449f0) = 27337 1.312 ( 0.002 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f44690, options: NOHANG) = -1 ECHILD No child processes 1.550 ( 0.015 ms): bash/3856 wait4(upid: -1, stat_addr: 0x7ffd02f44990, options: NOHANG|UNTRACED|CONTINUED) = 27335 1.552 ( 0.001 ms): bash/3856 wait4(upid: -1, stat_addr: 0x7ffd02f44990, options: NOHANG|UNTRACED|CONTINUED) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i5vlo5n5jv0amt8bkyicmdxh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify sched_setscheduler 'policy' argumentArnaldo Carvalho de Melo
$ trace -e sched_setscheduler chrt -f 1 usleep 1 chrt: failed to set pid 0's policy: Operation not permitted 0.005 ( 0.005 ms): chrt/19189 sched_setscheduler(policy: FIFO, param: 0x7ffec5273d70) = -1 EPERM Operation not permitted $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i5vlo5n5jv0amt8bkyicmdxh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf list: Document event specifications betterAndi Kleen
Document some features for specifying events in the perf list manpage: - Event groups - Leader sampling - How to specify raw PMU events in the new syntax - Global versus per process PMUs. - Access restrictions - Fix Intel SDM URL v2: Lots of new content. address review feedback. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1459810686-15913-1-git-send-email-andi@firstfloor.org [ Add quotes to some keywords, such as "any" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf tools: Remove superfluous ARCH Makefile includesJiri Olsa
Link: http://lkml.kernel.org/n/tip-yk6brsq3opuotr9by18xlkr8@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2016-04-06perf script perl: Do error checking on new backtrace routineArnaldo Carvalho de Melo
This ended up triggering these warnings when building on Ubuntu 12.04.5: util/scripting-engines/trace-event-perl.c: In function 'perl_process_callchain': util/scripting-engines/trace-event-perl.c:293:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:294:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:295:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:297:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:309:4: error: value computed is not used [-Werror=unused-value] cc1: all warnings being treated as errors mv: cannot stat `/tmp/build/perf/util/scripting-engines/.trace-event-perl.o.tmp': No such file or directory make[4]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1 Fix it by doing error checking when building the perl data structures related to callchains. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@gmail.com> Fixes: f7380c12ec6c ("perf script perl: Perl scripts now get a backtrace, like the python ones") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf probe: Check if dwarf_getlocations() is availableArnaldo Carvalho de Melo
If not, tell the user that: config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157 And return -ENOTSUPP in die_get_var_range(), failing features that need it, like the one pointed out above. This fixes the build on older systems, such as Ubuntu 12.04.5. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf config: Fix build with older toolchain.Vinson Lee
Fix build error on Ubuntu 12.04.5 with GCC 4.6.3. CC util/config.o util/config.c: In function ‘perf_buildid_config’: util/config.c:384:15: error: declaration of ‘dirname’ shadows a global declaration [-Werror=shadow] Signed-off-by: Vinson Lee <vlee@freedesktop.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 9cb5987c8227 ("perf config: Rework buildid_dir_command_config to perf_buildid_config") Link: http://lkml.kernel.org/r/1459807659-9020-1-git-send-email-vlee@freedesktop.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-04Merge tag 'linux-kselftest-4.6-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "This update for Kselftest contains seccomp fixes" * tag 'linux-kselftest-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftest/seccomp: Fix the seccomp(2) signature selftest/seccomp: Fix the flag name SECCOMP_FILTER_FLAG_TSYNC
2016-04-03Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc kernel side fixes: - fix event leak - fix AMD PMU driver bug - fix core event handling bug - fix build bug on certain randconfigs Plus misc tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/amd/ibs: Fix pmu::stop() nesting perf/core: Don't leak event in the syscall error path perf/core: Fix time tracking bug with multiplexing perf jit: genelf makes assumptions about endian perf hists: Fix determination of a callchain node's childlessness perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples perf tools: Fix build break on powerpc perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers perf tests: Fix tarpkg build test error output redirection
2016-04-01perf bpf: Add sample types for 'bpf-output' eventWang Nan
Before this patch we can see very large time in the events before the 'bpf-output' event. For example: # perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 ... 18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ... 18446744073709.551 ( ): evt:Raise a BPF event!..) 179427791.076 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0)) 179427791.081 ( ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120]) ... We can also see the differences between bpf-output events and breakpoint events: For bpf output event: sample_type IP|TID|RAW|IDENTIFIER For tracepoint events: sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER This patch fix this differences by adding more sample type for bpf-output events. After this patch: # perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 ... 179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ... 179877370.878 ( ): evt:Raise a BPF event!..) 179877370.878 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0)) 179877370.882 ( ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120]) 179877370.945 ( ): evt:Raise a BPF event!..) ... # ./perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 2>&1 | grep sample_type sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW The 'IDENTIFIER' info is not required because all events have the same sample_type. Committer notes: Further testing, on top of the changes making 'perf trace' avoid samples from events without PERF_SAMPLE_TIME: Before: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 <SNIP> 0.560 ( 0.001 ms): brk( ) = 0x55e5a1df8000 18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ... 18446640227439.430 ( ): evt:Raise a BPF event!..) 0.576 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 18446640227439.430 ( ): evt:Raise a BPF event!..) 0.645 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.646 ( 0.076 ms): ... [continued]: nanosleep()) = 0 # After: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 <SNIP> 0.292 ( 0.001 ms): brk( ) = 0x55c7cd6e1000 0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ... 0.302 ( ): evt:Raise a BPF event!..) 0.303 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.397 ( ): evt:Raise a BPF event!..) 0.397 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.398 ( 0.100 ms): ... [continued]: nanosleep()) = 0 Signed-off-by: Wang Nan <wangnan0@huawei.com> Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01perf trace: Don't set the base timestamp using events without PERF_SAMPLE_TIMEArnaldo Carvalho de Melo
This was causing bogus values to be shown at the timestamp column: Before: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 94631143.385 ( 0.001 ms): brk( ) = 0x555555757000 94631143.398 ( 0.003 ms): mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x7ffff7ff6000 94631143.406 ( 0.004 ms): access(filename: 0xf7df9e10, mode: R ) = -1 ENOENT No such file or directory 94631143.412 ( 0.004 ms): open(filename: 0xf7df8761, flags: CLOEXEC) = 3 94631143.415 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7fffffffd6b0 ) = 0 94631143.419 ( 0.003 ms): mmap(len: 106798, prot: READ, flags: PRIVATE, fd: 3) = 0x7ffff7fdb000 94631143.420 ( 0.001 ms): close(fd: 3 ) = 0 94631143.432 ( 0.004 ms): open(filename: 0xf7ff6640, flags: CLOEXEC) = 3 <SNIP> After: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 0.022 ( 0.001 ms): brk( ) = 0x55d7668a6000 0.037 ( 0.003 ms): mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x7f8fbeb97000 0.123 ( 0.083 ms): access(filename: 0xbe995e10, mode: R ) = -1 ENOENT No such file or directory 0.130 ( 0.004 ms): open(filename: 0xbe994761, flags: CLOEXEC) = 3 0.133 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7fff6487a890 ) = 0 0.138 ( 0.003 ms): mmap(len: 106798, prot: READ, flags: PRIVATE, fd: 3) = 0x7f8fbeb7c000 0.140 ( 0.001 ms): close(fd: 3 ) = 0 0.151 ( 0.004 ms): open(filename: 0xbeb97640, flags: CLOEXEC) = 3 <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-p7m8llv81iv55ekxexdp5n57@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01perf trace: Introduce function to set the base timestampArnaldo Carvalho de Melo
That is used in both live runs, i.e.: # trace ls As when processing events recorded in a perf.data file: # trace -i perf.data Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-901l6yebnzeqg7z8mbaf49xb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01perf tools: Fix PMU term format max value calculationKan Liang
Currently the max value of format is calculated by the bits number. It relies on the continuity of the format. However, uncore event format is not continuous. E.g. uncore qpi event format can be 0-7,21. If bit 21 is set, there is parsing issues as below. $ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/ event syntax error: '..pi_0/event=0x200002,umask=0x8/' \___ value too big for format, maximum is 511 This patch return the real max value by setting all possible bits to 1. Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1459365375-14285-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01perf intel-pt/bts: Define JITDUMP_USE_ARCH_TIMESTAMPAdrian Hunter
For Intel PT / BTS, define the environment variable that selects TSC timestamps in the jitdump file. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1457426333-30260-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-01perf jit: Add support for using TSC as a timestampAdrian Hunter
Intel PT uses TSC as a timestamp, so add support for using TSC instead of the monotonic clock. Use of TSC is selected by an environment variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file with flag JITDUMP_FLAGS_ARCH_TIMESTAMP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com [ Added the fixup from He Kuang to make it build on other arches, ] [ such as aarch64, to avoid inserting this bisectiong breakage upstream ] Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf tools: Add time conversion eventAdrian Hunter
Intel PT uses the time members from the perf_event_mmap_page to convert between TSC and perf time. Due to a lack of foresight when Intel PT was implemented, those time members were recorded in the (implementation dependent) AUXTRACE_INFO event, the structure of which is generally inaccessible outside of the Intel PT decoder. However now the conversion between TSC and perf time is needed when processing a jitdump file when Intel PT has been used for tracing. So add a user event to record the time members. 'perf record' will synthesize the event if the information is available. And session processing will put a copy of the event on the session so that tools like 'perf inject' can easily access it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf trace: Pretty print getrandom() argsArnaldo Carvalho de Melo
# trace -e getrandom 35622.560 ( 0.023 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35622.585 ( 0.006 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35622.594 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35627.395 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 35630.940 ( 0.013 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 35718.613 ( 0.015 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35718.629 ( 0.005 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35718.637 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35719.355 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 35721.042 ( 0.030 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 41090.830 ( 0.012 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41090.845 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41090.851 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41091.750 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 41091.823 ( 0.006 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 41122.078 ( 0.053 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41122.129 ( 0.009 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41122.139 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41124.492 ( 0.007 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 41124.470 ( 0.013 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 41590.832 ( 0.014 ms): chrome/5957 getrandom(buf: 0x7fabac7b15b0, count: 16, flags: NONBLOCK ) = 16 41590.884 ( 0.004 ms): chrome/5957 getrandom(buf: 0x7fabac7b15c0, count: 16, flags: NONBLOCK ) = 16 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-gca0n1p3aca3depey703ph2q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf trace: Pretty print seccomp() argsArnaldo Carvalho de Melo
E.g: # trace -e seccomp 200.061 (0.009 ms): :2441/2441 seccomp(op: FILTER, flags: TSYNC ) = -1 EFAULT Bad address 200.910 (0.121 ms): :2441/2441 seccomp(op: FILTER, flags: TSYNC, uargs: 0x7fff57479fe0) = 0 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-t369uckshlwp4evkks4bcoo7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf trace: Do not process PERF_RECORD_LOST twiceArnaldo Carvalho de Melo
We catch this record to provide a visual indication that events are getting lost, then call the default method to allow extra logging shared with the other tools to take place. This extra logging was done twice because we were continuing to the "default" clause where machine__process_event() will end up calling machine__process_lost_event() again, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wus2zlhw3qo24ye84ewu4aqw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31Merge tag 'perf-core-for-mingo-20160330' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes: User visible changes: - Add support for skipping itrace instructions, useful to fast forward processor trace (Intel PT, BTS) to right after initialization code at the start of a workload (Andi Kleen) - Add support for backtraces in perl 'perf script's (Dima Kogan) - Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa) - Make -f/--force option documentation consistent across tools (Jiri Olsa) Infrastructure changes: - Add 'perf test' to check for event times (Jiri Olsa) - 'perf config' cleanups (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-30perf jit: genelf makes assumptions about endianAnton Blanchard
Commit 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support") incorrectly assumed that PowerPC is big endian only. Simplify things by consolidating the define of GEN_ELF_ENDIAN and checking for __BYTE_ORDER == __BIG_ENDIAN. The PowerPC checks were also incorrect, they do not match what gcc emits. We should first look for __powerpc64__, then __powerpc__. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Carl Love <cel@us.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Fixes: 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support") Link: http://lkml.kernel.org/r/20160329175944.33a211cc@kryten Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf hists: Fix determination of a callchain node's childlessnessAndres Freund
The 4b3a3212233a ("perf hists browser: Support flat callchains") commit over-aggressively tried to optimize callchain_node__init_have_children(). That lead to --tui mode not allowing to expand call chain elements if a call chain element had only one parent. That's why --inverted callgraphs looked halfway sane, but plain ones didn't. Revert that individual optimization, it wasn't really related to the rest of the commit. Signed-off-by: Andres Freund <andres@anarazel.de> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: 4b3a3212233a ("perf hists browser: Support flat callchains") Link: http://lkml.kernel.org/r/20160330190245.GB13305@awork2.anarazel.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf tools: Add support for skipping itrace instructionsAndi Kleen
When using 'perf script' to look at PT traces it is often useful to ignore the initialization code at the beginning. On larger traces which may have many millions of instructions in initialization code doing that in a pipeline can be very slow, with perf script spending a lot of CPU time calling printf and writing data. This patch adds an extension to the --itrace argument that skips 'n' events (instructions, branches or transactions) at the beginning. This is much more efficient. v2: Add support for BTS (Adrian Hunter) Document in itrace.txt Fix branch check Check transactions and instructions too Committer note: To test intel_pt one needs to make sure VT-x isn't active, i.e. stopping KVM guests on the test machine, as described by Andi Kleen at http://lkml.kernel.org/r/20160301234953.GD23621@tassilo.jf.intel.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1459187142-20035-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf script perl: Perl scripts now get a backtrace, like the python onesDima Kogan
We have some infrastructure to use perl or python to analyze logs generated by perf. Prior to this patch, only the python tools had access to backtrace information. This patch makes this information available to perl scripts as well. Example: Let's look at malloc() calls made by the seq utility. First we create a probe point: $ perf probe -x /lib/x86_64-linux-gnu/libc.so.6 malloc Added new events: ... Now we run seq, while monitoring malloc() calls with perf $ perf record --call-graph=dwarf -e probe_libc:malloc seq 5 1 2 3 4 5 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.064 MB perf.data (6 samples) ] We can use perf to look at its log to see the malloc calls and the backtrace $ perf script seq 14195 [000] 1927993.748254: probe_libc:malloc: (7f9ff8edd320) bytes=0x22 7f9ff8edd320 malloc (/lib/x86_64-linux-gnu/libc-2.22.so) 7f9ff8e8eab0 set_binding_values.part.0 (/lib/x86_64-linux-gnu/libc-2.22.so) 7f9ff8e8eda1 __bindtextdomain (/lib/x86_64-linux-gnu/libc-2.22.so) 401b22 main (/usr/bin/seq) 7f9ff8e82610 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.22.so) 402799 _start (/usr/bin/seq) ... We can also use the scripting facilities. We create a skeleton perl script that simply prints out the events $ perf script -g perl generated Perl script: perf-script.pl We can then use this script to see the malloc() calls with a backtrace. Prior to this patch, the backtrace was not available to the perl scripts. $ perf script -s perf-script.pl probe_libc::malloc 0 1927993.748254260 14195 seq __probe_ip=140325052863264, bytes=34 [7f9ff8edd320] malloc [7f9ff8e8eab0] set_binding_values.part.0 [7f9ff8e8eda1] __bindtextdomain [401b22] main [7f9ff8e82610] __libc_start_main [402799] _start ... Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/87mvphzld0.fsf@secretsauce.net Signed-off-by: Dima Kogan <dima@secretsauce.net>
2016-03-30perf config: Rename 'v' to 'home' in set_buildid_dir()Taeung Song
Change the variable name 'v' to 'home' to make it more readable. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1459099340-16911-3-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf config: Rework buildid_dir_command_config to perf_buildid_configTaeung Song
To avoid repeated calling perf_config() remove buildid_dir_command_config() and add new perf_buildid_config into perf_default_config. Because perf_config() is already called with perf_default_config at main(). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1459099340-16911-2-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf config: Remove duplicated set_buildid_dir callsTaeung Song
Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1459099340-16911-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf tests: Add test to check for event timesJiri Olsa
This test creates software event 'cpu-clock' attaches it in several ways and checks that enabled and running times match. Committer notes: Testing it: [acme@jouet linux]$ perf test -v times 44: Test events times : --- start --- test child forked, pid 27170 attaching to spawned child, enable on exec OK : ena 307328, run 307328 attaching to current thread as enabled OK : ena 7826, run 7826 attaching to current thread as disabled OK : ena 738, run 738 attaching to CPU 0 as enabled SKIP : not enough rights attaching to CPU 0 as enabled SKIP : not enough rights test child finished with -2 ---- end ---- Test events times: Skip [acme@jouet linux]$ [root@jouet ~]# perf test times 44: Test events times : Ok [root@jouet ~]# perf test -v times 44: Test events times : --- start --- test child forked, pid 27306 attaching to spawned child, enable on exec OK : ena 479290, run 479290 attaching to current thread as enabled OK : ena 11356, run 11356 attaching to current thread as disabled OK : ena 987, run 987 attaching to CPU 0 as enabled OK : ena 3717, run 3717 attaching to CPU 0 as enabled OK : ena 2323, run 2323 test child finished with 0 ---- end ---- Test events times: Ok [root@jouet ~]# Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1458823940-24583-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf tools: Make -f/--force option documentation consistent across toolsJiri Olsa
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1458823940-24583-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf tools: Make hists__collapse_insert_entry staticJiri Olsa
No need to export hists__collapse_insert_entry function. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1458823940-24583-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30perf mem: Add -U/-K (--all-user/--all-kernel) optionsJiri Olsa
Add -U/-K (--all-user/--all-kernel) options to use the perf record --all-user/--all-kernel options. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1458823940-24583-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30tools/lib/lockdep: Fix unsupported 'basename -s' in run_tests.shSedat Dilek
Here on Ubuntu/precise I have GNU/coreutils v8.13 installed where 'basename -s' is not supported. The result is that run_tests.sh is not done properly. How to reproduce: $ cd $BUILD_DIR $ LC_ALL=C make -C tools/ liblockdep $ cd tools/lib/lockdep/ $ LC_ALL=C ./run_tests.sh basename: invalid option -- 's' Try `basename --help' for more information. ... timeout: failed to run command `./tests/': Permission denied FAILED! rm: cannot remove `tests/': Is a directory Due to unsupported basename the tests programs are not generated and cannot be removed. Fix this by doing a compatible basename invocation and check for the existence of generated tests programs. For more details see this LKML thread: http://marc.info/?t=145906667300001&r=1&w=2 Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> (maintainer:LIBLOCKDEP) Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org> Link: http://lkml.kernel.org/r/1459326169-7009-1-git-send-email-sedat.dilek@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>