summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2017-03-27selftests/powerpc: Fix standalone powerpc buildMichael Ellerman
The changes to enable building with a separate output directory, in commit a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT") broke building the powerpc selftests on their own, eg: $ cd tools/testing/selftests/powerpc; make It was partially fixed in commit e53aff45c490 ("selftests: lib.mk Fix individual test builds"), which defined OUTPUT for standalone tests. But that only defines OUTPUT within the Makefile, the value is not exported so sub-shells can't see it. We could export OUTPUT, but it's actually cleaner to just expand the value of OUTPUT before we invoke the shell. Fixes: a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-24bpf: improve verifier packet range checksAlexei Starovoitov
llvm can optimize the 'if (ptr > data_end)' checks to be in the order slightly different than the original C code which will confuse verifier. Like: if (ptr + 16 > data_end) return TC_ACT_SHOT; // may be followed by if (ptr + 14 > data_end) return TC_ACT_SHOT; while llvm can see that 'ptr' is valid for all 16 bytes, the verifier could not. Fix verifier logic to account for such case and add a test. Reported-by: Huapeng Zhou <hzhou@fb.com> Fixes: 969bf05eb3ce ("bpf: direct packet access") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24perf trace: Fixup thread refcountingArnaldo Carvalho de Melo
In trace__vfs_getname() and when checking if a thread is filtered in trace__process_sample() we were not dropping the reference obtained via machine__findnew_thread(), fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9gc470phavxwxv5d9w7ck8ev@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-24perf trace: Fix up error path indentationArnaldo Carvalho de Melo
Trivial fix removing a tab in an error path. Link: http://lkml.kernel.org/n/tip-c14mk6cqaiby8gf5rpft3d9r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-24perf trace: Check for vfs_getname.pathname lengthArnaldo Carvalho de Melo
It shouldn't be zero, but if the 'perf probe' on getname_flags() (or elsewhere in the future we need to probe to catch the pathname for syscalls like 'open' being copied from userspace to the kernel) is misplaced somehow, then we will end up not allocating space and trying to copy the "" empty string to ttrace->filename.name, causing a segfault, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-c4f1t6sx1nczuzop19r5si5s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Several netfilter fixes from Pablo and the crew: - Handle fragmented packets properly in netfilter conntrack, from Florian Westphal. - Fix SCTP ICMP packet handling, from Ying Xue. - Fix big-endian bug in nftables, from Liping Zhang. - Fix alignment of fake conntrack entry, from Steven Rostedt. 2) Fix feature flags setting in fjes driver, from Taku Izumi. 3) Openvswitch ipv6 tunnel source address not set properly, from Or Gerlitz. 4) Fix jumbo MTU handling in amd-xgbe driver, from Thomas Lendacky. 5) sk->sk_frag.page not released properly in some cases, from Eric Dumazet. 6) Fix RTNL deadlocks in nl80211, from Johannes Berg. 7) Fix erroneous RTNL lockdep splat in crypto, from Herbert Xu. 8) Cure improper inflight handling during AF_UNIX GC, from Andrey Ulanov. 9) sch_dsmark doesn't write to packet headers properly, from Eric Dumazet. 10) Fix SCM_TIMESTAMPING_OPT_STATS handling in TCP, from Soheil Hassas Yeganeh. 11) Add some IDs for Motorola qmi_wwan chips, from Tony Lindgren. 12) Fix nametbl deadlock in tipc, from Ying Xue. 13) GRO and LRO packets not counted correctly in mlx5 driver, from Gal Pressman. 14) Fix reset of internal PHYs in bcmgenet, from Doug Berger. 15) Fix hashmap allocation handling, from Alexei Starovoitov. 16) nl_fib_input() needs stronger netlink message length checking, from Eric Dumazet. 17) Fix double-free of sk->sk_filter during sock clone, from Daniel Borkmann. 18) Fix RX checksum offloading in aquantia driver, from Pavel Belous. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits) net:ethernet:aquantia: Fix for RX checksum offload. amd-xgbe: Fix the ECC-related bit position definitions sfc: cleanup a condition in efx_udp_tunnel_del() Bluetooth: btqcomsmd: fix compile-test dependency inet: frag: release spinlock before calling icmp_send() tcp: initialize icsk_ack.lrcvtime at session start time genetlink: fix counting regression on ctrl_dumpfamily() socket, bpf: fix sk_filter use after free in sk_clone_lock ipv4: provide stronger user input validation in nl_fib_input() bpf: fix hashmap extra_elems logic enic: update enic maintainers net: bcmgenet: remove bcmgenet_internal_phy_setup() ipv6: make sure to initialize sockc.tsflags before first use fjes: Do not load fjes driver if extended socket device is not power on. fjes: Do not load fjes driver if system does not have extended socket device. net/mlx5e: Count LRO packets correctly net/mlx5e: Count GSO packets correctly net/mlx5: Increase number of max QPs in default profile net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps net/mlx5e: Use the proper UAPI values when offloading TC vlan actions ...
2017-03-23perf list: Move extra details printing to new optionAndi Kleen
Move the printing of perf expressions and internal events to a new clearer --details flag, instead of lumping it together with other debug options in --debug. This makes it clearer to use. Before perf list --debug ... unc_m_power_critical_throttle_cycles [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc] uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100. after perf list --details ... unc_m_power_critical_throttle_cycles [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc] uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf pmu: Add support for MetricName JSON attributeAndi Kleen
Add support for a new JSON event attribute to name MetricExpr for better output in perf stat. If the event has no MetricName it uses the normal event name instead to describe the metric. Before % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only time unc_p_freq_max_os_cycles 1.000149775 15.7 2.000344807 19.3 3.000502544 16.7 4.000640656 6.6 5.000779955 9.9 After % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only time freq_max_os_cycles % 1.000149775 15.7 2.000344807 19.3 3.000502544 16.7 4.000640656 6.6 5.000779955 9.9 Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf list: Support printing MetricExpr with --debugAndi Kleen
Output the metric expr in perf list when --debug is specified, so that the user can check the formula. Before: % perf list ... unc_m_power_channel_ppd [Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit: uncore_imc] uncore_imc_2/event=0x85/ After: % perf list --debug ... unc_m_power_channel_ppd [Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd. Unit: uncore_imc] Perf: uncore_imc_2/event=0x85/ MetricExpr: (unc_m_power_channel_ppd / unc_m_clockticks) * 100. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-12-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf stat: Output JSON MetricExpr metricAndi Kleen
Add generic infrastructure to perf stat to output ratios for "MetricExpr" entries in the event lists. Many events are more useful as ratios than in raw form, typically some count in relation to total ticks. Transfer the MetricExpr information from the alias to the evsel. We mark the events that need to be collected for MetricExpr, and also link the events using them with a pointer. The code is careful to always prefer the right event in the same group to minimize multiplexing errors. At the moment only a single relation is supported. Then add a rblist to the stat shadow code that remembers stats based on the cpu and context. Then finally update and retrieve and print these values similarly to the existing hardcoded perf metrics. We use the simple expression parser added earlier to evaluate the expression. Normally we just output the result without further commentary, but for --metric-only this would lead to empty columns. So for this case use the original event as description. There is no attempt to automatically add the MetricExpr event, if it is missing, however we suggest it to the user, because the user tool doesn't have enough information to reliably construct a group that is guaranteed to schedule. So we leave that to the user. % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' 1.000147889 800,085,181 unc_p_clockticks 1.000147889 93,126,241 unc_p_freq_max_os_cycles # 11.6 2.000448381 800,218,217 unc_p_clockticks 2.000448381 142,516,095 unc_p_freq_max_os_cycles # 17.8 3.000639852 800,243,057 unc_p_clockticks 3.000639852 162,292,689 unc_p_freq_max_os_cycles # 20.3 % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only # time freq_max_os_cycles % 1.000127077 0.9 2.000301436 0.7 3.000456379 0.0 v2: Change from DivideBy to MetricExpr v3: Use expr__ prefix. Support more than one other event. v4: Update description v5: Only print warning message once for multiple PMUs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf pmu: Support MetricExpr header in JSON event listAndi Kleen
Add support for parsing the MetricExpr header in the JSON event lists and storing them in the alias structure. Used in the next patch. v2: Change DividedBy to MetricExpr v3: Really catch all uses of DividedBy Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-10-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf vendor events intel: Update Intel uncore JSON event filesAndi Kleen
- Add MetricName to describe Metric - Remove redundant "derived from" in descriptions - Rename UNC_M_CAS_COUNT to LLC_MISSES.READ Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-9-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23perf tools: Add a simple expression parser for JSONAndi Kleen
Add a simple expression parser good enough to parse JSON relation expressions. The parser is implemented using bison. This is just intended as an simple parser for internal usage in the event lists, not the beginning of a "perf scripting language" v2: Use expr__ prefix instead of expr_ Support multiple free variables for parser Committer note: The v2 patch had: %define api.pure full In expr.y, that is a feature introduced in bison 2.7, to have reentrant parsers, not using global variables, which would make tools/perf stop building with the bison version shipped in older distros, so Andi realised that the other parsers (e.g. parse-events.y) were using: %pure-parser Which is present in older versions of bison and fits the bill. I added: CFLAGS_expr-bison.o += -DYYENABLE_NLS=0 -DYYLTYPE_IS_TRIVIAL=0 -w To finally make it build, copying what was there for pmu-bison.o, another parser. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-8-andi@firstfloor.org [ stdlib.h is needed in tests/expr.c for free() fixing build in systems such as ubuntu:16.04-x-s390 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-23selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bugAndy Lutomirski
i386 glibc is buggy and calls the sigaction syscall incorrectly. This is asymptomatic for normal programs, but it blows up on programs that do evil things with segmentation. The ldt_gdt self-test is an example of such an evil program. This doesn't appear to be a regression -- I think I just got lucky with the uninitialized memory that glibc threw at the kernel when I wrote the test. This hackish fix manually issues sigaction(2) syscalls to undo the damage. Without the fix, ldt_gdt_32 segfaults; with the fix, it passes for me. See: https://sourceware.org/bugzilla/show_bug.cgi?id=21269 Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/aaab0f9f93c9af25396f01232608c163a760a668.1490218061.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-22bpf: fix hashmap extra_elems logicAlexei Starovoitov
In both kmalloc and prealloc mode the bpf_map_update_elem() is using per-cpu extra_elems to do atomic update when the map is full. There are two issues with it. The logic can be misused, since it allows max_entries+num_cpus elements to be present in the map. And alloc_extra_elems() at map creation time can fail percpu alloc for large map values with a warn: WARNING: CPU: 3 PID: 2752 at ../mm/percpu.c:892 pcpu_alloc+0x119/0xa60 illegal size (32824) or align (8) for percpu allocation The fixes for both of these issues are different for kmalloc and prealloc modes. For prealloc mode allocate extra num_possible_cpus elements and store their pointers into extra_elems array instead of actual elements. Hence we can use these hidden(spare) elements not only when the map is full but during bpf_map_update_elem() that replaces existing element too. That also improves performance, since pcpu_freelist_pop/push is avoided. Unfortunately this approach cannot be used for kmalloc mode which needs to kfree elements after rcu grace period. Therefore switch it back to normal kmalloc even when full and old element exists like it was prior to commit 6c9059817432 ("bpf: pre-allocate hash map elements"). Add tests to check for over max_entries and large map values. Reported-by: Dave Jones <davej@codemonkey.org.uk> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-21selftests/bpf: fix broken build, take 2Zi Shen Lim
Merge of 'linux-kselftest-4.11-rc1': 1. Partially removed use of 'test_objs' target, breaking force rebuild of BPFOBJ, introduced in commit d498f8719a09 ("bpf: Rebuild bpf.o for any dependency update"). Update target so dependency on BPFOBJ is restored. 2. Introduced commit 2047f1d8ba28 ("selftests: Fix the .c linking rule") which fixes order of LDLIBS. Commit d02d8986a768 ("bpf: Always test unprivileged programs") added libcap dependency into CFLAGS. Use LDLIBS instead to fix linking of test_verifier. 3. Introduced commit d83c3ba0b926 ("selftests: Fix selftests build to just build, not run tests"). Reordering the Makefile allows us to remove the 'all' target. Tested both: selftests/bpf$ make and selftests$ make TARGETS=bpf on Ubuntu 16.04.2. Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-21perf pmu: Special case uncore_ prefixAndi Kleen
Special case uncore_ prefix in PMU match, to allow for shorter event uncore specifications. Before: perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 After perf stat -a -e cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 Committer tests: # perf list uncore List of pre-defined events (to be used in -e): uncore_cbox_0/clockticks/ [Kernel PMU event] uncore_cbox_1/clockticks/ [Kernel PMU event] uncore_imc/data_reads/ [Kernel PMU event] uncore_imc/data_writes/ [Kernel PMU event] # perf stat -a -e cbox_0/clockticks/ sleep 1 Performance counter stats for 'system wide': 281,474,976,653,084 cbox_0/clockticks/ 1.000870129 seconds time elapsed # Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20170320201711.14142-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf pmu: Expand PMU events by prefix matchAndi Kleen
When the user specifies a pmu directly, expand it automatically with a prefix match for all available PMUs, similar as we do for the normal aliases now. This allows to specify attributes for duplicated boxes quickly. For example uncore_cbox_{0,6}/.../ can be now specified as uncore_cbox/.../ and it gets automatically expanded for all boxes. This generally makes it more concise to write uncore specifications, and also avoids the need to know the exact topology of the system. Before: % perf stat -a -e uncore_cbox_0/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_1/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_2/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_3/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_4/event=0x35,umask=0x1,filter_opc=0x19C/,\ uncore_cbox_5/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 After: % perf stat -a -e uncore_cbox/event=0x35,umask=0x1,filter_opc=0x19C/ sleep 1 v2: Handle all bison rules. Move multi add code to separate function. Handle uncore_ prefix correctly. v3: Move parse_events_multi_pmu_add to separate patch. Move uncore prefix check to separate patch. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-6-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf tools: Factor out PMU matching in parserAndi Kleen
Factor out the PMU name matching in the event parser into a separate function, to use the same code for other grammar rules later. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf stat: Handle partially bad results with mergingAndi Kleen
When any result that is being merged is bad, mark them all bad to give consistent output in interval mode. No before/after, because the issue was only found in theoretical review and it is hard to reproduce Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf stat: Collapse identically named eventsAndi Kleen
The uncore PMU has a lot of duplicated PMUs for different subsystems. When expanding an uncore alias we usually end up with a large number of identically named aliases, which makes perf stat output difficult to read. Automatically sum them up in perf stat, unless --no-merge is specified. This can be default because only the uncores generally have duplicated aliases. Other PMUs have unique names. Before: % perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1 Performance counter stats for 'system wide': 694,976 Bytes unc_c_llc_lookup.any 706,304 Bytes unc_c_llc_lookup.any 956,608 Bytes unc_c_llc_lookup.any 782,720 Bytes unc_c_llc_lookup.any 605,696 Bytes unc_c_llc_lookup.any 442,816 Bytes unc_c_llc_lookup.any 659,328 Bytes unc_c_llc_lookup.any 509,312 Bytes unc_c_llc_lookup.any 263,936 Bytes unc_c_llc_lookup.any 592,448 Bytes unc_c_llc_lookup.any 672,448 Bytes unc_c_llc_lookup.any 608,640 Bytes unc_c_llc_lookup.any 641,024 Bytes unc_c_llc_lookup.any 856,896 Bytes unc_c_llc_lookup.any 808,832 Bytes unc_c_llc_lookup.any 684,864 Bytes unc_c_llc_lookup.any 710,464 Bytes unc_c_llc_lookup.any 538,304 Bytes unc_c_llc_lookup.any 1.002577660 seconds time elapsed After: % perf stat -a -e unc_c_llc_lookup.any sleep 1 Performance counter stats for 'system wide': 2,685,120 Bytes unc_c_llc_lookup.any 1.002648032 seconds time elapsed v2: Split collect_aliases. Rename alias flag. v3: Make sure unsupported/not counted is always printed. v4: Factor out callback change into separate patch. v5: Move check for bad results here Move merged check into collect_data Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf stat: Factor out callback for collecting event valuesAndi Kleen
To be used in next patch to support automatic summing of alias events. v2: Move check for bad results to next patch v3: Remove trivial addition. v4: Use perf_evsel__cpus instead of evsel->cpus Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf annotate: Add comment clarifying how the source code line is parsedArnaldo Carvalho de Melo
The source code line number (lineno) needs to be kept in accross calls to symbol__parse_objdump_line() when parsing the output of 'objdump -l -dS', so that it can associate it with the instructions till the next line. See disasm_line__new() and struct disasm_line::line_nr. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-7hpx8f8ybdpiujceysaj229w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf annotate: More exactly grep -v of the objdump commandTaeung Song
The 'grep -v "filename"' applied to the objdump command output cause a side effect eliminating filename:linenr of output of 'objdump -l' if the object file name and source file name are the same, fix it. E.g. the output of the following objdump command in symbol__disassemble(): $ objdump -l -d -S -C /home/taeung/hello --start-address=... /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:4 void main() { 400526: 55 push %rbp 400527: 48 89 e5 mov %rsp,%rbp /home/taeung/hello.c:5 ... But it uses grep -v "filename" e.g. "/home/taeung/hello" in the objdump command to remove the first line containing file name and file format ("/home/taeung/hello: file format elf64-x86-64"): Before: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello But this causes a side effect, removing filename:linenr too, because the object file and source file have the same name e.g. "/home/taueng/hello", "/home/taeung/hello.c" So more do a better match by using grep -v as below to correctly remove that first line: "/home/taeung/hello: file format elf64-x86-64" After: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1489978617-31396-5-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf sdt x86: Add renaming logic for rNN and other registersRavi Bangoria
'perf probe' is failing for sdt markers whose arguments has rNN (with postfix b/w/d), %rsp, %esp, %sil etc. registers. Add renaming logic for these registers. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170202111143.14319-3-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf probe: Add sdt probes arguments into the uprobe cmd stringAlexis Berlemont
An sdt probe can be associated with arguments but they were not passed to the user probe tracing interface (uprobe_events); this patch adapts the sdt argument descriptors according to the uprobe input format. As the uprobe parser does not support scaled address mode, perf will skip arguments which cannot be adapted to the uprobe format. Here are the results: $ perf buildid-cache -v --add test_sdt $ perf probe -x test_sdt sdt_libfoo:table_frob $ perf probe -x test_sdt sdt_libfoo:table_diddle $ perf record -e sdt_libfoo:table_frob -e sdt_libfoo:table_diddle test_sdt $ perf script test_sdt ... 666.255678: sdt_libfoo:table_frob: (4004d7) arg0=0 arg1=0 test_sdt ... 666.255683: sdt_libfoo:table_diddle: (40051a) arg0=0 arg1=0 test_sdt ... 666.255686: sdt_libfoo:table_frob: (4004d7) arg0=1 arg1=2 test_sdt ... 666.255689: sdt_libfoo:table_diddle: (40051a) arg0=3 arg1=4 test_sdt ... 666.255692: sdt_libfoo:table_frob: (4004d7) arg0=2 arg1=4 test_sdt ... 666.255694: sdt_libfoo:table_diddle: (40051a) arg0=6 arg1=8 Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20161214000732.1710-3-alexis.berlemont@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf sdt: Add scanning of sdt probes argumentsAlexis Berlemont
During a "perf buildid-cache --add" command, the section ".note.stapsdt" of the "added" binary is scanned in order to list the available SDT markers available in a binary. The parts containing the probes arguments were left unscanned. The whole section is now parsed; the probe arguments are extracted for later use. Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20161214000732.1710-2-alexis.berlemont@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf probe: Return errno when not hitting any eventKefeng Wang
On old perf, when using 'perf probe -d' to delete an inexistent event, it returns errno, eg, -bash-4.3# perf probe -d xxx || echo $? Info: Event "*:xxx" does not exist. Error: Failed to delete events. 255 But now perf_del_probe_events() will always set ret = 0, different from previous del_perf_probe_events(). After this, it returns errno again, eg, -bash-4.3# ./perf probe -d xxx || echo $? "xxx" does not hit any event. Error: Failed to delete events. 254 And it is more appropriate to return -ENOENT instead of -EPERM. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: dddc7ee32fa1 ("perf probe: Fix an error when deleting probes successfully") Link: http://lkml.kernel.org/r/1489738592-61011-1-git-send-email-wangkefeng.wang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-21perf probe: Change MAX_CMDLENRavi Bangoria
There are many SDT markers in powerpc whose uprobe definition goes beyond current MAX_CMDLEN, especially when target filename is long and sdt marker has long list of arguments. For example, definition of sdt marker method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28) from file /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so is p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\ 1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\ arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\ arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\ arg9=+37(%gpr28):u8 'perf probe' fails with segfault for such markers. As the uprobe_events file accepts definitions up to 4094 characters(4096 - 2 (\n\0)), increase value of MAX_CMDLEN match that. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170207054547.3690-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20tools headers: Sync {tools/,}arch/powerpc/include/uapi/asm/kvm.hArnaldo Carvalho de Melo
The changes in the following csets are not relevant for what is used in tools/perf/arch/powerpc/util/kvm-stat.c, but lets sync it to silence the diff detector in the tools build system: c92701322711 ("KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMU") 17d48610ae0f ("KVM: PPC: Book 3S: XICS: Implement ICS P/Q states") Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Scott Wood <scottwood@freescale.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-nsqxpyzcv4ywesikhhhrgfgc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20perf probe: Fix concat_probe_trace_eventsRavi Bangoria
'*ntevs' contains number of elements present in 'tevs' array. If there are no elements in array, 'tevs2' can be directly assigned to 'tevs' without allocating more space. So the condition should be '*ntevs == 0' not 'ntevs == 0'. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 42bba263eb58 ("perf probe: Allow wildcard for cached events") Link: http://lkml.kernel.org/r/20170308065908.4128-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20perf stat: Correct --no-aggr descriptionRavi Bangoria
Description of --no-aggr in perf-stat man page is outdated. --no-aggr can also be used while profiling specific set of cpus. For ex, $ perf stat -e cycles,instructions -C 1-2 --no-aggr -- sleep 1 Performance counter stats for 'CPU(s) 1-2': CPU1 5,94,92,795 cycles CPU2 2,69,72,403 cycles CPU1 2,02,08,327 instructions # 0.34 insn per cycle CPU2 73,17,123 instructions # 0.12 insn per cycle 1.000989132 seconds time elapsed Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1490013438-5713-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-20tools headers: Sync {tools/,}arch/arm{64}/include/uapi/asm/kvm.hArnaldo Carvalho de Melo
The changes in the following csets are not relevant for 'perf kvm' usage but lets sync it to silence the diff detector in the tools build system: e96a006cb066 ("KVM: arm/arm64: vgic: Implement KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO ioctl") d017d7b0bd7a ("KVM: arm/arm64: vgic: Implement VGICv3 CPU interface access") 94574c9488e2 ("KVM: arm/arm64: vgic: Add distributor and redistributor access") Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Vijaya Kumar K <Vijaya.Kumar@cavium.com> Cc: Yunlong Song <yunlong.song@huawei.com> Link: http://lkml.kernel.org/n/tip-nsqxpyzcv4ywesikhhhrgfgc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of perf related fixes: - fix a CR4.PCE propagation issue caused by usage of mm instead of active_mm and therefore propagated the wrong value. - perf core fixes, which plug a use-after-free issue and make the event inheritance on fork more robust. - a tooling fix for symbol handling" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf symbols: Fix symbols__fixup_end heuristic for corner cases x86/perf: Clarify why x86_pmu_event_mapped() isn't racy x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm perf/core: Better explain the inherit magic perf/core: Simplify perf_event_free_task() perf/core: Fix event inheritance on fork() perf/core: Fix use-after-free in perf_release()
2017-03-17tools headers: Sync {tools/,}arch/x86/include/asm/cpufeatures.hArnaldo Carvalho de Melo
We use those in tools/arch/x86/lib/mem{cpy,set}_64.S, in turn used in the 'perf bench mem' benchmarks. The changes in the following csets are not relevant for this usecase, but lets sync it to silence the diff detector in the tools build system: 6fb895692a03 ("x86/cpufeature: Add 5-level paging detection") Link: http://lkml.kernel.org/n/tip-nsqxpyzcv4ywesikhhhrgfgc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17perf tools: Handle partial AUX records and print a warningAlexander Shishkin
This patch decodes the 'partial' flag in AUX records and prints a warning to the user, so that they don't have to guess why their PT traces contain gaps (or missing altogether): Warning: AUX data had gaps in it 8 times out of 8! Are you running a KVM guest in the background? Trying to be even more helpful, we will detect if the user's kvm driver sets up exclusive VMX root mode for the entire lifespan of the kvm process: Reloading kvm_intel module with vmm_exclusive=0 will reduce the gaps to only guest's timeslices. Note however, that you'll still have gaps in cpu-wide traces even with vmm_exclusive=0, but the number of gaps will be below 100% (as opposed to the above example). Currently this is the only reason for partial records. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17tools include: Sync {,tools/}include/uapi/linux/perf_event.hAlexander Shishkin
To get PERF_AUX_FLAG_PARTIAL, introduced in: ae0c2d995d64 ("perf/core: Add a flag for partial AUX records") and that will be used to warn the user about gaps in AUX records due to VMX being used in KVM guests. Silences the kernel/tools file copy detector: Warning: include/uapi/linux/perf_event.h differs from kernel Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.com [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17tools lib api fs: Introduce sysfs__read_boolAlexander Shishkin
Will be used in a upcoming patch warning about PERF_RECORD_AUX data gaps, reading the "module/kvm_intel/parameters/vmm_exclusive" sysfs entry. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17perf timechart: Use OPT_PARENT for common optionsArnaldo Carvalho de Melo
Move -T/--tasks-only and -P/--power-only options to a separate options array that then gets referenced via OPT_PARENT from the 'perf timechart' and 'perf timechart record' option arrays. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-j80lol9wj1i6556ibh48iebe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17perf lock: Make 'f' part of the common 'lock_options'Arnaldo Carvalho de Melo
All options need the -f/--force option, so move it to the array referenced via OPT_PARENT. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-unbeionpi58rioh4e9w8kp4n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17perf lock: Subcommands should include common optionsChangbin Du
When I use -i option for report subcommand, it doesn't accept it. We need add common options using OPT_PARENT macro. perf lock report -i lock_perf.data Error: unknown switch `i' Usage: perf lock report [<options>] -f, --force don't complain, do it -k, --key <acquired> key for sorting ... Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170317055342.8284-1-changbin.du@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-17perf symbols: Fix symbols__fixup_end heuristic for corner casesDaniel Borkmann
The current symbols__fixup_end() heuristic for the last entry in the rb tree is suboptimal as it leads to not being able to recognize the symbol in the call graph in a couple of corner cases, for example: i) If the symbol has a start address (f.e. exposed via kallsyms) that is at a page boundary, then the roundup(curr->start, 4096) for the last entry will result in curr->start == curr->end with a symbol length of zero. ii) If the symbol has a start address that is shortly before a page boundary, then also here, curr->end - curr->start will just be very few bytes, where it's unrealistic that we could perform a match against. Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so that we can catch such corner cases and have a better chance to find that specific symbol. It's still just best effort as the real end of the symbol is unknown to us (and could even be at a larger offset than the current range), but better than the current situation. Alexei reported that he recently run into case i) with a JITed eBPF program (these are all page aligned) as the last symbol which wasn't properly shown in the call graph (while other eBPF program symbols in the rb tree were displayed correctly). Since this is a generic issue, lets try to improve the heuristic a bit. Reported-and-Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: 2e538c4a1847 ("perf tools: Improve kernel/modules symbol lookup") Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-16Merge tag 'perf-core-for-mingo-4.12-20170316' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New features: - Add 'brstackinsn' field in 'perf script' to reuse the x86 instruction decoder used in the Intel PT code to study hot paths to samples (Andi Kleen) Kernel changes: - Default UPROBES_EVENTS to Y (Alexei Starovoitov) - Fix check for kretprobe offset within function entry (Naveen N. Rao) Infrastructure changes: - Introduce util func is_sdt_event() (Ravi Bangoria) - Make perf_event__synthesize_mmap_events() scale on older kernels where reading /proc/pid/maps is way slower than reading /proc/pid/task/pid/maps (Stephane Eranian) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-16perf script: Add 'brstackinsn' for branch stacksAndi Kleen
Implement printing instruction sequences as hex dump for branch stacks. This relies on the x86 instruction decoder used by the PT decoder to find the lengths of instructions to dump them individually. This is good enough for pattern matching. This allows to study hot paths for individual samples, together with branch misprediction and cycle count / IPC information if available (on Skylake systems). % perf record -b ... % perf script -F brstackinsn ... read_hpet+67: ffffffff9905b843 insn: 74 ea # PRED ffffffff9905b82f insn: 85 c9 ffffffff9905b831 insn: 74 12 ffffffff9905b833 insn: f3 90 ffffffff9905b835 insn: 48 8b 0f ffffffff9905b838 insn: 48 89 ca ffffffff9905b83b insn: 48 c1 ea 20 ffffffff9905b83f insn: 39 f2 ffffffff9905b841 insn: 89 d0 ffffffff9905b843 insn: 74 ea # PRED Only works when no special branch filters are specified. Occasionally the path does not reach up to the sample IP, as the LBRs may be frozen before executing a final jump. In this case we print a special message. The instruction dumper piggy backs on the existing infrastructure from the IP PT decoder. An earlier iteration of this patch relied on a disassembler, but this version only uses the existing instruction decoder. Committer note: Added hint about how to get suitable perf.data files for use with '-F brstackinsm': $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ] $ $ perf script -F brstackinsn Display of branch stack assembler requested, but non all-branch filter set Hint: run 'perf record -b ...' $ Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-16tools headers: Sync {tools/,}arch/x86/include/asm/cpufeatures.hArnaldo Carvalho de Melo
We use those in tools/arch/x86/lib/mem{cpy,set}_64.S, in turn used in the 'perf bench mem' benchmarks. The changes in the following csets are not relevant for this usecase, but lets sync it to silence the diff detector in the tools build system: 78d1b296843a ("x86/cpu: Add X86_FEATURE_CPUID") 3bba73b1b7a8 ("x86/cpufeature: Move RING3MWAIT feature to avoid conflicts") Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-nsqxpyzcv4ywesikhhhrgfgc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-16Merge branch 'linus' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-15perf tools: Make perf_event__synthesize_mmap_events() scaleStephane Eranian
This patch significantly improves the execution time of perf_event__synthesize_mmap_events() when running perf record on systems where processes have lots of threads. It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to generate each map line in the maps file. If you have 1000 threads, then you have necessarily 1000 stacks. For each vma, you need to check if it corresponds to a thread's stack. With a large number of threads, this can take a very long time. I have seen latencies >> 10mn. As of today, perf does not use the fact that a mapping is a stack, therefore we can work around the issue by using /proc/pid/tasks/pid/maps. This entry does not try to map a vma to stack and is thus much faster with no loss of functonality. The proc-map-timeout logic is kept in case users still want some upper limit. In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual /proc/pid/task/pid/maps, tasks -> task. Thanks Arnaldo for catching this. Committer note: This problem seems to have been elliminated in the kernel since commit : b18cb64ead40 ("fs/proc: Stop trying to report thread stacks"). Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-15perf probe: Introduce util func is_sdt_event()Ravi Bangoria
Factor out the SDT event name checking routine as is_sdt_event(). Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170314150658.7065-2-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Ensure that mtu is at least IPV6_MIN_MTU in ipv6 VTI tunnel driver, from Steffen Klassert. 2) Fix crashes when user tries to get_next_key on an LPM bpf map, from Alexei Starovoitov. 3) Fix detection of VLAN fitlering feature for bnx2x VF devices, from Michal Schmidt. 4) We can get a divide by zero when TCP socket are morphed into listening state, fix from Eric Dumazet. 5) Fix socket refcounting bugs in skb_complete_wifi_ack() and skb_complete_tx_timestamp(). From Eric Dumazet. 6) Use after free in dccp_feat_activate_values(), also from Eric Dumazet. 7) Like bonding team needs to use ETH_MAX_MTU as netdev->max_mtu, from Jarod Wilson. 8) Fix use after free in vrf_xmit(), from David Ahern. 9) Don't do UDP Fragmentation Offload on IPComp ipsec packets, from Alexey Kodanev. 10) Properly check napi_complete_done() return value in order to decide whether to re-enable IRQs or not in amd-xgbe driver, from Thomas Lendacky. 11) Fix double free of hwmon device in marvell phy driver, from Andrew Lunn. 12) Don't crash on malformed netlink attributes in act_connmark, from Etienne Noss. 13) Don't remove routes with a higher metric in ipv6 ECMP route replace, from Sabrina Dubroca. 14) Don't write into a cloned SKB in ipv6 fragmentation handling, from Florian Westphal. 15) Fix routing redirect races in dccp and tcp, basically the ICMP handler can't modify the socket's cached route in it's locked by the user at this moment. From Jon Maxwell. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (108 commits) qed: Enable iSCSI Out-of-Order qed: Correct out-of-bound access in OOO history qed: Fix interrupt flags on Rx LL2 qed: Free previous connections when releasing iSCSI qed: Fix mapping leak on LL2 rx flow qed: Prevent creation of too-big u32-chains qed: Align CIDs according to DORQ requirement mlxsw: reg: Fix SPVMLR max record count mlxsw: reg: Fix SPVM max record count net: Resend IGMP memberships upon peer notification. dccp: fix memory leak during tear-down of unsuccessful connection request tun: fix premature POLLOUT notification on tun devices dccp/tcp: fix routing redirect race ucc/hdlc: fix two little issue vxlan: fix ovs support net: use net->count to check whether a netns is alive or not bridge: drop netfilter fake rtable unconditionally ipv6: avoid write to a possibly cloned skb net: wimax/i2400m: fix NULL-deref at probe isdn/gigaset: fix NULL-deref at probe ...
2017-03-14perf powerpc: Choose local entry point with kretprobesNaveen N. Rao
perf now uses an offset from _text/_stext for kretprobes if the kernel supports it, rather than the actual function name. As such, let's choose the LEP for powerpc ABIv2 so as to ensure the probe gets hit. Do it only if the kernel supports specifying offsets with kretprobes. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/7445b5334673ef5404ac1d12609bad4d73d2b567.1488961018.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>