summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2016-02-24perf hists browser: Implement hierarchy outputNamhyung Kim
Implement hierarchy mode in TUI. The output is look like stdio but it also supports to fold/unfold children dynamically. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-14-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists browser: Support collapsing/expanding whole entries in hierarchyNamhyung Kim
The 'C' and 'E' keys are to collapse/expand all hist entries. Update nr_hierarchy_entries properly in this case. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-13-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists browser: Count number of hierarchy entriesNamhyung Kim
Add nr_hierarchy_entries field to keep current number of (unfolded) hist entries. And the hist_entry->nr_rows carries number of direct children. But in the hierarchy mode, entry can have grand children and callchains. So update the number properly using hierarchy_count_rows() when toggling the folded state (by pressing ENTER key). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-12-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf ui/stdio: Align column header for hierarchy outputNamhyung Kim
The hierarchy output mode is to group entries so the existing columns won't fit to the new output. Treat all sort keys as a single column and separate headers by "/". # Overhead Command / Shared Object # ........... ................................ # 15.11% swapper 14.97% [kernel.vmlinux] 0.09% [libahci] 0.05% [iwlwifi] ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-11-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf ui/stdio: Implement hierarchy output modeNamhyung Kim
The hierarchy output mode is to group entries for each level so that user can see higher level picture more easily. It also helps to find out which component is most costly. The output will look like below: 15.11% swapper 14.97% [kernel.vmlinux] 0.09% [libahci] 0.05% [iwlwifi] 10.29% irq/33-iwlwifi 6.45% [kernel.vmlinux] 1.41% [mac80211] 1.15% [iwldvm] 1.14% [iwlwifi] 0.14% [cfg80211] 4.81% firefox 3.92% libxul.so 0.34% [kernel.vmlinux] Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-10-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Count number of sort keysNamhyung Kim
It'll be used for hierarchy output mode to indent entries properly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-9-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Resort after filtering hierarchyNamhyung Kim
In hierarchy mode, a filter can affect periods of entries in upper hierarchy. So it needs to resort the hists after filter. For example, let's look at following example: Overhead Command / Shared Object / Symbol ------------ -------------------------------- 30.00% perf 20.00% perf 10.00% main 5.00% pr_debug 5.00% memcpy 10.00% [kernel.vmlinux] 8.00% memset 2.00% cpu_idle If we apply simbol filter for 'mem' it should look like this 13.00% perf 8.00% [kernel.vmlinux] 8.00% memset 5.00% perf 5.00% memcpy Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Support filtering in hierarchy modeNamhyung Kim
The hists__filter_hierarchy() function implements filtering in hierarchy mode. Now we have hist_entry__filter() so use it for entries in the hierarchy. It returns 3 kind of values. A negative value means that it's not filtered by this type. It marks current entry as filtered tentatively so if a lower level entry removes the filter it also removes the all parent so that we can find the entry in the output. Zero means it's filtered out by this type. A positive value means it's not filtered so it removes the filter and shows in the output. In these cases, it moves to next entry since lower level entry won't match by this type of filter anymore. Thus all children will be filtered or not together. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Introduce hist_entry__filter()Namhyung Kim
The hist_entry__filter() function is to filter hist entries using sort key related info. This is needed to support hierarchy mode since each hist entry will be associated with a hpp fmt which has a sort key. So each entry should compare to only matching type of filters. To do that, add the ->se_filter callback field to struct sort_entry. This callback takes 'type' argument which determines whether it's matching sort key or not. It returns -1 for non-matching type, 0 for filtered entry and 1 for not filtered entries. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-6-git-send-email-namhyung@kernel.org [ 'socket' is reserved in sys/socket.h, so replace it with 'sk' ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Add helper functions for hierarchy modeNamhyung Kim
The rb_hierarchy_{next,prev,last} functions are to traverse all hist entries in a hierarchy. They will be used by various function which supports hierarchy output. As the rb_hierarchy_next() is used to traverse the whole hierarchy, it sometime needs to visit entries regardless of current folding state. So add enum hierarchy_move_dir and pass it to __rb_hierarchy_next() for those cases. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Resort hist entries with hierarchyNamhyung Kim
For hierarchical output, each entry must be sorted in their rbtree (hroot) properly. Add hists__hierarchy_output_resort() to do the job. Note that those hierarchy entries share the period counts, it'd be important to update the hists->stats only once (for leaves). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf hists: Basic support of hierarchical report viewNamhyung Kim
In the hierarchical view, entries will be grouped and sorted on the first key, and then on the second key, and so on. Add the he->hroot_{in,out} fields to keep the lower level entries. Actually this can share space, in a union, with callchain's 'sorted_root' since the hroots are only used by non-leaf entries and callchain is only used by leaf entries. It also adds the 'parent_he' and 'depth' fields which can be used by browsers. This patch only implements collapsing part which creates internal entries for each sort key. These need to be sorted by output_sort stage and to be displayed properly in the later patch(es). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Add helper functions for some sort keysNamhyung Kim
The 'trace', 'srcline' and 'srcfile' sort keys updates hist entry's field later. With the hierarchy mode, those fields are passed to a matching entry so it needs to identify the sort keys. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf script: Print bpf-output events in 'perf script'Wang Nan
This patch allows 'perf script' output messages from BPF program. For example, use test_bpf_output_3.c at the end of this commit message, # ./perf record -e bpf-output/no-inherit,name=evt/ \ -e ./test_bpf_output_3.c/map:channel.event=evt/ \ usleep 100000 # ./perf script usleep 4882 21384.532523: evt: ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms]) BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" usleep 4882 21384.632606: evt: ffffffff8105c609 kretprobe_trampoline_holder ([kernel.kallsyms BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" Two samples from BPF output are printed by both binary and string format. If BPF program output something unprintable, string format is suppressed. /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { char output_str[] = "Raise a BPF event!"; perf_event_output(ctx, &channel, get_smp_processor_id(), &output_str, sizeof(output_str)); return 0; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456312845-111583-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Make binary data printer code in trace_event public availableWang Nan
Move code printing binray data from trace_event() to utils.c and allows passing different printer. Further commits will use this logic to print bpf output event. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456312845-111583-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf script: Display data_src valuesJiri Olsa
Adding support to display data_src values, for events with data_src data in sample. Example: $ perf script ... rcuos/3 32 [002] ... 68501042 Local RAM hit|SNP None or Hit|TLB L1 or L2 hit|LCK No ... rcuos/3 32 [002] ... 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No ... swapper 0 [002] ... 68100242 LFB hit|SNP None|TLB L1 or L2 hit|LCK No ... swapper 0 [000] ... 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No ... swapper 0 [000] ... 50100142 L1 hit|SNP None|TLB L2 miss|LCK No ... rcuos/3 32 [002] ... 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No ... plugin-containe 16538 [000] ... 6a100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes ... gkrellm 1736 [000] ... 68100242 LFB hit|SNP None|TLB L1 or L2 hit|LCK No ... gkrellm 1736 [000] ... 6a100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes ... ^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ data_src value data_src translation Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-14-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Change perf_mem__lck_scnprintf to return nb of displayed bytesJiri Olsa
Moving strncat call into scnprintf to easily track number of displayed bytes. It will be used in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-13-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Change perf_mem__snp_scnprintf to return nb of displayed bytesJiri Olsa
Moving strncat/strcpy calls into scnprintf to easily track number of displayed bytes. It will be used in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-12-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Change perf_mem__lvl_scnprintf to return nb of displayed bytesJiri Olsa
Moving strncat/strcpy calls into scnprintf to easily track number of displayed bytes. It will be used in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-11-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Change perf_mem__tlb_scnprintf to return nb of displayed bytesJiri Olsa
Moving strncat/strcpy calls into scnprintf to easily track number of displayed bytes. It will be used in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-10-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Introduce perf_mem__lck_scnprintf functionJiri Olsa
Move meminfo's lck display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-9-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Introduce perf_mem__snp_scnprintf functionJiri Olsa
Move meminfo's snp display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Introduce perf_mem__lvl_scnprintf functionJiri Olsa
Move meminfo's lvl display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf tools: Introduce perf_mem__tlb_scnprintf functionJiri Olsa
Move meminfo's tlb display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf mem: Introduce perf_mem_events__name functionJiri Olsa
Wrap perf_mem_events[].name into perf_mem_events__name() so we could alter the events name if needed. This will be handy when changing latency settings for loads event in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-24perf mem record: Check for memory events supportJiri Olsa
Check if current kernel support available memory events and display the status within -e list option: $ perf mem record -e list ldlat-loads : available ldlat-stores : available Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf tools: Remove strbuf_{remove,splice}()Arnaldo Carvalho de Melo
No users, nuke them. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kfv2wo8xann8t97wdalttcx7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf help: No need to use strbuf_remove()Arnaldo Carvalho de Melo
It is the only user of this function, just use the strlen() to skip the prefix. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-blao710l5cd5hmwrhy51ftgq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf tools: Dont stop PMU parsing on alias parse errorAndi Kleen
When an error happens during alias parsing currently the complete parsing of all attributes of the PMU is stopped. This is breaks old perf on a newer kernel that may have not-yet-know alias attributes (such as .scale or .per-pkg). Continue when some attribute is unparseable. This is IMHO a stable candidate and should be backported to older versions to avoid problems with newer kernels. v2: Print warnings when something goes wrong. v3: Change warning to debug output Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org # v3.6+ Link: http://lkml.kernel.org/r/1455749095-18358-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf script: Display addr/data_src/weight columns for raw eventsJiri Olsa
Adding addr/data_src/weight columns for raw events. Example: $ perf script ... true 11883 322960.489590: ... ffff8801aa0b8400 68501042 246 ffffffff813b2cd true 11883 322960.489600: ... ffff8800b90b38d8 68501042 251 ffffffff811d0b7 true 11883 322960.489612: ... ffff880196893130 6a100142 94 ffffffff8177fb8 true 11883 322960.489637: ... ffff880164277b40 68100842 101 ffffffff813b2cd true 11883 322960.489683: ... ffff880035d3d818 68501042 201 ffffffff811d0b7 true 11883 322960.489733: ... 7fb9616efcf0 68100242 199 7fb961aaba9 true 11883 322960.489818: ... ffffea000481c39c 6a100142 122 ffffffff811b634 ^^^^^^^^^^^^^^^^ ^^^^^^^^ ^^^ addr data_src weight Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-23-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf script: Add data_src and weight column definitionsJiri Olsa
Adding data_src and weight column definitions, so it's displayed for related sample types. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-22-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf tools: Use ARRAY_SIZE in mem sort display functionsJiri Olsa
There's no need to define extra macros for that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-13-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf mem: Add -e record optionJiri Olsa
Adding -e option for perf mem record command, to be able to specify memory event directly. Get list of available events: $ perf mem record -e list ldlat-loads ldlat-stores Monitor ldlat-loads: $ perf mem record -e ldlat-loads true Committer notes: Further testing: # perf mem record -e ldlat-loads true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data (10 samples) ] # perf evlist cpu/mem-loads,ldlat=30/P # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf tools: Add monitored events arrayJiri Olsa
It will ease up configuration of memory events and addition of other memory events in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf tools: Introduce cl_offset functionJiri Olsa
It'll be used in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-23perf tools: Make cl_address globalJiri Olsa
It'll be used in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Introduce bpf-output eventWang Nan
Commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adds a helper to enable a BPF program to output data to a perf ring buffer through a new type of perf event, PERF_COUNT_SW_BPF_OUTPUT. This patch enables perf to create events of that type. Now a perf user can use the following cmdline to receive output data from BPF programs: # perf record -a -e bpf-output/no-inherit,name=evt/ \ -e ./test_bpf_output.c/map:channel.event=evt/ ls / # perf script perf 1560 [004] 347747.086295: evt: ffffffff811fd201 sys_write ... perf 1560 [004] 347747.086300: evt: ffffffff811fd201 sys_write ... perf 1560 [004] 347747.086315: evt: ffffffff811fd201 sys_write ... ... Test result: # cat test_bpf_output.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; SEC("func_write=sys_write") int func_write(void *ctx) { struct { u64 ktime; int cpuid; } __attribute__((packed)) output_data; char error_data[] = "Error: failed to output: %d\n"; output_data.cpuid = get_smp_processor_id(); output_data.ktime = ktime_get_ns(); int err = perf_event_output(ctx, &channel, get_smp_processor_id(), &output_data, sizeof(output_data)); if (err) trace_printk(error_data, sizeof(error_data), err); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************ END ***************************/ # perf record -a -e bpf-output/no-inherit,name=evt/ \ -e ./test_bpf_output.c/map:channel.event=evt/ ls / # perf script | grep ls ls 2242 [003] 347851.557563: evt: ffffffff811fd201 sys_write ... ls 2242 [003] 347851.557571: evt: ffffffff811fd201 sys_write ... Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-11-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Apply tracepoint event definition options to BPF scriptWang Nan
Users can pass options to tracepoints defined in the BPF script. For example: # perf record -e ./test.c/no-inherit/ bash # dd if=/dev/zero of=/dev/null count=10000 # exit [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ] (no-inherit works, only the sys_read issued by bash are captured, at least 10000 sys_read issued by dd are skipped.) test.c: #define SEC(NAME) __attribute__((section(NAME), used)) SEC("func=sys_read") int bpf_func__sys_read(void *ctx) { return 1; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; no-inherit is applied to the kprobe event defined in test.c. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-10-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Enable indices setting syntax for BPF mapWang Nan
This patch introduces a new syntax to perf event parser: # perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2 By utilizing the basic facilities in bpf-loader.c which allow setting different slots in a BPF map separately, the newly introduced syntax allows perf to control specific elements in a BPF map. Test result: # cat ./test_bpf_map_3.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static void *(*map_lookup_elem)(struct bpf_map_def *, void *) = (void *)BPF_FUNC_map_lookup_elem; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(int), .value_size = sizeof(unsigned char), .max_entries = 100, }; SEC("func=hrtimer_nanosleep rqtp->tv_nsec") int func(void *ctx, int err, long nsec) { char fmt[] = "%ld\n"; long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000 int key = (int)usec; unsigned char *pval = map_lookup_elem(&channel, &key); if (!pval) return 0; trace_printk(fmt, sizeof(fmt), (unsigned char)*pval); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Normal case: # echo "" > /sys/kernel/debug/tracing/trace # ./perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace | grep usleep usleep-405 [004] d... 2745423.547822: : 101 # ./perf record -e './test_bpf_map_3.c/map:channel.value[0...9,20...29]=102,map:channel.value[10...19]=103/' usleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # ./perf record -e './test_bpf_map_3.c/map:channel.value[0...9,20...29]=102,map:channel.value[10...19]=103/' usleep 15 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace | grep usleep usleep-405 [004] d... 2745423.547822: : 101 usleep-655 [006] d... 2745434.122814: : 102 usleep-904 [006] d... 2745439.916264: : 103 # ./perf record -e './test_bpf_map_3.c/map:channel.value[all]=104/' usleep 99 # cat /sys/kernel/debug/tracing/trace | grep usleep usleep-405 [004] d... 2745423.547822: : 101 usleep-655 [006] d... 2745434.122814: : 102 usleep-904 [006] d... 2745439.916264: : 103 usleep-1537 [003] d... 2745538.053737: : 104 Error case: # ./perf record -e './test_bpf_map_3.c/map:channel.value[10...1000]=104/' usleep 99 event syntax error: '..annel.value[10...1000]=104/' \___ Index too large Hint: Valid config terms: map:[<arraymap>].value<indices>=[value] map:[<eventmap>].event<indices>=[event] where <indices> is something like [0,3...5] or [all] (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-9-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Support setting different slots in a BPF map separatelyWang Nan
This patch introduces basic facilities to support config different slots in a BPF map one by one. array.nr_ranges and array.ranges are introduced into 'struct parse_events_term', where ranges is an array of indices range (start, length) which will be configured by this config term. nr_ranges is the size of the array. The array is passed to 'struct bpf_map_priv'. To indicate the new type of configuration, BPF_MAP_KEY_RANGES is added as a new key type. bpf_map_config_foreach_key() is extended to iterate over those indices instead of all possible keys. Code in this commit will be enabled by following commit which enables the indices syntax for array configuration. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-8-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Enable passing event to BPF objectWang Nan
A new syntax is added to the parser so that the user can access predefined perf events in BPF objects. After this patch, BPF programs for perf are finally able to utilize bpf_perf_event_read() introduced in commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU counter"). Test result: # cat test_bpf_map_2.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_read)(struct bpf_map_def *, int) = (void *)BPF_FUNC_perf_event_read; struct bpf_map_def SEC("maps") pmu_map = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = __NR_CPUS__, }; SEC("func_write=sys_write") int func_write(void *ctx) { unsigned long long val; char fmt[] = "sys_write: pmu=%llu\n"; val = perf_event_read(&pmu_map, get_smp_processor_id()); trace_printk(fmt, sizeof(fmt), val); return 0; } SEC("func_write_return=sys_write%return") int func_write_return(void *ctx) { unsigned long long val = 0; char fmt[] = "sys_write_return: pmu=%llu\n"; val = perf_event_read(&pmu_map, get_smp_processor_id()); trace_printk(fmt, sizeof(fmt), val); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Normal case: # echo "" > /sys/kernel/debug/tracing/trace # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls / [SNIP] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ] # cat /sys/kernel/debug/tracing/trace | grep ls ls-17066 [000] d... 938449.863301: : sys_write: pmu=1157327 ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218 ls-17066 [000] d... 938449.863349: : sys_write: pmu=1241922 ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445 Normal case (system wide): # echo "" > /sys/kernel/debug/tracing/trace # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ] # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | gmain-30828 [002] d... 2740551.068992: : sys_write: pmu=84373 gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696 gmain-30828 [002] d... 2740551.068996: : sys_write: pmu=100658 gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572 Error case 1: # perf record -e './test_bpf_map_2.c' ls / [SNIP] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB perf.data ] # cat /sys/kernel/debug/tracing/trace | grep ls ls-17115 [007] d... 2724279.665625: : sys_write: pmu=18446744073709551614 ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614 ls-17115 [007] d... 2724279.665658: : sys_write: pmu=18446744073709551614 ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614 (18446744073709551614 is 0xfffffffffffffffe (-2)) Error case 2: # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=evt/' -a event syntax error: '..ps:pmu_map.event=evt/' \___ Event not found for map setting Hint: Valid config terms: map:[<arraymap>].value=[value] map:[<eventmap>].event=[event] [SNIP] Error case 3: # ls /proc/2348/task/ 2348 2505 2506 2507 2508 # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -p 2348 ERROR: Apply config to BPF failed: Cannot set event to BPF map in multi-thread tracing Error case 4: # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls / ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit) Error case 5: # perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/map:pmu_map.event=raw_syscalls:sys_enter/' ls ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map Error case 6: # perf record -i -e './test_bpf_map_2.c/map:pmu_map.event=123/' ls / event syntax error: '.._map.event=123/' \___ Incorrect value type for map [SNIP] Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-7-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf record: Apply config to BPF objects before recordingWang Nan
bpf__apply_obj_config() is introduced as the core API to apply object config options to all BPF objects. This patch also does the real work for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value stored in map's private field into the BPF map. This patch is required because we are not always able to set all BPF config during parsing. Further patch will set events created by perf to BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist until perf_evsel__open(). bpf_map_foreach_key() is introduced to iterate over each key needs to be configured. This function would be extended to support more map types and different key settings. In perf record, before start recording, call bpf__apply_config() to turn on all BPF config options. Test result: # cat ./test_bpf_map_1.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static void *(*map_lookup_elem)(struct bpf_map_def *, void *) = (void *)BPF_FUNC_map_lookup_elem; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1, }; SEC("func=sys_nanosleep") int func(void *ctx) { int key = 0; char fmt[] = "%d\n"; int *pval = map_lookup_elem(&channel, &key); if (!pval) return 0; trace_printk(fmt, sizeof(fmt), *pval); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ # echo "" > /sys/kernel/debug/tracing/trace # ./perf record -e './test_bpf_map_1.c/map:channel.value=11/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | usleep-18593 [007] d... 2394714.395539: : 11 # ./perf record -e './test_bpf_map_1.c/map:channel.value=101/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | usleep-18593 [007] d... 2394714.395539: : 11 usleep-19000 [006] d... 2394831.057840: : 101 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-6-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Enable BPF object configure syntaxWang Nan
This patch adds the final step for BPF map configuration. A new syntax is appended into parser so user can config BPF objects through '/' '/' enclosed config terms. After this patch, following syntax is available: # perf record -e ./test_bpf_map_1.c/map:channel.value=10/ ... It would takes effect after appling following commits. Test result: # cat ./test_bpf_map_1.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static void *(*map_lookup_elem)(struct bpf_map_def *, void *) = (void *)BPF_FUNC_map_lookup_elem; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1, }; SEC("func=sys_nanosleep") int func(void *ctx) { int key = 0; char fmt[] = "%d\n"; int *pval = map_lookup_elem(&channel, &key); if (!pval) return 0; trace_printk(fmt, sizeof(fmt), *pval); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ - Normal case: # ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] - Error case: # ./perf record -e './test_bpf_map_1.c/map:channel.value/' usleep 10 event syntax error: '..ps:channel:value/' \___ Config value not set (missing '=') Hint: Valid config term: map:[<arraymap>]:value=[value] (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events # ./perf record -e './test_bpf_map_1.c/xmap:channel.value=10/' usleep 10 event syntax error: '..pf_map_1.c/xmap:channel.value=10/' \___ Invalid object config option [SNIP] # ./perf record -e './test_bpf_map_1.c/map:xchannel.value=10/' usleep 10 event syntax error: '..p_1.c/map:xchannel.value=10/' \___ Target map not exist [SNIP] # ./perf record -e './test_bpf_map_1.c/map:channel.xvalue=10/' usleep 10 event syntax error: '..ps:channel.xvalue=10/' \___ Invalid object map config option [SNIP] # ./perf record -e './test_bpf_map_1.c/map:channel.value=x10/' usleep 10 event syntax error: '..nnel.value=x10/' \___ Incorrect value type for map [SNIP] Change BPF_MAP_TYPE_ARRAY to '1' in test_bpf_map_1.c: # ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10 event syntax error: '..ps:channel.value=10/' \___ Can't use this config term to this type of map Hint: Valid config term: map:[<arraymap>].value=[value] (add -v to see detail) Signed-off-by: Wang Nan <wangnan0@huawei.com> [for parser part] Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf bpf: Add API to set values to map entries in a bpf objectWang Nan
bpf__config_obj() is introduced as a core API to config BPF object after loading. One configuration option of maps is introduced. After this patch BPF object can accept assignments like: map:my_map.value=1234 (map.my_map.value looks pretty. However, there's a small but hard to fix problem related to flex's greedy matching. Please see [1]. Choose ':' to avoid it in a simpler way.) This patch is more complex than the work it does because the consideration of extension. In designing BPF map configuration, the following things should be considered: 1. Array indices selection: perf should allow user setting different value for different slots in an array, with syntax like: map:my_map.value[0,3...6]=1234; 2. A map should be set by different config terms, each for a part of it. For example, set each slot to the pid of a thread; 3. Type of value: integer is not the only valid value type. A perf counter can also be put into a map after commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU counter") 4. For a hash table, it should be possible to use a string or other value as a key; 5. It is possible that map configuration is unable to be setup during parsing. A perf counter is an example. Therefore, this patch does the following: 1. Instead of updating map element during parsing, this patch stores map config options in 'struct bpf_map_priv'. Following patches will apply those configs at an appropriate time; 2. Link map operations in a list so a map can have multiple config terms attached, so different parts can be configured separately; 3. Make 'struct bpf_map_priv' extensible so that the following patches can add new types of keys and operations; 4. Use bpf_obj_config__map_funcs array to support more map config options. Since the patch changing the event parser to parse BPF object config is relative large, I've put it in another commit. Code in this patch can be tested after applying the next patch. [1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Changes "maps:my_map.value" to "map:my_map.value", improved error messages ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Fix assertion failure on dynamic entryNamhyung Kim
The dynamic entry is created for each field in a tracepoint event. Since they have no fixed hpp format index, it should skip when perf_hpp__reset_width() is called. This caused following assertion failure.. $ perf record -e sched:sched_switch -a sleep 1 $ perf report -s comm,next_pid --stdio perf: ui/hist.c:651: perf_hpp__reset_width: Assertion `!(fmt->idx >= PERF_HPP__MAX_INDEX)' failed. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1456064558-13086-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Fix column width setting on 'trace' sort keyNamhyung Kim
It missed to update column length of the 'trace' sort key in the hists__calc_col_len() so it might truncate the output. It calculated the column length in the ->cmp() callback originally but it doesn't guarantee it's called always. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1456064558-13086-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Fix alignment on some sort keysNamhyung Kim
The srcline, srcfile and trace sort keys can have long entries. With commit 89fee7094323 ("perf hists: Do column alignment on the format iterator"), it now aligns output with hist_entry__snprintf_alignment(). So each (possibly long) sort entries don't need to do it themselves. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1456101153-14519-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Update srcline/file if neededNamhyung Kim
Normally the hist entry's srcline and/or srcfile is set during sorting. However sometime it's possible to a hist entry's srcline is not set yet after the sorting. This is because the entry is so unique and other sort keys already make it distinct. Then the srcline/file sort didn't have a chance to be called during the sorting. In that case it has NULL srcline/srcfile field and shows nothing. Before: $ perf report -s comm,sym,srcline ... Overhead Command Symbol ----------------------------------------------------------------- 34.42% swapper [k] intel_idle intel_idle.c:0 2.44% perf [.] __poll_nocancel (null) 1.70% gnome-shell [k] fw_domains_get (null) 1.04% Xorg [k] sock_poll (null) After: 34.42% swapper [k] intel_idle intel_idle.c:0 2.44% perf [.] __poll_nocancel .:0 1.70% gnome-shell [k] fw_domains_get fw_domains_get+42 1.04% Xorg [k] sock_poll socket.c:0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1456101111-14400-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Fix segfault on dynamic entriesNamhyung Kim
A dynamic entry is created for each tracepoint event. When it sets up the sort key, it checks with existing keys using ->equal() callback. But it missed to set the ->equal for dynamic entries. The following segfault was due to the missing ->equal() callback. (gdb) bt #0 0x0000000000140003 in ?? () #1 0x0000000000537769 in fmt_equal (b=0x2106980, a=0x21067a0) at ui/hist.c:548 #2 perf_hpp__setup_output_field (list=0x8c6d80 <perf_hpp_list>) at ui/hist.c:560 #3 0x00000000004e927e in setup_sorting (evlist=<optimized out>) at util/sort.c:2642 #4 0x000000000043cf50 in cmd_report (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-report.c:932 #5 0x00000000004865a1 in run_builtin (p=p@entry=0x8bbce0 <commands+192>, argc=argc@entry=7, argv=argv@entry=0x7ffd24d56ce0) at perf.c:390 #6 0x000000000042dc1f in handle_internal_command (argv=0x7ffd24d56ce0, argc=7) at perf.c:451 #7 run_argv (argv=0x7ffd24d56a70, argcp=0x7ffd24d56a7c) at perf.c:495 #8 main (argc=7, argv=0x7ffd24d56ce0) at perf.c:620 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1456064558-13086-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Remove duplicate typedef config_term_func_t definitionArnaldo Carvalho de Melo
Older compilers don't like this, for instance, on RHEL6.7: CC /tmp/build/perf/util/parse-events.o util/parse-events.c:844: error: redefinition of typedef ‘config_term_func_t’ util/parse-events.c:353: note: previous declaration of ‘config_term_func_t’ was here So remove the second definition, that should've been just moved in 43d0b97817a4 ("perf tools: Enable config and setting names for legacy cache events"), not copied. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 43d0b97817a4 ("perf tools: Enable config and setting names for legacy cache events") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>