linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-10-02	bpf/libbpf: BTF support for typed ksyms	Hao Luo
	If a ksym is defined with a type, libbpf will try to find the ksym's btf information from kernel btf. If a valid btf entry for the ksym is found, libbpf can pass in the found btf id to the verifier, which validates the ksym's type and value. Typeless ksyms (i.e. those defined as 'void') will not have such btf_id, but it has the symbol's address (read from kallsyms) and its value is treated as a raw pointer. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-3-haoluo@google.com
2020-10-02	bpf: Introduce pseudo_btf_id	Hao Luo
	Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a ksym so that further dereferences on the ksym can use the BTF info to validate accesses. Internally, when seeing a pseudo_btf_id ld insn, the verifier reads the btf_id stored in the insn[0]'s imm field and marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND, which is encoded in btf_vminux by pahole. If the VAR is not of a struct type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID and the mem_size is resolved to the size of the VAR's type. >From the VAR btf_id, the verifier can also read the address of the ksym's corresponding kernel var from kallsyms and use that to fill dst_reg. Therefore, the proper functionality of pseudo_btf_id depends on (1) kallsyms and (2) the encoding of kernel global VARs in pahole, which should be available since pahole v1.18. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-2-haoluo@google.com
2020-10-02	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-10-02 1) Add a full xfrm compatible layer for 32-bit applications on 64-bit kernels. From Dmitry Safonov. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02	bpf: selftest: Ensure the child sk inherited all bpf_sock_ops_cb_flags	Martin KaFai Lau
	This patch adds a test to ensure the child sk inherited everything from the bpf_sock_ops_cb_flags of the listen sk: 1. Sets one more cb_flags (BPF_SOCK_OPS_STATE_CB_FLAG) to the listen sk in test_tcp_hdr_options.c 2. Saves the skops->bpf_sock_ops_cb_flags when handling the newly established passive connection 3. CHECK() it is the same as the listen sk This also covers the fastopen case as the existing test_tcp_hdr_options.c does. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201002013454.2542367-1-kafai@fb.com
2020-10-02	tools/power/acpi: Serialize Makefile	Thomas Renninger
	Before this patch you get tools/power/acpi/Makefile.rules included in parallel trying to copy KERNEL_INCLUDE multiple times: make -j20 acpi DESCEND power/acpi DESCEND tools/acpidbg DESCEND tools/acpidump DESCEND tools/ec MKDIR include MKDIR include MKDIR include CP include CP include cp: cannot create directory '/home/abuild/rpmbuild/BUILD/linux-5.7.7+git20200917.10b82d517648/tools/power/acpi/include/acpi': File exists make[2]: * [../../Makefile.rules:20: /home/abuild/rpmbuild/BUILD/linux-5.7.7+git20200917.10b82d517648/tools/power/acpi/include] Error 1 make[1]: * [Makefile:16: acpidbg] Error 2 make[1]: *** Waiting for unfinished jobs.... with this patch each subdirectory will be processed serialized: DESCEND power/acpi DESCEND tools/acpidbg MKDIR include CP include CC tools/acpidbg/acpidbg.o LD acpidbg STRIP acpidbg DESCEND tools/acpidump CC tools/acpidump/apdump.o ... LD acpidump STRIP acpidump DESCEND tools/ec CC tools/ec/ec_access.o LD ec STRIP ec Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-02	tools: Avoid comma separated statements	Joe Perches
	Use semicolons and braces. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-10-02	kunit: tool: handle when .kunit exists but .kunitconfig does not	Brendan Higgins
	Right now .kunitconfig and the build dir are automatically created if the build dir does not exists; however, if the build dir is present and .kunitconfig is not, kunit_tool will crash. Fix this by checking for both the build dir as well as the .kunitconfig. NOTE: This depends on commit 5578d008d9e0 ("kunit: tool: fix running kunit_tool from outside kernel tree") Link: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?id=5578d008d9e06bb531fb3e62dd17096d9fd9c853 Signed-off-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-10-02	selftests/bpf: Properly initialize linfo in sockmap_basic	Stanislav Fomichev
	When using -Werror=missing-braces, compiler complains about missing braces. Let's use use ={} initialization which should do the job: tools/testing/selftests/bpf/prog_tests/sockmap_basic.c: In function 'test_sockmap_iter': tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:181:8: error: missing braces around initializer [-Werror=missing-braces] union bpf_iter_link_info linfo = {0}; ^ tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:181:8: error: (near initialization for 'linfo.map') [-Werror=missing-braces] tools/testing/selftests/bpf/prog_tests/sockmap_basic.c: At top level: Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201002000451.1794044-1-sdf@google.com
2020-10-02	selftests/bpf: Initialize duration in xdp_noinline.c	Stanislav Fomichev
	Fixes clang error: tools/testing/selftests/bpf/prog_tests/xdp_noinline.c:35:6: error: variable 'duration' is uninitialized when used here [-Werror,-Wuninitialized] if (CHECK(!skel, "skel_open_and_load", "failed\n")) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201001225440.1373233-1-sdf@google.com
2020-10-02	objtool: Permit __kasan_check_{read,write} under UACCESS	Jann Horn
	Building linux-next with JUMP_LABEL=n and KASAN=y, I got this objtool warning: arch/x86/lib/copy_mc.o: warning: objtool: copy_mc_to_user()+0x22: call to __kasan_check_read() with UACCESS enabled What happens here is that copy_mc_to_user() branches on a static key in a UACCESS region: __uaccess_begin(); if (static_branch_unlikely(&copy_mc_fragile_key)) ret = copy_mc_fragile(to, from, len); ret = copy_mc_generic(to, from, len); __uaccess_end(); and the !CONFIG_JUMP_LABEL version of static_branch_unlikely() uses static_key_enabled(), which uses static_key_count(), which uses atomic_read(), which calls instrument_atomic_read(), which uses kasan_check_read(), which is __kasan_check_read(). Let's permit these KASAN helpers in UACCESS regions - static keys should probably work under UACCESS, I think. PeterZ adds: It's not a matter of permitting, it's a matter of being safe and correct. In this case it is, because it's a thin wrapper around check_memory_region() which was already marked safe. check_memory_region() is correct because the only thing it ends up calling is kasa_report() and that is also marked safe because that is annotated with user_access_save/restore() before it does anything else. On top of that, all of KASAN is noinstr, so nothing in here will end up in tracing and/or call schedule() before the user_access_save(). Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-10-02	Merge branch 'for-next/mte' into for-next/core	Will Deacon
	Add userspace support for the Memory Tagging Extension introduced by Armv8.5. (Catalin Marinas and others) * for-next/mte: (30 commits) arm64: mte: Fix typo in memory tagging ABI documentation arm64: mte: Add Memory Tagging Extension documentation arm64: mte: Kconfig entry arm64: mte: Save tags when hibernating arm64: mte: Enable swap of tagged pages mm: Add arch hooks for saving/restoring tags fs: Handle intra-page faults in copy_mount_options() arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks arm64: mte: Restore the GCR_EL1 register after a suspend arm64: mte: Allow user control of the generated random tags via prctl() arm64: mte: Allow user control of the tag check mode via prctl() mm: Allow arm64 mmap(PROT_MTE) on RAM-based files arm64: mte: Validate the PROT_MTE request via arch_validate_flags() mm: Introduce arch_validate_flags() arm64: mte: Add PROT_MTE support to mmap() and mprotect() mm: Introduce arch_calc_vm_flag_bits() arm64: mte: Tags-aware aware memcmp_pages() implementation arm64: Avoid unnecessary clear_user_page() indirection ...
2020-10-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2020-10-01 The following pull-request contains BPF updates for your net-next tree. We've added 90 non-merge commits during the last 8 day(s) which contain a total of 103 files changed, 7662 insertions(+), 1894 deletions(-). Note that once bpf(/net) tree gets merged into net-next, there will be a small merge conflict in tools/lib/bpf/btf.c between commit 1245008122d7 ("libbpf: Fix native endian assumption when parsing BTF") from the bpf tree and the commit 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") from the bpf-next tree. Correct resolution would be to stick with bpf-next, it should look like: [...] /* check BTF magic / if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) { err = -EIO; goto err_out; } if (magic != BTF_MAGIC && magic != bswap_16(BTF_MAGIC)) { / definitely not a raw BTF / err = -EPROTO; goto err_out; } / get file size */ [...] The main changes are: 1) Add bpf_snprintf_btf() and bpf_seq_printf_btf() helpers to support displaying BTF-based kernel data structures out of BPF programs, from Alan Maguire. 2) Speed up RCU tasks trace grace periods by a factor of 50 & fix a few race conditions exposed by it. It was discussed to take these via BPF and networking tree to get better testing exposure, from Paul E. McKenney. 3) Support multi-attach for freplace programs, needed for incremental attachment of multiple XDP progs using libxdp dispatcher model, from Toke Høiland-Jørgensen. 4) libbpf support for appending new BTF types at the end of BTF object, allowing intrusive changes of prog's BTF (useful for future linking), from Andrii Nakryiko. 5) Several BPF helper improvements e.g. avoid atomic op in cookie generator and add a redirect helper into neighboring subsys, from Daniel Borkmann. 6) Allow map updates on sockmaps from bpf_iter context in order to migrate sockmaps from one to another, from Lorenz Bauer. 7) Fix 32 bit to 64 bit assignment from latest alu32 bounds tracking which caused a verifier issue due to type downgrade to scalar, from John Fastabend. 8) Follow-up on tail-call support in BPF subprogs which optimizes x64 JIT prologue and epilogue sections, from Maciej Fijalkowski. 9) Add an option to perf RB map to improve sharing of event entries by avoiding remove- on-close behavior. Also, add BPF_PROG_TEST_RUN for raw_tracepoint, from Song Liu. 10) Fix a crash in AF_XDP's socket_release when memory allocation for UMEMs fails, from Magnus Karlsson. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01	perf python scripting: Fix printable strings in python3 scripts	Jiri Olsa
	Hagen reported broken strings in python3 tracepoint scripts: make PYTHON=python3 perf record -e sched:sched_switch -a -- sleep 5 perf script --gen-script py perf script -s ./perf-script.py [..] sched__sched_switch 7 563231.759525792 0 swapper prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), prev_pid=0, prev_prio=120, prev_state=, next_comm=bytearray(b'mutex-thread-co\x00'), The problem is in the is_printable_array function that does not take the zero byte into account and claim such string as not printable, so the code will create byte array instead of string. Committer testing: After this fix: sched__sched_switch 3 484522.497072626 1158680 kworker/3:0-eve prev_comm=kworker/3:0, prev_pid=1158680, prev_prio=120, prev_state=I, next_comm=swapper/3, next_pid=0, next_prio=120 Sample: {addr=0, cpu=3, datasrc=84410401, datasrc_decode=N/A\|SNP N/A\|TLB N/A\|LCK N/A, ip=18446744071841817196, period=1, phys_addr=0, pid=1158680, tid=1158680, time=484522497072626, transaction=0, values=[(0, 0)], weight=0} sched__sched_switch 4 484522.497085610 1225814 perf prev_comm=perf, prev_pid=1225814, prev_prio=120, prev_state=, next_comm=migration/4, next_pid=30, next_prio=0 Sample: {addr=0, cpu=4, datasrc=84410401, datasrc_decode=N/A\|SNP N/A\|TLB N/A\|LCK N/A, ip=18446744071841817196, period=1, phys_addr=0, pid=1225814, tid=1225814, time=484522497085610, transaction=0, values=[(0, 0)], weight=0} Fixes: 249de6e07458 ("perf script python: Fix string vs byte array resolving") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20200928201135.3633850-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01	perf trace: Use the autogenerated mmap 'prot' string/id table	Arnaldo Carvalho de Melo
	No change in behaviour: # perf trace -e mmap sleep 1 0.000 ( 0.009 ms): sleep/751870 mmap(len: 143317, prot: READ, flags: PRIVATE, fd: 3) = 0x7fa96d0f7000 0.028 ( 0.004 ms): sleep/751870 mmap(len: 8192, prot: READ\|WRITE, flags: PRIVATE\|ANONYMOUS) = 0x7fa96d0f5000 0.037 ( 0.005 ms): sleep/751870 mmap(len: 1872744, prot: READ, flags: PRIVATE\|DENYWRITE, fd: 3) = 0x7fa96cf2b000 0.044 ( 0.011 ms): sleep/751870 mmap(addr: 0x7fa96cf50000, len: 1376256, prot: READ\|EXEC, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x25000) = 0x7fa96cf50000 0.056 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0a0000, len: 307200, prot: READ, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x175000) = 0x7fa96d0a0000 0.064 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0eb000, len: 24576, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x1bf000) = 0x7fa96d0eb000 0.075 ( 0.005 ms): sleep/751870 mmap(addr: 0x7fa96d0f1000, len: 13160, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|ANONYMOUS) = 0x7fa96d0f1000 0.253 ( 0.005 ms): sleep/751870 mmap(len: 218049136, prot: READ, flags: PRIVATE, fd: 3) = 0x7fa95ff38000 # # # set -o vi # strace -e mmap sleep 1 mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f333bd83000 mmap(NULL, 8192, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_ANONYMOUS, -1, 0) = 0x7f333bd81000 mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE\|MAP_DENYWRITE, 3, 0) = 0x7f333bbb7000 mmap(0x7f333bbdc000, 1376256, PROT_READ\|PROT_EXEC, MAP_PRIVATE\|MAP_FIXED\|MAP_DENYWRITE, 3, 0x25000) = 0x7f333bbdc000 mmap(0x7f333bd2c000, 307200, PROT_READ, MAP_PRIVATE\|MAP_FIXED\|MAP_DENYWRITE, 3, 0x175000) = 0x7f333bd2c000 mmap(0x7f333bd77000, 24576, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_FIXED\|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f333bd77000 mmap(0x7f333bd7d000, 13160, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x7f333bd7d000 mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f332ebc4000 +++ exited with 0 +++ # And you can as well tweak 'perf trace's output to more closely match strace's: # perf config trace.show_arg_names=no # perf config trace.show_duration=no # perf config trace.show_prefix=yes # perf config trace.show_timestamp=no # perf config trace.show_zeros=yes # perf config trace.no_inherit=yes # perf trace -e mmap sleep 1 mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f0d287ca000 mmap(NULL, 8192, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_ANONYMOUS) = 0x7f0d287c8000 mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE\|MAP_DENYWRITE, 3, 0) = 0x7f0d285fe000 mmap(0x7f0d28623000, 1376256, PROT_READ\|PROT_EXEC, MAP_PRIVATE\|MAP_FIXED\|MAP_DENYWRITE, 3, 0x25000) = 0x7f0d28623000 mmap(0x7f0d28773000, 307200, PROT_READ, MAP_PRIVATE\|MAP_FIXED\|MAP_DENYWRITE, 3, 0x175000) = 0x7f0d28773000 mmap(0x7f0d287be000, 24576, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_FIXED\|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f0d287be000 mmap(0x7f0d287c4000, 13160, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS) = 0x7f0d287c4000 mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f0d1b60b000 # # perf config \| grep ^trace trace.show_arg_names=no trace.show_duration=no trace.show_prefix=yes trace.show_timestamp=no trace.show_zeros=yes trace.no_inherit=yes # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01	tools beauty: Add script to generate table of mmap's 'prot' argument	Arnaldo Carvalho de Melo
	Will be wired up in the following csets: $ tools/perf/trace/beauty/mmap_prot.sh static const char mmap_prot[] = { [ilog2(0x1) + 1] = "READ", #ifndef PROT_READ #define PROT_READ 0x1 #endif [ilog2(0x2) + 1] = "WRITE", #ifndef PROT_WRITE #define PROT_WRITE 0x2 #endif [ilog2(0x4) + 1] = "EXEC", #ifndef PROT_EXEC #define PROT_EXEC 0x4 #endif [ilog2(0x8) + 1] = "SEM", #ifndef PROT_SEM #define PROT_SEM 0x8 #endif [ilog2(0x01000000) + 1] = "GROWSDOWN", #ifndef PROT_GROWSDOWN #define PROT_GROWSDOWN 0x01000000 #endif [ilog2(0x02000000) + 1] = "GROWSUP", #ifndef PROT_GROWSUP #define PROT_GROWSUP 0x02000000 #endif }; $ $ $ $ tools/perf/trace/beauty/mmap_prot.sh alpha static const char mmap_prot[] = { [ilog2(0x4) + 1] = "EXEC", #ifndef PROT_EXEC #define PROT_EXEC 0x4 #endif [ilog2(0x01000000) + 1] = "GROWSDOWN", #ifndef PROT_GROWSDOWN #define PROT_GROWSDOWN 0x01000000 #endif [ilog2(0x02000000) + 1] = "GROWSUP", #ifndef PROT_GROWSUP #define PROT_GROWSUP 0x02000000 #endif [ilog2(0x1) + 1] = "READ", #ifndef PROT_READ #define PROT_READ 0x1 #endif [ilog2(0x8) + 1] = "SEM", #ifndef PROT_SEM #define PROT_SEM 0x8 #endif [ilog2(0x2) + 1] = "WRITE", #ifndef PROT_WRITE #define PROT_WRITE 0x2 #endif }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-30	selftests/bpf: Add tests for BPF_F_PRESERVE_ELEMS	Song Liu
	Add tests for perf event array with and without BPF_F_PRESERVE_ELEMS. Add a perf event to array via fd mfd. Without BPF_F_PRESERVE_ELEMS, the perf event is removed when mfd is closed. With BPF_F_PRESERVE_ELEMS, the perf event is removed when the map is freed. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200930224927.1936644-3-songliubraving@fb.com
2020-09-30	bpf: Introduce BPF_F_PRESERVE_ELEMS for perf event array	Song Liu
	Currently, perf event in perf event array is removed from the array when the map fd used to add the event is closed. This behavior makes it difficult to the share perf events with perf event array. Introduce perf event map that keeps the perf event open with a new flag BPF_F_PRESERVE_ELEMS. With this flag set, perf events in the array are not removed when the original map fd is closed. Instead, the perf event will stay in the map until 1) it is explicitly removed from the array; or 2) the array is freed. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200930224927.1936644-2-songliubraving@fb.com
2020-09-30	selftests/bpf: Fix alignment of .BTF_ids	Jean-Philippe Brucker
	Fix a build failure on arm64, due to missing alignment information for the .BTF_ids section: resolve_btfids.test.o: in function `test_resolve_btfids': tools/testing/selftests/bpf/prog_tests/resolve_btfids.c:140:(.text+0x29c): relocation truncated to fit: R_AARCH64_LDST32_ABS_LO12_NC against `.BTF_ids' ld: tools/testing/selftests/bpf/prog_tests/resolve_btfids.c:140: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined In vmlinux, the .BTF_ids section is aligned to 4 bytes by vmlinux.lds.h. In test_progs however, .BTF_ids doesn't have alignment constraints. The arm64 linker expects the btf_id_set.cnt symbol, a u32, to be naturally aligned but finds it misaligned and cannot apply the relocation. Enforce alignment of .BTF_ids to 4 bytes. Fixes: cd04b04de119 ("selftests/bpf: Add set test to resolve_btfids") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/bpf/20200930093559.2120126-1-jean-philippe@linaro.org
2020-09-30	selftests: net: Add drop monitor test	Ido Schimmel
	Test that drop monitor correctly captures both software and hardware originated packet drops. # ./drop_monitor_tests.sh Software drops test TEST: Capturing active software drops [ OK ] TEST: Capturing inactive software drops [ OK ] Hardware drops test TEST: Capturing active hardware drops [ OK ] TEST: Capturing inactive hardware drops [ OK ] Tests passed: 4 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests: mlxsw: Add a PFC test	Petr Machata
	Add a test for PFC. Runs 10MB of traffic through a bottleneck and checks that none of it gets lost. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests: mlxsw: Add headroom handling test	Petr Machata
	Add a test for headroom configuration. This covers projection of ETS configuration to ingress, PFC, adjustments for MTU, the qdisc / TC mode and the effect of egress SPAN session on buffer configuration. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests: mlxsw: qos_lib: Add a wrapper for running mlnx_qos	Petr Machata
	mlnx_qos is a script for configuration of DCB. Despite the name it is not actually Mellanox-specific in any way. It is currently the only ad-hoc tool available (in contrast to a daemon that manages an interface on an ongoing basis). However, it is very verbose and parsing out error messages is not really possible. Add a wrapper that makes it easier to use the tool. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests: forwarding: devlink_lib: Support port-less topologies	Petr Machata
	Some selftests may not need any actual ports. Technically those are not forwarding selftests, but devlink_lib can still be handy. Fall back on NETIF_NO_CABLE in those cases. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests: forwarding: devlink_lib: Add devlink_cell_size_get()	Petr Machata
	Add a helper that answers the cell size of the devlink device. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests: forwarding: devlink_lib: Split devlink_..._set() into save & set	Petr Machata
	Changing pool type from static to dynamic causes reinterpretation of threshold values. They therefore need to be saved before pool type is changed, then the pool type can be changed, and then the new values need to be set up. For that reason, set cannot subsume save, because it would be saving the wrong thing, with possibly a nonsensical value, and restore would then fail to restore the nonsensical value. Thus extract a _save() from each of the relevant _set()'s. This way it is possible to save everything up front, then to tweak it, and then restore in the required order. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30	selftests/bpf: Test "incremental" btf_dump in C format	Andrii Nakryiko
	Add test validating that btf_dump works fine with BTFs that are modified and incrementally generated. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929232843.1249318-5-andriin@fb.com
2020-09-30	libbpf: Make btf_dump work with modifiable BTF	Andrii Nakryiko
	Ensure that btf_dump can accommodate new BTF types being appended to BTF instance after struct btf_dump was created. This came up during attemp to use btf_dump for raw type dumping in selftests, but given changes are not excessive, it's good to not have any gotchas in API usage, so I decided to support such use case in general. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929232843.1249318-2-andriin@fb.com
2020-09-30	bpf, selftests: Add redirect_neigh selftest	Daniel Borkmann
	Add a small test that exercises the new redirect_neigh() helper for the IPv4 and IPv6 case. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/0fc7d9c5f9a6cc1c65b0d3be83b44b1ec9889f43.1601477936.git.daniel@iogearbox.net
2020-09-30	bpf, selftests: Use bpf_tail_call_static where appropriate	Daniel Borkmann
	For those locations where we use an immediate tail call map index use the newly added bpf_tail_call_static() helper. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/3cfb2b799a62d22c6e7ae5897c23940bdcc24cbc.1601477936.git.daniel@iogearbox.net
2020-09-30	bpf, libbpf: Add bpf_tail_call_static helper for bpf programs	Daniel Borkmann
	Port of tail_call_static() helper function from Cilium's BPF code base [0] to libbpf, so others can easily consume it as well. We've been using this in production code for some time now. The main idea is that we guarantee that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the JITed BPF insns with direct jumps instead of having to fall back to using expensive retpolines. By using inline asm, we guarantee that the compiler won't merge the call from different paths with potentially different content of r2/r3. We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable()) in different places as a neat trick to trigger compilation errors when compiler does not remove code at compilation time. This works for the BPF back end as it does not implement the __builtin_trap(). [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1656a082e077552eb46642d513b4a6bde9a7dd01.1601477936.git.daniel@iogearbox.net
2020-09-30	bpf: Add redirect_neigh helper as redirect drop-in	Daniel Borkmann
	Add a redirect_neigh() helper as redirect() drop-in replacement for the xmit side. Main idea for the helper is to be very similar in semantics to the latter just that the skb gets injected into the neighboring subsystem in order to let the stack do the work it knows best anyway to populate the L2 addresses of the packet and then hand over to dev_queue_xmit() as redirect() does. This solves two bigger items: i) skbs don't need to go up to the stack on the host facing veth ingress side for traffic egressing the container to achieve the same for populating L2 which also has the huge advantage that ii) the skb->sk won't get orphaned in ip_rcv_core() when entering the IP routing layer on the host stack. Given that skb->sk neither gets orphaned when crossing the netns as per 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb.") the helper can then push the skbs directly to the phys device where FQ scheduler can do its work and TCP stack gets proper backpressure given we hold on to skb->sk as long as skb is still residing in queues. With the helper used in BPF data path to then push the skb to the phys device, I observed a stable/consistent TCP_STREAM improvement on veth devices for traffic going container -> host -> host -> container from ~10Gbps to ~15Gbps for a single stream in my test environment. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net
2020-09-30	bpf: Add classid helper only based on skb->sk	Daniel Borkmann
	Similarly to 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks"), add a helper to retrieve cgroup v1 classid solely based on the skb->sk, so it can be used as key as part of BPF map lookups out of tc from host ns, in particular given the skb->sk is retained these days when crossing net ns thanks to 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb."). This is similar to bpf_skb_cgroup_id() which implements the same for v2. Kubernetes ecosystem is still operating on v1 however, hence net_cls needs to be used there until this can be dropped in with the v2 helper of bpf_skb_cgroup_id(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@iogearbox.net
2020-09-30	perf beauty mmap_flags: Conditionaly define the mmap flags	Arnaldo Carvalho de Melo
	So that in older systems we get it in the mmap flags scnprintf routines: $ tools/perf/trace/beauty/mmap_flags.sh \| head -9 2> /dev/null static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", #ifndef MAP_32BIT #define MAP_32BIT 0x40 #endif [ilog2(0x01) + 1] = "SHARED", #ifndef MAP_SHARED #define MAP_SHARED 0x01 #endif $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-30	selftests: netfilter: add time counter check	Fabian Frederick
	Check packets are correctly placed in current year. Also do a NULL check for another one. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-09-30	tools: gpio: add debounce support to gpio-event-mon	Kent Gibson
	Add support for debouncing monitored lines to gpio-event-mon. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	tools: gpio: add multi-line monitoring to gpio-event-mon	Kent Gibson
	Extend gpio-event-mon to support monitoring multiple lines. This would require multiple lineevent requests to implement using uAPI v1, but can be performed with a single line request using uAPI v2. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	tools: gpio: port gpio-event-mon to v2 uAPI	Kent Gibson
	Port the gpio-event-mon tool to the latest GPIO uAPI. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	tools: gpio: port gpio-hammer to v2 uAPI	Kent Gibson
	Port the gpio-hammer tool to the latest GPIO uAPI. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	tools: gpio: rename nlines to num_lines	Kent Gibson
	Rename nlines to num_lines to be consistent with other usage for fields describing the number of entries in an array. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	tools: gpio: port gpio-watch to v2 uAPI	Kent Gibson
	Port the gpio-watch tool to the latest GPIO uAPI. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	tools: gpio: port lsgpio to v2 uAPI	Kent Gibson
	Port the lsgpio tool to the latest GPIO uAPI. Signed-off-by: Kent Gibson <warthog618@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Alexei Starovoitov says: ==================== pull-request: bpf 2020-09-29 The following pull-request contains BPF updates for your net tree. We've added 7 non-merge commits during the last 14 day(s) which contain a total of 7 files changed, 28 insertions(+), 8 deletions(-). The main changes are: 1) fix xdp loading regression in libbpf for old kernels, from Andrii. 2) Do not discard packet when NETDEV_TX_BUSY, from Magnus. 3) Fix corner cases in libbpf related to endianness and kconfig, from Tony. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29	libbpf: Compile in PIC mode only for shared library case	Andrii Nakryiko
	Libbpf compiles .o's for static and shared library modes separately, so no need to specify -fPIC for both. Keep it only for shared library mode. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-3-andriin@fb.com
2020-09-29	libbpf: Compile libbpf under -O2 level by default and catch extra warnings	Andrii Nakryiko
	For some reason compiler doesn't complain about uninitialized variable, fixed in previous patch, if libbpf is compiled without -O2 optimization level. So do compile it with -O2 and never let similar issue slip by again. -Wall is added unconditionally, so no need to specify it again. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-2-andriin@fb.com
2020-09-29	libbpf: Fix uninitialized variable in btf_parse_type_sec	Andrii Nakryiko
	Fix obvious unitialized variable use that wasn't reported by compiler. libbpf Makefile changes to catch such errors are added separately. Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-1-andriin@fb.com
2020-09-29	selftests/bpf: Fix endianness issues in sk_lookup/ctx_narrow_access	Ilya Leoshkevich
	This test makes a lot of narrow load checks while assuming little endian architecture, and therefore fails on s390. Fix by introducing LSB and LSW macros and using them to perform narrow loads. Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929201814.44360-1-iii@linux.ibm.com
2020-09-29	perf trace beauty: Add script to autogenerate mremap's flags args string/id ↵	Arnaldo Carvalho de Melo
	table It'll also conditionally generate the defines, so that if we don't have those when building a new tool tarball in an older systems, we get those, and we need them sometimes in the actual scnprintf routine, such as when checking if a flags means we have an extra arg, like with MREMAP_FIXED. $ tools/perf/trace/beauty/mremap_flags.sh static const char *mremap_flags[] = { [ilog2(1) + 1] = "MAYMOVE", #ifndef MREMAP_MAYMOVE #define MREMAP_MAYMOVE 1 #endif [ilog2(2) + 1] = "FIXED", #ifndef MREMAP_FIXED #define MREMAP_FIXED 2 #endif [ilog2(4) + 1] = "DONTUNMAP", #ifndef MREMAP_DONTUNMAP #define MREMAP_DONTUNMAP 4 #endif }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-29	bpf, selftests: Fix warning in snprintf_btf where system() call unchecked	John Fastabend
	On my systems system() calls are marked with warn_unused_result apparently. So without error checking we get this warning, ./prog_tests/snprintf_btf.c:30:9: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result[-Wunused-result] Also it seems like a good idea to check the return value anyways to ensure ping exists even if its seems unlikely. Fixes: 076a95f5aff2c ("selftests/bpf: Add bpf_snprintf_btf helper tests") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/160141006897.25201.12095049414156293265.stgit@john-Precision-5820-Tower
2020-09-29	selftests: Add selftest for disallowing modify_return attachment to freplace	Toke Høiland-Jørgensen
	This adds a selftest that ensures that modify_return tracing programs cannot be attached to freplace programs. The security_ prefix is added to the freplace program because that would otherwise let it pass the check for modify_return. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355713.48470.3811074984255709369.stgit@toke.dk
2020-09-29	selftests/bpf: Adding test for arg dereference in extension trace	Jiri Olsa
	Adding test that setup following program: SEC("classifier/test_pkt_md_access") int test_pkt_md_access(struct __sk_buff skb) with its extension: SEC("freplace/test_pkt_md_access") int test_pkt_md_access_new(struct __sk_buff skb) and tracing that extension with: SEC("fentry/test_pkt_md_access_new") int BPF_PROG(fentry, struct sk_buff *skb) The test verifies that the tracing program can dereference skb argument properly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355603.48470.9072073357530773228.stgit@toke.dk