linux.git - Linus' kernel tree

Age	Commit message	Author
2022-04-25	selftests/bpf: Add C tests for kptr	Kumar Kartikeya Dwivedi
	This uses the __kptr and __kptr_ref macros as well, and tries to test the stuff that is supposed to work, since we have negative tests in test_verifier suite. Also include some code to test map-in-map support, such that the inner_map_meta matches the kptr_off_tab of map added as element. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-12-memxor@gmail.com
2022-04-25	libbpf: Add kptr type tag macros to bpf_helpers.h	Kumar Kartikeya Dwivedi
	Include convenience definitions: __kptr: Unreferenced kptr __kptr_ref: Referenced kptr Users can use them to tag the pointer type meant to be used with the new support directly in the map value definition. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com
2022-04-25	bpf: Allow storing referenced kptr in map	Kumar Kartikeya Dwivedi
	Extending the code in previous commits, introduce referenced kptr support, which needs to be tagged using the 'kptr_ref' tag instead. Unlike unreferenced kptrs, referenced kptrs have a lot more restrictions. In addition to the type matching, only the newly introduced bpf_kptr_xchg helper is allowed to modify the map value at that offset. This transfers the referenced pointer being stored into the map, releasing the reference state for the program, and returns the old value, creating new reference state for the returned pointer. Similar to the unreferenced pointer case, the return value for this case will also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer must either be eventually released by calling the corresponding release function, or be transferred into another map. It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear the value and obtain the old value, if any. BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptrs. A future commit will permit using BPF_LDX for such pointers, while attempting to make it safe, since the lifetime of the object won't be guaranteed. There are valid reasons to enforce the restriction of permitting only bpf_kptr_xchg to operate on referenced kptrs. The pointer value must be consistent in the face of concurrent modification, and any prior values contained in the map must also be released before a new one is moved into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg returns the old value, which the verifier would require the user to either free or move into another map, and releases the reference held for the pointer being moved in. In the future, a direct BPF_XCHG instruction may also be permitted to work like the bpf_kptr_xchg helper. 
Note that process_kptr_func doesn't have to call check_helper_mem_access, since we already disallow rdonly/wronly flags for map, which is what check_map_access_type checks, and we already ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc, so check_map_access is also not required. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
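The exchange-only rule for referenced kptrs can be pictured with a userspace analogy. The following is a minimal sketch in plain C11, not BPF code: `struct obj`, `struct slot`, `slot_xchg` and `obj_put` are hypothetical names used only for illustration, and the real helper is checked statically by the verifier rather than at run time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object; stands in for a kernel object. */
struct obj {
	int refcnt;
};

/* Hypothetical map slot that owns at most one pointer. */
struct slot {
	_Atomic(struct obj *) ptr;
};

/*
 * Move 'incoming' (may be NULL) into the slot and return the previous
 * occupant. The caller gives up its reference on 'incoming' and takes
 * over the reference on the returned pointer, if any.
 */
static struct obj *slot_xchg(struct slot *s, struct obj *incoming)
{
	assert(s != NULL);
	return atomic_exchange(&s->ptr, incoming);
}

/* Drop one reference and return the remaining count. */
static int obj_put(struct obj *o)
{
	assert(o != NULL && o->refcnt > 0);
	return --o->refcnt;
}
```

Because the only way to touch the slot is an atomic exchange, a concurrent reader never observes a half-updated pointer, and the previous occupant is always handed back to a caller that must release it or store it elsewhere.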
2022-04-25	bpf: Tag argument to be released in bpf_func_proto	Kumar Kartikeya Dwivedi
	Add a new type flag for bpf_arg_type that when set tells verifier that for a release function, that argument's register will be the one for which meta.ref_obj_id will be set, and which will then be released using release_reference. To capture the regno, introduce a new field release_regno in bpf_call_arg_meta. This would be required in the next patch, where we may either pass NULL or a refcounted pointer as an argument to the release function bpf_kptr_xchg. Just releasing only when meta.ref_obj_id is set is not enough, as there is a case where the type of argument needed matches, but the ref_obj_id is set to 0. Hence, we must enforce that whenever meta.ref_obj_id is zero, the register that is to be released can only be NULL for a release function. Since we now indicate whether an argument is to be released in bpf_func_proto itself, is_release_function helper has lost its utility, hence refactor code to work without it, and just rely on meta.release_regno to know when to release state for a ref_obj_id. Still, the restriction of one release argument and only one ref_obj_id passed to BPF helper or kfunc remains. This may be lifted in the future. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-3-memxor@gmail.com
2022-04-25	bpftool, musl compat: Replace sys/fcntl.h by fcntl.h	Dominique Martinet
	musl does not like including sys/fcntl.h directly: [...] 1 | #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [...] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220424051022.2619648-5-asmadeus@codewreck.org
2022-04-25	bpftool, musl compat: Replace nftw with FTW_ACTIONRETVAL	Dominique Martinet
	The musl nftw implementation does not support FTW_ACTIONRETVAL. There have been multiple attempts at pushing the feature in musl upstream, but it has been refused or ignored every time: https://www.openwall.com/lists/musl/2021/03/26/1 https://www.openwall.com/lists/musl/2022/01/22/1 In this case we only care about /proc/<pid>/fd/<fd>, so it's not too difficult to reimplement directly instead, and the new implementation makes 'bpftool perf' slightly faster because it doesn't needlessly stat/readdir unneeded directories (54ms -> 13ms on my machine). Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220424051022.2619648-4-asmadeus@codewreck.org
2022-04-25	libbpf: Remove unnecessary type cast	Yuntao Wang
	The link variable is already of type 'struct bpf_link *', casting it to 'struct bpf_link *' is redundant, drop it. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com
2022-04-23	selftests/bpf: Switch fexit_stress to bpf_link_create() API	Andrii Nakryiko
	Use bpf_link_create() API in fexit_stress test to attach FEXIT programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Kui-Feng Lee <kuifeng@fb.com> Link: https://lore.kernel.org/bpf/20220421033945.3602803-4-andrii@kernel.org
2022-04-23	libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open()	Andrii Nakryiko
	Teach bpf_link_create() to fall back to bpf_raw_tracepoint_open() on older kernels for programs that are attachable through BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() a more unified and convenient interface for creating bpf_link-based attachments. With this approach end users can just use bpf_link_create() for tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to care about kernel support, as libbpf will handle this transparently. On the other hand, as newer features (like BPF cookie) are added to the LINK_CREATE interface, they will be readily usable through the same bpf_link_create() API without any major refactoring from the user's standpoint. bpf_program__attach_btf_id() is now using bpf_link_create() internally as well and will take advantage of this unified interface when BPF cookie is added for fentry/fexit. Doing proactive feature detection of LINK_CREATE support for fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF, determining some stable and guaranteed to be in all kernel versions target BTF type (either raw tracepoint or fentry target function), actually attaching this program and thus potentially affecting the performance of the host kernel briefly, etc. So instead we are taking the much simpler "lazy" approach of falling back to a bpf_raw_tracepoint_open() call only if the initial LINK_CREATE command fails. For modern kernels this will mean zero added overhead, while older kernels will incur minimal overhead with a single fast-failing LINK_CREATE call. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Kui-Feng Lee <kuifeng@fb.com> Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org
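The "lazy" fallback described in this message can be sketched without any BPF dependency. This is an illustrative pattern only: `attach_fn`, `attach_with_fallback` and the `fake_*` stand-ins are hypothetical names, and the sketch falls back on any failure, whereas a real implementation might restrict the fallback to specific error codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical attach callback: returns an fd >= 0 or a negative errno. */
typedef int (*attach_fn)(int prog_fd);

/*
 * Try the newer interface first and use the older one only when the first
 * call fails. No up-front feature probing is performed, so a kernel that
 * supports the newer interface pays no extra cost.
 */
static int attach_with_fallback(int prog_fd, attach_fn newer, attach_fn older)
{
	int fd;

	assert(newer != NULL && older != NULL);
	fd = newer(prog_fd);
	if (fd >= 0)
		return fd;
	return older(prog_fd);
}

/* Stand-ins used only to exercise the pattern. */
static int fake_link_create_unsupported(int prog_fd)
{
	(void)prog_fd;
	return -EINVAL;
}

static int fake_link_create_ok(int prog_fd)
{
	return prog_fd + 100;
}

static int fake_raw_tp_open(int prog_fd)
{
	return prog_fd + 200;
}
```

With these stand-ins, `attach_with_fallback(1, fake_link_create_ok, fake_raw_tp_open)` never calls the older path, while passing `fake_link_create_unsupported` as the first callback costs one failed call before the older path is used.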
2022-04-21	libbpf: Remove redundant non-null checks on obj_elf	Gaosheng Cui
	Obj_elf is already non-null checked at the function entry, so remove redundant non-null checks on obj_elf. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com
2022-04-21	selftests/bpf: Fix map tests errno checks	Artem Savkov
	Switching to libbpf 1.0 API broke test_lpm_map and test_lru_map as error reporting changed. Instead of setting errno and returning -1 bpf calls now return -Exxx directly. Drop errno checks and look at return code directly. Fixes: b858ba8c52b6 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK") Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/bpf/20220421094320.1563570-1-asavkov@redhat.com
2022-04-21	selftests/bpf: Fix prog_tests uprobe_autoattach compilation error	Artem Savkov
	I am getting the following compilation error for prog_tests/uprobe_autoattach.c: tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c: In function ‘test_uprobe_autoattach’: ./test_progs.h:209:26: error: pointer ‘mem’ may be used after ‘free’ [-Werror=use-after-free] The value of mem is now used in one of the asserts, which is why it may be confusing compilers. However, it is not dereferenced. Silence this by moving free(mem) after the assert block. Fixes: 1717e248014c ("selftests/bpf: Uprobe tests should verify param/return values") Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220421132317.1583867-1-asavkov@redhat.com
2022-04-21	selftests/bpf: Fix attach tests retcode checks	Artem Savkov
	Switching to libbpf 1.0 API broke test_sock and test_sysctl as they check for return of bpf_prog_attach to be exactly -1. Switch the check to '< 0' instead. Fixes: b858ba8c52b6 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK") Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/bpf/20220421130104.1582053-1-asavkov@redhat.com
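The return-code change behind this fix can be illustrated with two stand-in functions. This is a minimal sketch, assuming hypothetical helpers (`legacy_style_attach`, `strict_style_attach`) rather than real libbpf calls; it only shows why a check for exactly -1 stops working once errors are returned as negative errno values.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical legacy-style call: returns -1 and sets errno. */
static int legacy_style_attach(void)
{
	errno = ENOENT;
	return -1;
}

/* Hypothetical 1.0-style call: returns the negative error code directly. */
static int strict_style_attach(void)
{
	return -ENOENT;
}

/* Old check: only recognizes the legacy convention. */
static int old_check_failed(int ret)
{
	return ret == -1;
}

/* New check: recognizes both conventions. */
static int new_check_failed(int ret)
{
	return ret < 0;
}

static void check_conventions(void)
{
	assert(old_check_failed(legacy_style_attach()));
	assert(!old_check_failed(strict_style_attach())); /* error goes unnoticed */
	assert(new_check_failed(legacy_style_attach()));
	assert(new_check_failed(strict_style_attach()));
}
```

The '< 0' form is the safer default because it does not depend on which convention the callee follows.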
2022-04-21	libbpf: Add documentation to API functions	Grant Seltzer
	This adds documentation for the following API functions: - bpf_program__set_expected_attach_type() - bpf_program__set_type() - bpf_program__set_attach_target() - bpf_program__attach() - bpf_program__pin() - bpf_program__unpin() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
2022-04-21	libbpf: Update API functions usage to check error	Grant Seltzer
	This updates usage of the following API functions within libbpf so their newly added error return is checked: - bpf_program__set_expected_attach_type() - bpf_program__set_type() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
2022-04-21	libbpf: Add error returns to two API functions	Grant Seltzer
	This adds an error return to the following API functions: - bpf_program__set_expected_attach_type() - bpf_program__set_type() In both cases, the error occurs when the BPF object has already been loaded when the function is called. In this case -EBUSY is returned. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
2022-04-20	selftests/bpf: Add test for skb_load_bytes	Liu Jian
	Use bpf_prog_test_run_opts to test the skb_load_bytes function. Tests the behavior when offset is greater than INT_MAX or a normal value. Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220416105801.88708-4-liujian56@huawei.com
2022-04-19	libbpf: Support riscv USDT argument parsing logic	Pu Lehui
	Add riscv-specific USDT argument specification parsing logic. riscv USDT argument format is shown below: - Memory dereference case: "size@off(reg)", e.g. "-8@-88(s0)" - Constant value case: "size@val", e.g. "4@5" - Register read case: "size@reg", e.g. "-8@a1" s8 will be marked as poison although it is a riscv register, so we need to alias it in advance. Both RV32 and RV64 have been tested. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com
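The three riscv argument forms listed above can be told apart with a few sscanf() patterns. The following is a minimal sketch, not the libbpf implementation: `struct usdt_arg`, `enum arg_kind` and `parse_riscv_usdt_arg` are hypothetical names, and the sketch handles a single argument string rather than a whole argument list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum arg_kind { ARG_BAD, ARG_CONST, ARG_REG, ARG_REG_DEREF };

struct usdt_arg {
	enum arg_kind kind;
	int size;       /* operand size in bytes; negative means signed */
	long val;       /* constant value or memory offset */
	char reg[16];   /* register name, empty if unused */
};

/* True if a pattern reached its %n and consumed the whole spec. */
static int whole(const char *spec, int len)
{
	return len > 0 && spec[len] == '\0';
}

static enum arg_kind parse_riscv_usdt_arg(const char *spec, struct usdt_arg *out)
{
	int len = 0;

	assert(spec != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* Memory dereference, e.g. "-8@-88(s0)" */
	if (sscanf(spec, " %d @ %ld ( %15[a-z0-9] ) %n",
		   &out->size, &out->val, out->reg, &len) == 3 && whole(spec, len))
		return out->kind = ARG_REG_DEREF;

	/*
	 * Constant value, e.g. "4@5". Tried before the register form because
	 * the register pattern would also accept a string of digits.
	 */
	len = 0;
	if (sscanf(spec, " %d @ %ld %n", &out->size, &out->val, &len) == 2 &&
	    whole(spec, len))
		return out->kind = ARG_CONST;

	/* Register read, e.g. "-8@a1" */
	len = 0;
	if (sscanf(spec, " %d @ %15[a-z0-9] %n", &out->size, out->reg, &len) == 2 &&
	    whole(spec, len))
		return out->kind = ARG_REG;

	return out->kind = ARG_BAD;
}
```

The order of the attempts matters: the most specific pattern goes first, and `%n` is used to confirm that the entire string was consumed, since the sscanf() return value alone does not reveal whether trailing literals matched.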
2022-04-19	libbpf: Fix usdt_cookie being cast to 32 bits	Pu Lehui
	The usdt_cookie is defined as __u64, which should not be used as a long type because it will be cast to 32 bits in 32-bit platforms. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com
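The truncation described here is easy to reproduce in isolation. This is a minimal sketch in which `uint32_t` stands in for `long` on a 32-bit platform (an assumption made so that the effect is visible on 64-bit hosts as well); `via_32bit_long` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Pass a 64-bit cookie through a 32-bit integer, as a 32-bit 'long' would. */
static uint64_t via_32bit_long(uint64_t cookie)
{
	uint32_t narrowed = (uint32_t)cookie; /* upper 32 bits are discarded */

	return narrowed;
}

static void check_truncation(void)
{
	uint64_t cookie = 0x1122334455667788ULL;

	assert(via_32bit_long(cookie) == 0x55667788ULL);
	assert(via_32bit_long(cookie) != cookie);
}
```

Keeping the value in a fixed-width 64-bit type end to end avoids the problem regardless of the platform's `long` size.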
2022-04-19	selftests/bpf: Add tests for type tag order validation	Kumar Kartikeya Dwivedi
	Add a few test cases that ensure we catch cases of badly ordered type tags in modifier chains. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220419164608.1990559-3-memxor@gmail.com
2022-04-19	selftests/bpf: Use non-autoloaded programs in few tests	Andrii Nakryiko
	Take advantage of the new libbpf feature for declarative non-autoloaded BPF program SEC() definitions in a few tests that test a single program at a time out of many available programs within a single BPF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220419002452.632125-2-andrii@kernel.org
2022-04-19	libbpf: Support opting out from autoloading BPF programs declaratively	Andrii Nakryiko
	Establish SEC("?abc") naming convention (i.e., adding a question mark in front of an otherwise normal section name) that allows setting the corresponding program's autoload property to false. This is effectively just a declarative way to do bpf_program__set_autoload(prog, false). Having a way to do this declaratively in BPF code itself is useful and convenient for various scenarios. E.g., for testing, when a BPF object consists of multiple independent BPF programs that each need to be tested separately. Opting out all of them by default and then setting autoload to true for just one of them at a time simplifies testing code (see next patch for a few conversions in BPF selftests taking advantage of this new feature). Another real-world use case is in libbpf-tools for cases when different BPF programs have to be picked depending on particulars of the host kernel due to various incompatible changes (like kernel function renames or signature changes, or to pick kprobe vs fentry depending on corresponding kernel support for the latter). Marking all the different BPF program candidates as non-autoloaded declaratively makes this more obvious in BPF source code and allows simpler code in user space. When a BPF program is marked as SEC("?abc"), it is otherwise treated just like SEC("abc"), and the bpf_program__section_name() reported will be "abc". Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org
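The effect of the leading question mark can be modelled in a few lines. This is a hedged sketch, not libbpf code: `struct sec_info` and `parse_sec_name` are hypothetical names that only mirror the behaviour described in the message (the prefix clears autoload and is dropped from the reported name).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sec_info {
	const char *name; /* section name without the '?' prefix */
	bool autoload;    /* false when the section name started with '?' */
};

static struct sec_info parse_sec_name(const char *sec)
{
	struct sec_info info = { .name = sec, .autoload = true };

	assert(sec != NULL);
	if (sec[0] == '?') {
		info.name = sec + 1;
		info.autoload = false;
	}
	return info;
}
```

User-space code can then flip autoload back on for the single program it wants to load, which matches the testing workflow described above.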
2022-04-19	selftests/bpf: Workaround a verifier issue for test exhandler	Yonghong Song
	The llvm patch [1] enabled opaque pointer which caused selftest 'exhandler' failure. ... ; work = task->task_works; 7: (79) r1 = *(u64 *)(r6 +2120) ; R1_w=ptr_callback_head(off=0,imm=0) R6_w=ptr_task_struct(off=0,imm=0) ; func = work->func; 8: (79) r2 = *(u64 *)(r1 +8) ; R1_w=ptr_callback_head(off=0,imm=0) R2_w=scalar() ; if (!work && !func) 9: (4f) r1 |= r2 math between ptr_ pointer and register with unbounded min value is not allowed below is insn 10 and 11 10: (55) if r1 != 0 goto +5 11: (18) r1 = 0 ll ... In llvm, the code generation of 'r1 |= r2' happened in codegen selectiondag phase due to difference of opaque pointer vs. non-opaque pointer. Without [1], the related code looks like: r2 = *(u64 *)(r6 + 2120) r1 = *(u64 *)(r2 + 8) if r2 != 0 goto +6 <LBB0_4> if r1 != 0 goto +5 <LBB0_4> r1 = 0 ll ... I haven't found a good way in llvm to fix this issue. So let us work around the problem first so bpf CI won't be blocked. [1] https://reviews.llvm.org/D123300 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220419050900.3136024-1-yhs@fb.com
2022-04-19	selftests/bpf: Limit unroll_count for pyperf600 test	Yonghong Song
	LLVM commit [1] changed loop pragma behavior such that full loop unroll is always honored with user pragma. Previously, the unroll count also depended on the unrolled code size. For pyperf600, without [1], the loop unroll count is 150. With [1], the loop unroll count is 600. The unroll count of 600 pushed the program size close to 298k, and this caused the following code to be generated: 0: 7b 1a 00 ff 00 00 00 00 *(u64 *)(r10 - 256) = r1 ; uint64_t pid_tgid = bpf_get_current_pid_tgid(); 1: 85 00 00 00 0e 00 00 00 call 14 2: bf 06 00 00 00 00 00 00 r6 = r0 ; pid_t pid = (pid_t)(pid_tgid >> 32); 3: bf 61 00 00 00 00 00 00 r1 = r6 4: 77 01 00 00 20 00 00 00 r1 >>= 32 5: 63 1a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r1 6: bf a2 00 00 00 00 00 00 r2 = r10 7: 07 02 00 00 fc ff ff ff r2 += -4 ; PidData* pidData = bpf_map_lookup_elem(&pidmap, &pid); 8: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 10: 85 00 00 00 01 00 00 00 call 1 11: bf 08 00 00 00 00 00 00 r8 = r0 ; if (!pidData) 12: 15 08 15 e8 00 00 00 00 if r8 == 0 goto -6123 <LBB0_27588+0xffffffffffdae100> Note that insn 12 has a branch offset -6123 which is clearly illegal and will be rejected by the verifier. The negative offset arises because the branch range is greater than INT16_MAX. This patch changed the unroll count to be 150 to avoid the above branch target insn out-of-range issue. Also, llvm is enhanced ([2]) to assert if the branch target insn is out of INT16 range. [1] https://reviews.llvm.org/D119148 [2] https://reviews.llvm.org/D123877 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220419043230.2928530-1-yhs@fb.com
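The -6123 in insn 12 can be recovered from the instruction bytes shown in the dump. The sketch below assumes the little-endian BPF instruction layout, in which bytes 2 and 3 hold the signed 16-bit offset; `decode_off16` and `fits_off16` are hypothetical helper names, and the forward distance of 59413 used in the check is only one value that would wrap to -6123, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the signed 16-bit 'off' field from its two little-endian bytes. */
static int decode_off16(uint8_t lo, uint8_t hi)
{
	unsigned int raw = (unsigned int)lo | ((unsigned int)hi << 8);

	return raw >= 0x8000u ? (int)raw - 0x10000 : (int)raw;
}

/* True if a jump distance fits the field without wrapping. */
static int fits_off16(long distance)
{
	return distance >= INT16_MIN && distance <= INT16_MAX;
}

static void check_insn12(void)
{
	/* Bytes "15 e8" from "12: 15 08 15 e8 00 00 00 00" give 0xe815. */
	assert(decode_off16(0x15, 0xe8) == -6123);
	/* Illustrative forward distance that has the same low 16 bits. */
	assert(!fits_off16(59413));
	assert(fits_off16(150));
}
```

A range check of this kind before encoding is what turns a silently wrapped offset into an explicit error.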
2022-04-18	selftests/bpf: Refactor prog_tests logging and test execution	Mykola Lysenko
	This is a pre-req to add separate logging for each subtest in test_progs. Move all the mutable test data to the test_result struct. Move per-test init/de-init into the run_one_test function. Consolidate data aggregation and final log output in calculate_and_print_summary function. As a side effect, this patch fixes double counting of errors for subtests and possible duplicate output of subtest log on failures. Also, add prog_tests_framework.c test to verify some of the counting logic. As part of verification, confirmed that number of reported tests is the same before and after the change for both parallel and sequential test execution. Signed-off-by: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220418222507.1726259-1-mykolal@fb.com
2022-04-11	libbpf: Usdt aarch64 arg parsing support	Alan Maguire
	Parsing of USDT arguments is architecture-specific. On aarch64 it is relatively easy since registers used are x[0-31] and sp. Format is slightly different compared to x86_64. Possible forms are: - "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]"; - "size@reg" for register values, e.g. "-4@x0"; - "size@value" for raw values, e.g. "-8@1". Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com
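The aarch64 forms listed above can be distinguished with sscanf() patterns tried from most to least specific. This is an illustrative sketch under stated assumptions, not the libbpf code: `struct a64_arg`, `enum a64_kind` and `parse_aarch64_usdt_arg` are hypothetical names, and only a single argument string is handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum a64_kind { A64_BAD, A64_CONST, A64_REG, A64_REG_DEREF };

struct a64_arg {
	enum a64_kind kind;
	int size;       /* operand size in bytes; negative means signed */
	long val;       /* constant value or memory offset (0 if absent) */
	char reg[16];   /* register name, empty if unused */
};

/* True if a pattern reached its %n and consumed the whole spec. */
static int consumed(const char *spec, int len)
{
	return len > 0 && spec[len] == '\0';
}

static enum a64_kind parse_aarch64_usdt_arg(const char *spec, struct a64_arg *out)
{
	int len = 0;

	assert(spec != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* Dereference with offset, e.g. "-8@[sp,76]" */
	if (sscanf(spec, " %d @ [ %15[a-z0-9] , %ld ] %n",
		   &out->size, out->reg, &out->val, &len) == 3 && consumed(spec, len))
		return out->kind = A64_REG_DEREF;

	/* Dereference without offset, e.g. "-4@[sp]" */
	len = 0;
	if (sscanf(spec, " %d @ [ %15[a-z0-9] ] %n",
		   &out->size, out->reg, &len) == 2 && consumed(spec, len))
		return out->kind = A64_REG_DEREF;

	/* Raw value, e.g. "-8@1"; tried before the register form */
	len = 0;
	if (sscanf(spec, " %d @ %ld %n", &out->size, &out->val, &len) == 2 &&
	    consumed(spec, len))
		return out->kind = A64_CONST;

	/* Register value, e.g. "-4@x0" */
	len = 0;
	if (sscanf(spec, " %d @ %15[a-z0-9] %n", &out->size, out->reg, &len) == 2 &&
	    consumed(spec, len))
		return out->kind = A64_REG;

	return out->kind = A64_BAD;
}
```

The square brackets in the format strings are ordinary literal characters; only the `%15[a-z0-9]` conversions use them as a scanset.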
2022-04-11	selftests/bpf: Drop duplicate max/min definitions	Geliang Tang
	Drop duplicate macros min() and MAX() definitions in prog_tests and use MIN() or MAX() in sys/param.h instead. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/1ae276da9925c2de59b5bdc93b693b4c243e692e.1649462033.git.geliang.tang@suse.com
2022-04-10	tools/runqslower: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK	Yafang Shao
	Explicitly set libbpf 1.0 API mode, then we can avoid using the deprecated RLIMIT_MEMLOCK. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220409125958.92629-5-laoar.shao@gmail.com
2022-04-10	bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK	Yafang Shao
	We have switched to memcg-based memory accounting and thus the rlimit is not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in libbpf for backward compatibility, so we can use it instead now. libbpf_set_strict_mode always returns 0, so we don't need to check whether the return value is 0 or not. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220409125958.92629-4-laoar.shao@gmail.com
2022-04-10	selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK	Yafang Shao
	We have switched to memcg-based memory accounting and thus the rlimit is not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in libbpf for backward compatibility, so we can use it instead now. After this change, the header tools/testing/selftests/bpf/bpf_rlimit.h can be removed. This patch also removes the useless header sys/resource.h from many files in tools/testing/selftests/bpf/. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220409125958.92629-3-laoar.shao@gmail.com
2022-04-10	libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels	Runqing Yang
	Background: Libbpf automatically replaces calls to BPF bpf_probe_read_{kernel,user}[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that the kernel doesn't support the new APIs. Specifically, libbpf invokes the probe_kern_probe_read_kernel function to load a small eBPF program into the kernel in which the bpf_probe_read_kernel API is invoked and lets the kernel check whether the new API is valid. If the loading fails, libbpf considers the new API invalid and replaces it with the old API. static int probe_kern_probe_read_kernel(void) { struct bpf_insn insns[] = { BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), /* r1 = r10 (fp) */ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8), /* r1 += -8 */ BPF_MOV64_IMM(BPF_REG_2, 8), /* r2 = 8 */ BPF_MOV64_IMM(BPF_REG_3, 0), /* r3 = 0 */ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel), BPF_EXIT_INSN(), }; int fd, insn_cnt = ARRAY_SIZE(insns); fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL, "GPL", insns, insn_cnt, NULL); return probe_fd(fd); } Bug: On older kernel versions [0], the kernel checks whether the version number provided in the bpf syscall matches the LINUX_VERSION_CODE. If not matched, the bpf syscall fails. However, the probe_kern_probe_read_kernel code does not set the kernel version number provided to the bpf syscall, which causes the loading process to always fail on old versions. It means that libbpf will replace the new API with the old one even if the kernel supports the new one. Solution: After a discussion in [1], the solution is using the BPF_PROG_TYPE_TRACEPOINT program type instead of BPF_PROG_TYPE_KPROBE because the kernel does not enforce the version check for tracepoint programs. I tested the patch on old kernels (4.18 and 4.19) and it works well. 
[0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360 [1] Closes: https://github.com/libbpf/libbpf/issues/473 Signed-off-by: Runqing Yang <rainkin1993@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com
2022-04-10	selftests/bpf: Improve by-name subtest selection logic in prog_tests	Mykola Lysenko
	Improve subtest selection logic when using -t/-a/-d parameters. In particular, more than one subtest can be specified or a combination of tests / subtests. -a send_signal -d send_signal/send_signal_nmi* - runs send_signal test without nmi tests -a send_signal/send_signal_nmi,find_vma - runs two send_signal subtests and find_vma test -a 'send_signal' -a find_vma -d send_signal/send_signal_nmi* - runs 2 send_signal test and find_vma test. Disables two send_signal nmi subtests -t send_signal -t find_vma - runs two send_signal tests and one find_vma test This will allow us to have granular control over which subtests to disable in the CI system instead of disabling whole tests. Also, add new selftest to avoid possible regression when changing prog_test test name selection logic. Signed-off-by: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220409001750.529930-1-mykolal@fb.com
2022-04-10	libbpf: Add ARC support to bpf_tracing.h	Vladimir Isaev
	Add PT_REGS macros suitable for ARCompact and ARCv2. Signed-off-by: Vladimir Isaev <isaev@synopsys.com> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220408224442.599566-1-geomatsi@gmail.com
2022-04-08	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	Jakub Kicinski
	Daniel Borkmann says: ==================== pull-request: bpf-next 2022-04-09 We've added 63 non-merge commits during the last 9 day(s) which contain a total of 68 files changed, 4852 insertions(+), 619 deletions(-). The main changes are: 1) Add libbpf support for USDT (User Statically-Defined Tracing) probes. USDTs are an abstraction built on top of uprobes, critical for tracing and BPF, and widely used in production applications, from Andrii Nakryiko. 2) While Andrii was adding support for x86{-64}-specific logic of parsing USDT argument specification, Ilya followed-up with USDT support for s390 architecture, from Ilya Leoshkevich. 3) Support name-based attaching for uprobe BPF programs in libbpf. The format supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g. attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc") now, from Alan Maguire. 4) Various load/store optimizations for the arm64 JIT to shrink the image size by using arm64 str/ldr immediate instructions. Also enable pointer authentication to verify return address for JITed code, from Xu Kuohai. 5) BPF verifier fixes for write access checks to helper functions, e.g. rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that write into passed buffers, from Kumar Kartikeya Dwivedi. 6) Fix overly excessive stack map allocation for its base map structure and buckets which slipped-in from cleanups during the rlimit accounting removal back then, from Yuntao Wang. 7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter connection tracking tuple direction, from Lorenzo Bianconi. 8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde. 9) Minor cleanups all over the place from various others. 
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits) bpf: Fix excessive memory allocation in stack_map_alloc() selftests/bpf: Fix return value checks in perf_event_stackmap test selftests/bpf: Add CO-RE relos into linked_funcs selftests libbpf: Use weak hidden modifier for USDT BPF-side API functions libbpf: Don't error out on CO-RE relos for overriden weak subprogs samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread libbpf: Allow WEAK and GLOBAL bindings during BTF fixup libbpf: Use strlcpy() in path resolution fallback logic libbpf: Add s390-specific USDT arg spec parsing logic libbpf: Make BPF-side of USDT support work on big-endian machines libbpf: Minor style improvements in USDT code libbpf: Fix use #ifdef instead of #if to avoid compiler warning libbpf: Potential NULL dereference in usdt_manager_attach_usdt() selftests/bpf: Uprobe tests should verify param/return values libbpf: Improve string parsing for uprobe auto-attach libbpf: Improve library identification for uprobe binary path resolution selftests/bpf: Test for writes to map key from BPF helpers selftests/bpf: Test passing rdonly mem to global func bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access ... ==================== Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08	selftests/bpf: Fix return value checks in perf_event_stackmap test	Yuntao Wang
	The bpf_get_stackid() function may also return 0 on success as per UAPI BPF helper documentation. Therefore, correct checks from 'val > 0' to 'val >= 0' to ensure that they cover all possible success return values. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220408041452.933944-1-ytcoode@gmail.com
2022-04-08	selftests/bpf: Add CO-RE relos into linked_funcs selftests	Andrii Nakryiko
	Add CO-RE relocations into __weak subprogs for multi-file linked_funcs selftest to make sure libbpf handles such combination well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220408181425.2287230-4-andrii@kernel.org
2022-04-08	libbpf: Use weak hidden modifier for USDT BPF-side API functions	Andrii Nakryiko
	Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more confusing `static inline __noinline`. This was previously impossible due to libbpf erroring out on CO-RE relocations pointing to eliminated weak subprogs. Now that previous patch fixed this issue, switch back to __weak __hidden as it's a more direct way of specifying the desired behavior. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
2022-04-08	libbpf: Don't error out on CO-RE relos for overriden weak subprogs	Andrii Nakryiko
	During BPF static linking, all the ELF relocations and .BTF.ext information (including CO-RE relocations) are preserved for __weak subprograms that were logically overridden by either a previous weak subprogram instance or by a corresponding "strong" (non-weak) subprogram. This is just how native user-space linkers work, nothing new. But libbpf is over-zealous when processing CO-RE relocations and errors out when a CO-RE relocation belonging to such an eliminated weak subprogram is encountered. Instead of erroring out on this expected situation, log a debug-level message and skip the relocation. Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
2022-04-08	libbpf: Allow WEAK and GLOBAL bindings during BTF fixup	Andrii Nakryiko
	During BTF fixup for global variables, a global variable can be weak, in which case it has STB_WEAK binding in ELF. Support such global variables in addition to non-weak ones. This is not a problem when using BPF static linking, as the BPF static linker "fixes up" BTF during generation so that libbpf doesn't have to do it anymore during bpf_object__open(). That, along with the (currently) pretty rare use of __weak variables and maps, led to this not being noticed for a while. Reported-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
2022-04-08	libbpf: Use strlcpy() in path resolution fallback logic	Andrii Nakryiko
	Coverity static analyzer complains that strcpy() can cause buffer overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't happen. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
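Why a bounded copy removes the overflow risk can be sketched as follows. The function `bounded_copy` is a hypothetical stand-in written for illustration; it is not claimed to be the body of libbpf_strlcpy(), only an example of the same idea: never write more than the destination size and always NUL-terminate.

```c
#include <stddef.h>

/* Hypothetical bounded string copy (illustration only).
 * Copies at most sz - 1 characters from src into dst and always
 * NUL-terminates when sz > 0. Returns the number of characters copied. */
static size_t bounded_copy(char *dst, const char *src, size_t sz)
{
	size_t i;

	if (sz == 0)
		return 0;	/* no room even for the terminator */

	sz--;			/* reserve one byte for '\0' */
	for (i = 0; i < sz && src[i]; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return i;
}
```

Unlike strcpy(), a source string longer than the buffer is truncated rather than written past the end of dst.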
2022-04-08	libbpf: Add s390-specific USDT arg spec parsing logic	Ilya Leoshkevich
	The logic is superficially similar to that of x86, but the small differences (no need for register table and dynamic allocation of register names, no $ sign before constants) make maintaining a common implementation too burdensome. Therefore simply add an s390x-specific version of parse_usdt_arg(). Note that while bcc supports index registers, this patch does not. This should not be a problem in most cases, since s390 uses a default value "nor" for STAP_SDT_ARG_CONSTRAINT. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
2022-04-07	Merge tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds
	Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - new code bugs: - mctp: correct mctp_i2c_header_create result - eth: fungible: fix reference to __udivdi3 on 32b builds - eth: micrel: remove latencies support lan8814 Previous releases - regressions: - bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT - vrf: fix packet sniffing for traffic originating from ip tunnels - rxrpc: fix a race in rxrpc_exit_net() - dsa: revert "net: dsa: stop updating master MTU from master.c" - eth: ice: fix MAC address setting Previous releases - always broken: - tls: fix slab-out-of-bounds bug in decrypt_internal - bpf: support dual-stack sockets in bpf_tcp_check_syncookie - xdp: fix coalescing for page_pool fragment recycling - ovs: fix leak of nested actions - eth: sfc: - add missing xdp queue reinitialization - fix using uninitialized xdp tx_queue - eth: ice: - clear default forwarding VSI during VSI release - fix broken IFF_ALLMULTI handling - synchronize_rcu() when terminating rings - eth: qede: confirm skb is allocated before using - eth: aqc111: fix out-of-bounds accesses in RX fixup - eth: slip: fix NPD bug in sl_tx_timeout()" * tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) drivers: net: slip: fix NPD bug in sl_tx_timeout() bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets bpf: Support dual-stack sockets in bpf_tcp_check_syncookie myri10ge: fix an incorrect free for skb in myri10ge_sw_tso net: usb: aqc111: Fix out-of-bounds accesses in RX fixup qede: confirm skb is allocated before using net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n net: phy: mscc-miim: reject clause 45 register accesses net: axiemac: use a phandle to reference pcs_phy dt-bindings: net: add pcs-handle attribute net: axienet: factor out phy_node in struct axienet_local net: axienet: setup mdio unconditionally net: sfc: fix using uninitialized xdp tx_queue rxrpc: fix a race in rxrpc_exit_net() net: openvswitch: fix leak of nested actions net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address() net: openvswitch: don't send internal clone attribute to the userspace. net: micrel: Fix KS8851 Kconfig ice: clear cmd_type_offset_bsz for TX rings ice: xsk: fix VSI state check in ice_xsk_wakeup() ...
2022-04-07	libbpf: Make BPF-side of USDT support work on big-endian machines	Ilya Leoshkevich
	BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of the actual argument size. On little-endian the relevant argument bits end up in the lower bits of val, and later on the code that handles all the argument types expects them to be there. On big-endian they end up in the upper bits of val, breaking that expectation. Fix by right-shifting val on big-endian. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com
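The byte-order issue described above can be reproduced in plain user-space C. This is a sketch under stated assumptions: `usdt_arg_from_raw` and its `big_endian` parameter are hypothetical and exist only so that both memory layouts can be exercised on any host; the actual libbpf code is not reproduced here. The argument is assumed to be unsigned and 1 to 8 bytes wide.

```c
#include <stdint.h>

/* Hypothetical model of an 8-byte read that starts at the argument's address.
 * raw[0..arg_sz-1] hold the argument; the remaining bytes are unrelated data.
 * arg_sz must be in the range 1..8. */
static uint64_t usdt_arg_from_raw(const unsigned char raw[8], int arg_sz,
				  int big_endian)
{
	unsigned int shift = 64 - (unsigned int)arg_sz * 8;
	uint64_t val = 0;
	int i;

	/* Assemble the 64-bit value the way a CPU of that endianness would */
	for (i = 0; i < 8; i++) {
		if (big_endian)
			val |= (uint64_t)raw[i] << (8 * (7 - i));
		else
			val |= (uint64_t)raw[i] << (8 * i);
	}

	/* On big-endian the argument sits in the upper bits: move it down */
	if (big_endian)
		val >>= shift;

	/* Common path: expects the argument in the lower bits and clears
	 * everything above arg_sz bytes (zero extension). */
	val <<= shift;
	val >>= shift;
	return val;
}
```

Without the big-endian right shift, the common path would keep the low-order bytes of the 8-byte read, which on big-endian belong to whatever follows the argument in memory.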
2022-04-07	libbpf: Minor style improvements in USDT code	Ilya Leoshkevich
	Fix several typos and references to non-existing headers. Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with the rest of the bpf code (see commit 45f2bebc8079 ("libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for rationale). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com
2022-04-07	libbpf: Fix use #ifdef instead of #if to avoid compiler warning	Andrii Nakryiko
	As reported by Naresh: perf build errors on i386 [1] on Linux next-20220407 [2] usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0 [-Werror=undef] 1181 | #if __x86_64__ | ^~~~~~~~~~ usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0 [-Werror=undef] 1196 | #if __x86_64__ | ^~~~~~~~~~ cc1: all warnings being treated as errors Use #ifdef instead of #if to avoid this. Fixes: 4c59e584d158 ("libbpf: Add x86-specific USDT arg spec parsing logic") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org
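The difference between the two directives can be shown in a few lines. The macro `BUILT_FOR_X86_64` and the function `built_for_x86_64` are hypothetical names used only for this sketch.

```c
/* '#if __x86_64__' evaluates the macro as an expression; when the macro is
 * not defined at all (for example on i386), -Wundef diagnoses it and -Werror
 * turns that into a build failure. '#ifdef' only asks whether the macro is
 * defined, so it builds cleanly on every architecture. */
#ifdef __x86_64__
#define BUILT_FOR_X86_64 1
#else
#define BUILT_FOR_X86_64 0
#endif

/* Hypothetical accessor so the result can be checked at run time */
static int built_for_x86_64(void)
{
	return BUILT_FOR_X86_64;
}
```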
2022-04-07	libbpf: Potential NULL dereference in usdt_manager_attach_usdt()	Haowen Bai
	link could be NULL but is still dereferenced in bpf_link__destroy(&link->link), which leads to a NULL pointer access. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649299098-2069-1-git-send-email-baihaowen@meizu.com
2022-04-07	selftests/bpf: Uprobe tests should verify param/return values	Alan Maguire
	uprobe/uretprobe tests don't do any validation of arguments/return values, and without this we can't be sure we are attached to the right function, or that we are indeed attached to a uprobe or uretprobe. To fix this, record the argument and return value for auto-attached functions and ensure these match expectations. We also need to filter by pid to ensure we do not pick up stray malloc()s, since auto-attach traces libc system-wide. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649245431-29956-4-git-send-email-alan.maguire@oracle.com
2022-04-07	libbpf: Improve string parsing for uprobe auto-attach	Alan Maguire
	For uprobe auto-attach, the parsing can be simplified for the SEC() name to a single sscanf(); the return value of the sscanf can then be used to distinguish between sections that simply specify "u[ret]probe" (and thus cannot auto-attach), those that specify "u[ret]probe/binary_path:function+offset" etc. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com
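The single-sscanf() approach can be sketched as below. The struct `uprobe_spec`, the function `parse_uprobe_sec`, the buffer sizes, and the exact format string are assumptions made for illustration and may differ from what libbpf uses. The scanset range syntax (`a-z`) relies on common libc behavior, as in glibc.

```c
#include <stdio.h>

/* Hypothetical container for the parsed pieces of a SEC() name */
struct uprobe_spec {
	char type[16];		/* "uprobe" or "uretprobe" */
	char path[256];		/* binary path */
	char func[128];		/* function name */
	long offset;		/* optional offset within the function */
};

/* Returns the number of fields sscanf() converted:
 *   1 -> only "u[ret]probe" given, so auto-attach is not possible
 *   3 -> "u[ret]probe/binary_path:function"
 *   4 -> "u[ret]probe/binary_path:function+offset"
 * Any other value indicates a malformed section name. */
static int parse_uprobe_sec(const char *sec, struct uprobe_spec *s)
{
	s->type[0] = s->path[0] = s->func[0] = '\0';
	s->offset = 0;
	return sscanf(sec, "%15[^/]/%255[^:]:%127[a-zA-Z0-9_.]+%li",
		      s->type, s->path, s->func, &s->offset);
}
```

Because `%li` accepts decimal, octal, and hexadecimal input, an offset written as `+0x20` parses to 32.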
2022-04-07	libbpf: Improve library identification for uprobe binary path resolution	Alan Maguire
	In the process of doing path resolution for uprobe attach, libraries are identified by matching a ".so" substring in the binary_path. This matches a lot of patterns that do not conform to library.so[.version] format, so instead match a ".so" _suffix_, and if that fails match a ".so." substring for the versioned library case. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com
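The two-step match described above can be sketched as follows. The function name `looks_like_shared_lib` is hypothetical; only the matching rule (".so" suffix first, then a ".so." substring) comes from the commit message.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: does binary_path follow the library.so[.version] form? */
static bool looks_like_shared_lib(const char *path)
{
	size_t len = strlen(path);

	/* Unversioned library: the path ends in ".so" */
	if (len >= 3 && strcmp(path + len - 3, ".so") == 0)
		return true;

	/* Versioned library: ".so." appears somewhere, e.g. "libc.so.6" */
	return strstr(path, ".so.") != NULL;
}
```

A path such as "/opt/my.sources/app" contains the substring ".so" and would have matched the old rule, but it is rejected by both checks here.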
2022-04-06	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Jakub Kicinski
	Alexei Starovoitov says: ==================== pull-request: bpf 2022-04-06 We've added 8 non-merge commits during the last 8 day(s) which contain a total of 9 files changed, 139 insertions(+), 36 deletions(-). The main changes are: 1) rethook related fixes, from Jiri and Masami. 2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin. 3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets bpf: Support dual-stack sockets in bpf_tcp_check_syncookie bpf: selftests: Test fentry tracing a struct_ops program bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT rethook: Fix to use WRITE_ONCE() for rethook:: Handler selftests/bpf: Fix warning comparing pointer to 0 bpf: Fix sparse warnings in kprobe_multi_resolve_syms bpftool: Explicit errno handling in skeletons ==================== Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>