path: root/tools
Age  Commit message  (Author)
2024-01-07 net/sched: Remove ipt action tests (Jamal Hadi Salim)
Commit ba24ea129126 ("net/sched: Retire ipt action") removed the ipt action but not the testcases. This patch removes the outstanding tdc tests. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-05 selftests: forwarding: Avoid failures to source net/lib.sh (Benjamin Poirier)
The expression "source ../lib.sh" added to net/forwarding/lib.sh in commit 25ae948b4478 ("selftests/net: add lib.sh") does not work for tests outside net/forwarding which source net/forwarding/lib.sh (1). It also does not work in some cases where only a subset of tests are exported (2).

Avoid the problems mentioned above by replacing the faulty expression with a copy of the content from net/lib.sh which is used by files under net/forwarding.

A more thorough solution which avoids duplicating content between net/lib.sh and net/forwarding/lib.sh has been posted here:
https://lore.kernel.org/netdev/20231222135836.992841-1-bpoirier@nvidia.com/

The approach in the current patch is a stopgap solution to avoid submitting large changes at the eleventh hour of this development cycle.

Example of problem 1)

    tools/testing/selftests/drivers/net/bonding$ ./dev_addr_lists.sh
    ./net_forwarding_lib.sh: line 41: ../lib.sh: No such file or directory
    TEST: bonding cleanup mode active-backup [ OK ]
    TEST: bonding cleanup mode 802.3ad [ OK ]
    TEST: bonding LACPDU multicast address to slave (from bond down) [ OK ]
    TEST: bonding LACPDU multicast address to slave (from bond up) [ OK ]

An error message is printed but since the test does not use functions from net/lib.sh, the test results are not affected.

Example of problem 2)

    tools/testing/selftests$ make install TARGETS="net/forwarding"
    tools/testing/selftests$ cd kselftest_install/net/forwarding/
    tools/testing/selftests/kselftest_install/net/forwarding$ ./pedit_ip.sh veth{0..3}
    lib.sh: line 41: ../lib.sh: No such file or directory
    TEST: ping [ OK ]
    TEST: ping6 [ OK ]
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth1 ingress pedit ip src set 198.51.100.1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth2 egress pedit ip src set 198.51.100.1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth1 ingress pedit ip dst set 198.51.100.1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth2 egress pedit ip dst set 198.51.100.1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth1 ingress pedit ip6 src set 2001:db8:2::1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth2 egress pedit ip6 src set 2001:db8:2::1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth1 ingress pedit ip6 dst set 2001:db8:2::1 [FAIL]
            Expected to get 10 packets, but got .
    ./pedit_ip.sh: line 135: busywait: command not found
    TEST: dev veth2 egress pedit ip6 dst set 2001:db8:2::1 [FAIL]
            Expected to get 10 packets, but got .

In this case, the test results are affected.

Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Suggested-by: Ido Schimmel <idosch@nvidia.com>
Suggested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240104141109.100672-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-05 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next (Jakub Kicinski)
Daniel Borkmann says:

====================
pull-request: bpf-next 2024-01-05

We've added 40 non-merge commits during the last 2 day(s) which contain a total of 73 files changed, 1526 insertions(+), 951 deletions(-).

The main changes are:

1) Fix a memory leak when streaming AF_UNIX sockets were inserted into multiple sockmap slots/maps, from John Fastabend.

2) Fix gotol in s390 BPF JIT with large offsets, from Ilya Leoshkevich.

3) Fix reattachment branch in bpf_tracing_prog_attach() and reject the request if there is no valid attach_btf, from Jiri Olsa.

4) Remove deprecated bpfilter kernel leftovers given the project is developed in user space (https://github.com/facebook/bpfilter), from Quentin Deslandes.

5) Relax tracing BPF program recursive attach rules given right now it is not possible to create tracing program call cycles, from Dmitrii Dolgov.

6) Fix excessive memory consumption for the bpf_global_percpu_ma for systems with a large number of CPUs, from Yonghong Song.

7) Small x86 BPF JIT cleanup to reuse emit_nops instead of open-coding memcpy of x86_nops, from Leon Hwang.

8) Follow-up for libbpf to support __arg_ctx global function argument tag semantics to complement the merged kernel side, from Andrii Nakryiko.

9) Introduce "volatile compare" macros for BPF selftests in order to make the latter more robust against compiler optimization, from Alexei Starovoitov.

10) Small simplification in verifier's size checking of helper accesses along with additional selftests, from Andrei Matei.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (40 commits)
  selftests/bpf: Test re-attachment fix for bpf_tracing_prog_attach
  bpf: Fix re-attachment branch in bpf_tracing_prog_attach
  selftests/bpf: Add test for recursive attachment of tracing progs
  bpf: Relax tracing prog recursive attach rules
  bpf, x86: Use emit_nops to replace memcpy x86_nops
  selftests/bpf: Test gotol with large offsets
  selftests/bpf: Double the size of test_loader log
  s390/bpf: Fix gotol with large offsets
  bpfilter: remove bpfilter
  bpf: Remove unnecessary cpu == 0 check in memalloc
  selftests/bpf: add __arg_ctx BTF rewrite test
  selftests/bpf: add arg:ctx cases to test_global_funcs tests
  libbpf: implement __arg_ctx fallback logic
  libbpf: move BTF loading step after relocation step
  libbpf: move exception callbacks assignment logic into relocation step
  libbpf: use stable map placeholder FDs
  libbpf: don't rely on map->fd as an indicator of map being created
  libbpf: use explicit map reuse flag to skip map creation steps
  libbpf: make uniform use of btf__fd() accessor inside libbpf
  selftests/bpf: Add a selftest with > 512-byte percpu allocation size
  ...
====================

Link: https://lore.kernel.org/r/20240105170105.21070-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-05 selftests/net: fix GRO coalesce test and add ext header coalesce tests (Richard Gobert)
Currently there is no test which checks that IPv6 extension header packets successfully coalesce. This commit adds a test, which verifies two IPv6 packets with HBH extension headers do coalesce, and another test which checks that packets with different extension header data do not coalesce in GRO. I changed the receive socket filter to accept a packet with one extension header. This change exposed a bug in the fragment test -- the old BPF did not accept the fragment packet. I updated correct_num_packets in the fragment test accordingly. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/69282fed-2415-47e8-b3d3-34939ec3eb56@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04 selftests/bpf: Test re-attachment fix for bpf_tracing_prog_attach (Dmitrii Dolgov)
Add a test case to verify the fix for "prog->aux->dst_trampoline and tgt_prog is NULL" branch in bpf_tracing_prog_attach. The sequence of events:

1. load rawtp program
2. load fentry program with rawtp as target_fd
3. create tracing link for fentry program with target_fd = 0
4. repeat 3

Acked-by: Jiri Olsa <olsajiri@gmail.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Link: https://lore.kernel.org/r/20240103190559.14750-5-9erthalion6@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04 selftests/bpf: Add test for recursive attachment of tracing progs (Dmitrii Dolgov)
Verify the fact that only one fentry prog could be attached to another fentry, building up an attachment chain of limited size. Use existing bpf_testmod as a start of the chain. Acked-by: Jiri Olsa <olsajiri@gmail.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20240103190559.14750-3-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (Jakub Kicinski)
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()")
  0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL")
https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04 Merge tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (Linus Torvalds)
Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless and netfilter.

  We haven't accumulated much over the break. If it wasn't for the uninterrupted stream of fixes for Intel drivers this PR would be very slim. There was a handful of user reports, however, either they stood out because of the lower traffic or users have had more time to test over the break. The ones which are v6.7-relevant should be wrapped up.

  Current release - regressions:

   - Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum required", it caused issues on networks where routers send prefixes with preferred_lft=0

   - wifi:
      - iwlwifi: pcie: don't synchronize IRQs from IRQ, prevent deadlock
      - mac80211: fix re-adding debugfs entries during reconfiguration

  Current release - new code bugs:

   - tcp: print AO/MD5 messages only if there are any keys

  Previous releases - regressions:

   - virtio_net: fix missing dma unmap for resize, prevent OOM

  Previous releases - always broken:

   - mptcp: prevent tcp diag from closing listener subflows

   - nf_tables:
      - set transport header offset for egress hook, fix IPv4 mangling
      - skip set commit for deleted/destroyed sets, avoid double deactivation

   - nat: make sure action is set for all ct states, fix openvswitch matching on ICMP packets in related state

   - eth: mlxbf_gige: fix receive hang under heavy traffic

   - eth: r8169: fix PCI error on system resume for RTL8168FP

   - net: add missing getsockopt(SO_TIMESTAMPING_NEW) and cmsg handling"

* tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  net/tcp: Only produce AO/MD5 logs if there are any keys
  net: Implement missing SO_TIMESTAMPING_NEW cmsg support
  bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
  net: ravb: Wait for operating mode to be applied
  asix: Add check for usbnet_get_endpoints
  octeontx2-af: Re-enable MAC TX in otx2_stop processing
  octeontx2-af: Always configure NIX TX link credits based on max frame size
  net/smc: fix invalid link access in dumping SMC-R connections
  net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
  virtio_net: fix missing dma unmap for resize
  igc: Fix hicredit calculation
  ice: fix Get link status data length
  i40e: Restore VF MSI-X state during PCI reset
  i40e: fix use-after-free in i40e_aqc_add_filters()
  net: Save and restore msg_namelen in sock_sendmsg
  netfilter: nft_immediate: drop chain reference counter on error
  netfilter: nf_nat: fix action not being set for all ct states
  net: bcmgenet: Fix FCS generation for fragmented skbuffs
  mptcp: prevent tcp diag from closing listener subflows
  MAINTAINERS: add Geliang as reviewer for MPTCP
  ...
2024-01-04 selftests/bpf: Test gotol with large offsets (Ilya Leoshkevich)
Test gotol with offsets that don't fit into a short (i.e., larger than 32k or smaller than -32k). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240102193531.3169422-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
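The 16-bit limit mentioned in this message can be illustrated with a few lines of C. This is a minimal sketch, not code from the selftest: the helper name `needs_gotol` is hypothetical, and it assumes the usual BPF encoding in which regular jumps carry a signed 16-bit offset field while gotol carries a signed 32-bit one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when a jump offset (counted in
 * instructions) does not fit into the signed 16-bit offset field of a
 * regular BPF jump and therefore needs the 32-bit "gotol" form. */
static bool needs_gotol(int32_t insn_off)
{
	return insn_off < INT16_MIN || insn_off > INT16_MAX;
}
```

An offset of 40000 instructions, the order of magnitude the neighbouring test_loader commit mentions, falls outside the 16-bit range, so `needs_gotol(40000)` is true.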
2024-01-04 selftests/bpf: Double the size of test_loader log (Ilya Leoshkevich)
Testing long jumps requires having >32k instructions. That many instructions require the verifier log buffer of 2 megabytes. The regular test_progs run doesn't need an increased buffer, since gotol test with 40k instructions doesn't request a log, but test_progs -v will set the verifier log level. Hence to avoid breaking gotol test with -v increase the buffer size. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240102193531.3169422-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04 bpfilter: remove bpfilter (Quentin Deslandes)
bpfilter was supposed to convert iptables filtering rules into BPF programs on the fly, from the kernel, through a usermode helper. The base code for the UMH was introduced in 2018 (1), and a couple of attempts (2, 3) tried to introduce the BPF program generation features but were abandoned.

bpfilter now sits in the kernel tree unused and unusable, occasionally causing confusion amongst Linux users (4, 5).

As bpfilter is now developed in a dedicated repository on GitHub (6), it was suggested a couple of times this year (LSFMM/BPF 2023, LPC 2023) to remove the deprecated kernel part of the project. This is the purpose of this patch.

[1]: https://lore.kernel.org/lkml/20180522022230.2492505-1-ast@kernel.org/
[2]: https://lore.kernel.org/bpf/20210829183608.2297877-1-me@ubique.spb.ru/#t
[3]: https://lore.kernel.org/lkml/20221224000402.476079-1-qde@naccy.de/
[4]: https://dxuuu.xyz/bpfilter.html
[5]: https://github.com/linuxkit/linuxkit/pull/3904
[6]: https://github.com/facebook/bpfilter

Signed-off-by: Quentin Deslandes <qde@naccy.de>
Link: https://lore.kernel.org/r/20231226130745.465988-1-qde@naccy.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 selftests/bpf: add __arg_ctx BTF rewrite test (Andrii Nakryiko)
Add a test validating that libbpf uploads BTF and func_info with rewritten type information for arguments of global subprogs that are marked with __arg_ctx tag. Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 selftests/bpf: add arg:ctx cases to test_global_funcs tests (Andrii Nakryiko)
Add a few extra cases of global funcs with context arguments. This time rely on "arg:ctx" decl_tag (__arg_ctx macro), but put it next to "classic" cases where context argument has to be of an exact type that BPF verifier expects (e.g., bpf_user_pt_regs_t for kprobe/uprobe). Colocating all these cases separately from other global func args that rely on arg:xxx decl tags (in verifier_global_subprogs.c) allows for simpler backwards compatibility testing on old kernels. All the cases in test_global_func_ctx_args.c are supposed to work on older kernels, which was manually validated during development. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 libbpf: implement __arg_ctx fallback logic (Andrii Nakryiko)
Out of all special global func arg tag annotations, __arg_ctx is practically the most immediately useful and most critical to have working across a multitude of kernel versions, if possible. This would allow end users to write much simpler code if __arg_ctx semantics worked for older kernels that don't natively understand btf_decl_tag("arg:ctx") in verifier logic.

Luckily, it is possible to ensure __arg_ctx works on old kernels through a bit of extra work done by libbpf, at least in a lot of common cases.

To explain the overall idea, we need to go back to how context argument was supported in global funcs before __arg_ctx support was added. This was done based on special struct name checks in kernel. E.g., for BPF_PROG_TYPE_PERF_EVENT the expectation is that argument type `struct bpf_perf_event_data *` marks that argument as PTR_TO_CTX. This is all good as long as global function is used from the same BPF program types only, which is often not the case. If the same subprog has to be called from, say, kprobe and perf_event program types, there is no single definition that would satisfy BPF verifier. Subprog will have context argument either for kprobe (if using bpf_user_pt_regs_t struct name) or perf_event (with bpf_perf_event_data struct name), but not both.

This limitation was the reason to add btf_decl_tag("arg:ctx"), making the actual argument type not important, so that user can just define "generic" signature:

  __noinline int global_subprog(void *ctx __arg_ctx) { ... }

I won't belabor how libbpf is implementing subprograms, see a huge comment next to bpf_object_relocate_calls() function. The idea is that each main/entry BPF program gets its own copy of global_subprog's code appended.

This per-program copy of global subprog code *and* associated func_info .BTF.ext information, pointing to FUNC -> FUNC_PROTO BTF type chain allows libbpf to simulate __arg_ctx behavior transparently, even if the kernel doesn't yet support __arg_ctx annotation natively.
The idea is straightforward: each time we append global subprog's code and func_info information, we adjust its FUNC -> FUNC_PROTO type information, if necessary (that is, libbpf can detect the presence of btf_decl_tag("arg:ctx") just like BPF verifier would do it).

The rest is just mechanical and somewhat painful BTF manipulation code. It's painful because we need to clone FUNC -> FUNC_PROTO, instead of reusing it, as the same FUNC -> FUNC_PROTO chain might be used by another main BPF program within the same BPF object, so we can't just modify it in-place (and cloning BTF types within the same struct btf object is painful due to constant memory invalidation, see comments in code). Uploaded BPF object's BTF information has to work for all BPF programs at the same time.

Once we have FUNC -> FUNC_PROTO clones, we make sure that instead of using some `void *ctx` parameter definition, we have an expected `struct bpf_perf_event_data *ctx` definition (as far as BPF verifier and kernel are concerned), which will mark it as context for BPF verifier. Same global subprog relocated and copied into another main BPF program will get different type information according to main program's type. It all works out in the end in a completely transparent way for end user.

Libbpf maintains an internal program type -> expected context struct name mapping. Note, not all BPF program types have named context struct, so this approach won't work for such programs (just like it didn't before __arg_ctx). So native __arg_ctx is still important to have in kernel to have generic context support across all BPF program types.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240104013847.3875810-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
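The "program type -> expected context struct name" mapping described in this message can be pictured as a small lookup table. The following is only an illustrative sketch under stated assumptions, not libbpf's actual code: the enum `prog_type`, the table `ctx_map` and the function `expected_ctx_name` are hypothetical names, and only the two pairs named in the message are listed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BPF program types; real code would use
 * enum bpf_prog_type from the UAPI headers. */
enum prog_type { PROG_KPROBE, PROG_PERF_EVENT, PROG_OTHER };

/* Program type -> expected context type name. Only the two pairs named
 * in the commit message are listed here. */
static const struct {
	enum prog_type type;
	const char *ctx_name;
} ctx_map[] = {
	{ PROG_KPROBE,     "bpf_user_pt_regs_t" },
	{ PROG_PERF_EVENT, "bpf_perf_event_data" },
};

/* Returns the type name to substitute for a `void *ctx __arg_ctx`
 * argument, or NULL when the program type has no named context type,
 * in which case the fallback cannot apply. */
static const char *expected_ctx_name(enum prog_type type)
{
	for (size_t i = 0; i < sizeof(ctx_map) / sizeof(ctx_map[0]); i++)
		if (ctx_map[i].type == type)
			return ctx_map[i].ctx_name;
	return NULL;
}
```

The NULL case corresponds to the limitation noted in the message: program types without a named context struct still need native __arg_ctx support in the kernel.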
2024-01-03 libbpf: move BTF loading step after relocation step (Andrii Nakryiko)
With all the preparations in previous patches done we are ready to postpone BTF loading and sanitization step until after all the relocations are performed. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 libbpf: move exception callbacks assignment logic into relocation step (Andrii Nakryiko)
Move the logic of finding and assigning exception callback indices from BTF sanitization step to program relocations step, which seems more logical and will unblock moving BTF loading to after relocation step. Exception callbacks discovery and assignment has no dependency on BTF being loaded into the kernel, it only uses BTF information. It does need to happen before subprogram relocations happen, though. Which is why the split. No functional changes. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 libbpf: use stable map placeholder FDs (Andrii Nakryiko)
Move map creation to later during BPF object loading by pre-creating stable placeholder FDs (utilizing memfd_create()). Use dup2() syscall to then atomically make those placeholder FDs point to real kernel BPF map objects. This change allows to delay BPF map creation to after all the BPF program relocations. That, in turn, allows to delay BTF finalization and loading into kernel to after all the relocations as well. We'll take advantage of the latter in subsequent patches to allow libbpf to adjust BTF in a way that helps with BPF global function usage. Clean up a few places where we close map->fd, which now shouldn't happen, because map->fd should be a valid FD regardless of whether map was created or not. Surprisingly and nicely it simplifies a bunch of error handling code. If this change doesn't backfire, I'm tempted to pre-create such stable FDs for other entities (progs, maybe even BTF). We previously did some manipulations to make gen_loader work with fake map FDs, with stable map FDs this hack is not necessary for maps (we still have it for BTF, but I left it as is for now). Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
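The placeholder technique described in this message (reserve an FD number with memfd_create(), later redirect it with dup2()) can be tried out in userspace. The sketch below is an illustration under stated assumptions, not libbpf's implementation: all function names are hypothetical, and a second memfd holding the string "real" stands in for a kernel BPF map, since creating a real map needs privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve a stable FD number early, before the real object exists. */
static int placeholder_fd_create(void)
{
	return memfd_create("placeholder", MFD_CLOEXEC);
}

/* Atomically make fixed_fd refer to whatever real_fd refers to, then drop
 * the temporary real_fd. The FD number handed out earlier stays the same.
 * Note that dup2() leaves close-on-exec cleared on fixed_fd.
 * Returns 0 on success, -1 on error. */
static int placeholder_fd_replace(int fixed_fd, int real_fd)
{
	if (dup2(real_fd, fixed_fd) < 0)
		return -1;
	close(real_fd);
	return 0;
}

/* Demo: returns 0 when reading through the original FD number yields
 * the content of the "real" object. Error paths do not clean up. */
static int placeholder_demo(void)
{
	char buf[5] = "";
	int fixed_fd = placeholder_fd_create();
	int real_fd = memfd_create("real-object", MFD_CLOEXEC);

	if (fixed_fd < 0 || real_fd < 0)
		return -1;
	if (write(real_fd, "real", 4) != 4)
		return -1;
	if (placeholder_fd_replace(fixed_fd, real_fd) < 0)
		return -1;
	/* fixed_fd now shares the file description, including its offset. */
	if (lseek(fixed_fd, 0, SEEK_SET) != 0 || read(fixed_fd, buf, 4) != 4)
		return -1;
	close(fixed_fd);
	return strcmp(buf, "real") == 0 ? 0 : -1;
}
```

Calling `placeholder_demo()` from a `main()` on Linux should return 0. The point of the design is that any code that recorded the placeholder's FD number before the real object existed keeps working unchanged afterwards.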
2024-01-03 libbpf: don't rely on map->fd as an indicator of map being created (Andrii Nakryiko)
With the upcoming switch to preallocated placeholder FDs for maps, switch various getters/setters away from checking map->fd. Use map_is_created() helper that detects whether BPF map can be modified based on map->obj->loaded state, with special provision for maps set up with bpf_map__reuse_fd(). For backwards compatibility, we take map_is_created() into account in bpf_map__fd() getter as well. This way before bpf_object__load() phase bpf_map__fd() will always return -1, just as before the changes in subsequent patches adding stable map->fd placeholders. We also get rid of all internal uses of bpf_map__fd() getter, as it's more oriented for uses external to libbpf. The above map_is_created() check actually interferes with some of the internal uses, if map FD is fetched through bpf_map__fd(). Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
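The getter behaviour described in this message can be modelled with toy types. This is a hedged sketch, not the libbpf source: `struct toy_object`, `struct toy_map` and both functions are hypothetical, and the `reused` field is an assumption standing in for the "special provision for maps set up with bpf_map__reuse_fd()".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for libbpf's internal structs; only the fields the
 * commit message talks about are modelled. */
struct toy_object {
	bool loaded;
};

struct toy_map {
	struct toy_object *obj;
	int fd;       /* may hold a placeholder FD before load */
	bool reused;  /* assumption: set when an existing map FD was supplied */
};

/* A map counts as created once its object is loaded, or right away if
 * it reuses an already existing map. */
static bool toy_map_is_created(const struct toy_map *map)
{
	return map->obj->loaded || map->reused;
}

/* Public-style getter: hide the FD until the map is really created. */
static int toy_map_fd(const struct toy_map *map)
{
	return toy_map_is_created(map) ? map->fd : -1;
}
```

With this shape, a caller that asks for the FD before load still sees -1 even though `fd` may already hold a placeholder, which is the backwards-compatible behaviour the message describes.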
2024-01-03 libbpf: use explicit map reuse flag to skip map creation steps (Andrii Nakryiko)
Instead of inferring whether map already points to a previously created/pinned BPF map (which user can specify with the bpf_map__reuse_fd() API), use explicit map->reused flag that is set in such case. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 libbpf: make uniform use of btf__fd() accessor inside libbpf (Andrii Nakryiko)
It makes future grepping and code analysis a bit easier. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 selftests/bpf: Add a selftest with > 512-byte percpu allocation size (Yonghong Song)
Add a selftest to capture the verification failure when the allocation size is greater than 512. Acked-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231222031812.1293190-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 selftests/bpf: Cope with 512 bytes limit with bpf_global_percpu_ma (Yonghong Song)
In the previous patch, the maximum data size for bpf_global_percpu_ma is 512 bytes. This breaks selftest test_bpf_ma. The test is adjusted in two aspects:

  - Since the maximum allowed data size for bpf_global_percpu_ma is 512, remove all tests beyond that, namely sizes 1024, 2048 and 4096.
  - Previously the percpu data size is bucket_size - 8 in order to avoid percpu allocation into the next bucket. This patch removed such data size adjustment thanks to Patch 1.

Also, a better way to generate BTF type is used than adding a member to the value struct.

Acked-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231222031807.1292853-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-03 selftests/net: change shebang to bash to support "source" (Yujie Liu)
The patch set [1] added a general lib.sh in net selftests, and converted several test scripts to source the lib.sh.

unicast_extensions.sh (converted in [1]) and pmtu.sh (converted in [2]) have a /bin/sh shebang which may point to various shells in different distributions, but "source" is only available in some of them. For example, "source" is a built-in command in bash, but it cannot be used in dash.

As in the other scripts that were converted together, simply change the shebang to bash to fix the following issues when the default /bin/sh points to other shells.

    not ok 51 selftests: net: unicast_extensions.sh # exit=1

v1 -> v2:
  - Fix pmtu.sh which has the same issue as unicast_extensions.sh, suggested by Hangbin
  - Change the style of the "source" line to be consistent with other tests, suggested by Hangbin

Link: https://lore.kernel.org/all/20231202020110.362433-1-liuhangbin@gmail.com/ [1]
Link: https://lore.kernel.org/all/20231219094856.1740079-1-liuhangbin@gmail.com/ [2]
Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 378f082eaf37 ("selftests/net: convert pmtu.sh to run it in unique namespace")
Fixes: 0f4765d0b48d ("selftests/net: convert unicast_extensions.sh to run it in unique namespace")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20231229131931.3961150-1-yujie.liu@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03 bpf: sockmap, add tests for proto updates replace socket (John Fastabend)
Add test that replaces the same socket with itself. This exercises a corner case where old element and new element have the same psock. Test protocols: TCP, UDP, stream af_unix and dgram af_unix. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20231221232327.43678-6-john.fastabend@gmail.com
2024-01-03 bpf: sockmap, add tests for proto updates single socket to many map (John Fastabend)
Add test with multiple maps where each socket is inserted in multiple maps. Test protocols: TCP, UDP, stream af_unix and dgram af_unix. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20231221232327.43678-5-john.fastabend@gmail.com
2024-01-03 bpf: sockmap, add tests for proto updates many to single map (John Fastabend)
Add test with a single map where each socket is inserted multiple times. Test protocols: TCP, UDP, stream af_unix and dgram af_unix. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20231221232327.43678-4-john.fastabend@gmail.com
2024-01-03 selftests/bpf: Convert profiler.c to bpf_cmp. (Alexei Starovoitov)
Convert profiler[123].c to "volatile compare" to compare barrier_var() approach vs bpf_cmp_likely() vs bpf_cmp_unlikely().

bpf_cmp_unlikely() produces correct code, but takes much longer to verify:

./veristat -C -e prog,insns,states before after_with_unlikely
Program                               Insns (A)  Insns (B)  Insns (DIFF)        States (A)  States (B)  States (DIFF)
------------------------------------  ---------  ---------  ------------------  ----------  ----------  -----------------
kprobe__proc_sys_write                     1603      19606  +18003 (+1123.08%)         123        1678  +1555 (+1264.23%)
kprobe__vfs_link                          11815      70305   +58490 (+495.05%)         971        4967   +3996 (+411.53%)
kprobe__vfs_symlink                        5464      42896   +37432 (+685.07%)         434        3126   +2692 (+620.28%)
kprobe_ret__do_filp_open                   5641      44578   +38937 (+690.25%)         446        3162   +2716 (+608.97%)
raw_tracepoint__sched_process_exec         2770      35962  +33192 (+1198.27%)         226        3121  +2895 (+1280.97%)
raw_tracepoint__sched_process_exit         1526       2135      +609 (+39.91%)         133         208      +75 (+56.39%)
raw_tracepoint__sched_process_fork          265        337       +72 (+27.17%)          19          24       +5 (+26.32%)
tracepoint__syscalls__sys_enter_kill      18782     140407  +121625 (+647.56%)        1286       12176  +10890 (+846.81%)

bpf_cmp_likely() is equivalent to barrier_var():

./veristat -C -e prog,insns,states before after_with_likely
Program                               Insns (A)  Insns (B)  Insns (DIFF)    States (A)  States (B)  States (DIFF)
------------------------------------  ---------  ---------  --------------  ----------  ----------  -------------
kprobe__proc_sys_write                     1603       1663    +60 (+3.74%)         123         127    +4 (+3.25%)
kprobe__vfs_link                          11815      12090   +275 (+2.33%)         971         971    +0 (+0.00%)
kprobe__vfs_symlink                        5464       5448    -16 (-0.29%)         434         426    -8 (-1.84%)
kprobe_ret__do_filp_open                   5641       5739    +98 (+1.74%)         446         446    +0 (+0.00%)
raw_tracepoint__sched_process_exec         2770       2608   -162 (-5.85%)         226         216   -10 (-4.42%)
raw_tracepoint__sched_process_exit         1526       1526     +0 (+0.00%)         133         133    +0 (+0.00%)
raw_tracepoint__sched_process_fork          265        265     +0 (+0.00%)          19          19    +0 (+0.00%)
tracepoint__syscalls__sys_enter_kill      18782      18970   +188 (+1.00%)        1286        1286    +0 (+0.00%)
kprobe__proc_sys_write                     2700       2809   +109 (+4.04%)         107         109    +2 (+1.87%)
kprobe__vfs_link                          12238      12366   +128 (+1.05%)         267         269    +2 (+0.75%)
kprobe__vfs_symlink                        7139       7365   +226 (+3.17%)         167         175    +8 (+4.79%)
kprobe_ret__do_filp_open                   7264       7070   -194 (-2.67%)         180         182    +2 (+1.11%)
raw_tracepoint__sched_process_exec         3768       3453   -315 (-8.36%)         211         199   -12 (-5.69%)
raw_tracepoint__sched_process_exit         3138       3138     +0 (+0.00%)          83          83    +0 (+0.00%)
raw_tracepoint__sched_process_fork          265        265     +0 (+0.00%)          19          19    +0 (+0.00%)
tracepoint__syscalls__sys_enter_kill      26679      24327  -2352 (-8.82%)        1067        1037   -30 (-2.81%)
kprobe__proc_sys_write                     1833       1833     +0 (+0.00%)         157         157    +0 (+0.00%)
kprobe__vfs_link                           9995      10127   +132 (+1.32%)         803         803    +0 (+0.00%)
kprobe__vfs_symlink                        5606       5672    +66 (+1.18%)         451         451    +0 (+0.00%)
kprobe_ret__do_filp_open                   5716       5782    +66 (+1.15%)         462         462    +0 (+0.00%)
raw_tracepoint__sched_process_exec         3042       3042     +0 (+0.00%)         278         278    +0 (+0.00%)
raw_tracepoint__sched_process_exit         1680       1680     +0 (+0.00%)         146         146    +0 (+0.00%)
raw_tracepoint__sched_process_fork          299        299     +0 (+0.00%)          25          25    +0 (+0.00%)
tracepoint__syscalls__sys_enter_kill      18372      18372     +0 (+0.00%)        1558        1558    +0 (+0.00%)

default (mcpu=v3), no_alu32, cpuv4 have similar differences.

Note one place where bpf_nop_mov() is used to workaround the verifier lack of link between the scalar register and its spill to stack.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231226191148.48536-7-alexei.starovoitov@gmail.com
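The percentages in the (DIFF) columns of the veristat output are consistent with a plain relative change, (B - A) / A * 100. A minimal sketch of that arithmetic follows; the function name `diff_pct` is hypothetical and this is not veristat's own code.

```c
#include <assert.h>

/* Relative change from a to b in percent, matching the "(DIFF)" columns:
 * (b - a) / a * 100. The caller must ensure a != 0. */
static double diff_pct(long a, long b)
{
	return (double)(b - a) / (double)a * 100.0;
}
```

For the first row of the bpf_cmp_unlikely() comparison, `diff_pct(1603, 19606)` is about 1123.08, in line with the reported "+18003 (+1123.08%)".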
2024-01-03bpf: Add bpf_nop_mov() asm macro.Alexei Starovoitov
The bpf_nop_mov(var) asm macro emits a no-op register move: rX = rX. If 'var' is a scalar and not a fixed constant, the verifier will assign an ID to it. If it is later spilled, the stack slot will carry that ID as well. Hence the range-refining comparison "if rX < const" will update all copies, including the spilled slot. This macro is a temporary workaround until the verifier gets smarter.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231226191148.48536-6-alexei.starovoitov@gmail.com
2024-01-03selftests/bpf: Remove bpf_assert_eq-like macros.Alexei Starovoitov
Since the last user was converted to bpf_cmp, remove bpf_assert_eq/ne/... macros. __bpf_assert_op() macro is kept for experiments, since it's slightly more efficient than bpf_assert(bpf_cmp_unlikely()) until LLVM is fixed. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20231226191148.48536-5-alexei.starovoitov@gmail.com
2024-01-03selftests/bpf: Convert exceptions_assert.c to bpf_cmpAlexei Starovoitov
Convert exceptions_assert.c to bpf_cmp_unlikely() macro.

Since

  bpf_assert(bpf_cmp_unlikely(var, ==, 100));
  other code;

will generate assembly code:

    if r1 == 100 goto L2;
    r0 = 0
    call bpf_throw
  L1:
    other code;
    ...

  L2: goto L1;

LLVM generates redundant basic block with extra goto. LLVM will be fixed eventually. Right now it's less efficient than __bpf_assert(var, ==, 100) macro that produces:

    if r1 == 100 goto L1;
    r0 = 0
    call bpf_throw
  L1:
    other code;

But extra goto doesn't hurt the verification process.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-4-alexei.starovoitov@gmail.com
2024-01-03bpf: Introduce "volatile compare" macrosAlexei Starovoitov
Compilers optimize conditional operators at will, but often bpf programmers want to force compilers to keep the same operator in asm as it's written in C. Introduce bpf_cmp_likely/unlikely(var1, conditional_op, var2) macros that can be used as:

  -               if (seen >= 1000)
  +               if (bpf_cmp_unlikely(seen, >=, 1000))

The macros take advantage of BPF assembly that is C like.

The macros check the sign of variable 'seen' and emit either a signed or an unsigned compare.

For example:

  int a;
  bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX s> 0 goto' in BPF assembly.

  unsigned int a;
  bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX > 0 goto' in BPF assembly.

C type conversions coupled with comparison operators are tricky.

  int i = -1;
  unsigned int j = 1;
  if (i < j) // this is false.

  long i = -1;
  unsigned int j = 1;
  if (i < j) // this is true.

Make sure the BPF program is compiled with -Wsign-compare; then the macros will catch the mistake.

The macros check the LHS (left hand side) only to figure out the sign of the compare. 'if 0 < rX goto' is not allowed in the assembly, so the users have to use a variable on the LHS anyway.

The patch updates a few tests to demonstrate the use of the macros.

The macro allows the use of BPF_JSET in C code, since LLVM doesn't generate it at present. For example:

  if (i & j) compiles into r0 &= r1; if r0 == 0 goto

while

  if (bpf_cmp_unlikely(i, &, j)) compiles into if r0 & r1 goto

Note that the macros have to be careful with the RHS assembly predicate. Since:

  u64 __rhs = 1ull << 42;
  asm goto("if r0 < %[rhs] goto +1" :: [rhs] "ri" (__rhs));

LLVM will silently truncate the 64-bit constant into an s32 imm.

Note that in [lhs] "r"((short)LHS) the type cast is a workaround for an LLVM issue. When LHS is exactly 32-bit LLVM emits redundant <<=32, >>=32 to zero upper 32-bits. When LHS is a 64 or 16 or 8-bit variable there are no shifts. When LHS is 32-bit the (u64) cast doesn't help. Hence use the (short) cast. It does _not_ truncate the variable before it's assigned to a register.
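The sign-based choice of compare can be illustrated on the host with a minimal sketch. It is not the kernel macro: IS_SIGNED_EXPR, bpf_asm_op() and BPF_ASM_OP_FOR are hypothetical names introduced here, and the sketch only returns the BPF assembly spelling of the operator as a string instead of emitting asm goto. It assumes a compiler with the GNU __typeof__ extension (gcc or clang).

```c
#include <stdbool.h>
#include <string.h>

/* True when the type of expression x is signed: (T)-1 < (T)1 holds only
 * for signed T. Hypothetical helper, assumes GNU __typeof__. */
#define IS_SIGNED_EXPR(x) (((__typeof__(x))-1) < (__typeof__(x))1)

/* Hypothetical helper: map a C comparison operator to the spelling used in
 * BPF assembly, based on the signedness of the left-hand side only. */
static const char *bpf_asm_op(const char *c_op, bool lhs_signed)
{
	/* ==, != and & (BPF_JSET) have no separate signed form. */
	if (!lhs_signed)
		return c_op;
	if (strcmp(c_op, "<") == 0)
		return "s<";
	if (strcmp(c_op, "<=") == 0)
		return "s<=";
	if (strcmp(c_op, ">") == 0)
		return "s>";
	if (strcmp(c_op, ">=") == 0)
		return "s>=";
	return c_op;
}

/* Stringify the operator token and look at the LHS type only. */
#define BPF_ASM_OP_FOR(lhs, op) bpf_asm_op(#op, IS_SIGNED_EXPR(lhs))
```

With an int on the left, BPF_ASM_OP_FOR(a, >) yields "s>", matching the 'if rX s> 0 goto' form quoted above; with an unsigned int it yields ">".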
Traditional likely()/unlikely() macros that use __builtin_expect(!!(x), 1 or 0) have no effect on these macros, hence macros implement the logic manually. bpf_cmp_unlikely() macro preserves compare operator as-is while bpf_cmp_likely() macro flips the compare.

Consider two cases:

A.
  for() {
    if (foo >= 10) {
      bar += foo;
    }
    other code;
  }

B.
  for() {
    if (foo >= 10)
       break;
    other code;
  }

It's ok to use either bpf_cmp_likely or bpf_cmp_unlikely macros in both cases, but consider that 'break' is effectively 'goto out_of_the_loop'. Hence it's better to use bpf_cmp_unlikely in the B case. While 'bar += foo' is better to keep as 'fallthrough' == likely code path in the A case.

When it's written as:

A.
  for() {
    if (bpf_cmp_likely(foo, >=, 10)) {
      bar += foo;
    }
    other code;
  }

B.
  for() {
    if (bpf_cmp_unlikely(foo, >=, 10))
       break;
    other code;
  }

The assembly will look like:

A.
  for() {
    if r1 < 10 goto L1;
      bar += foo;
  L1:
    other code;
  }

B.
  for() {
    if r1 >= 10 goto L2;
    other code;
  }
  L2:

The bpf_cmp_likely vs bpf_cmp_unlikely changes basic block layout, hence it will greatly influence the verification process. The number of processed instructions will be different, since the verifier walks the fallthrough first.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-3-alexei.starovoitov@gmail.com
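The "flip the compare" step that the message attributes to bpf_cmp_likely() can be sketched as a lookup of the logically opposite operator. This is an illustration only: negate_cmp_op() is a hypothetical helper, not part of the patch, and it works on operator strings rather than emitting assembly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the operator that is true exactly when c_op
 * is false, or NULL when the operator has no simple opposite here. */
static const char *negate_cmp_op(const char *c_op)
{
	static const char *const pairs[][2] = {
		{ "==", "!=" }, { "!=", "==" },
		{ "<",  ">=" }, { ">=", "<"  },
		{ ">",  "<=" }, { "<=", ">"  },
	};

	for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (strcmp(c_op, pairs[i][0]) == 0)
			return pairs[i][1];
	return NULL;
}
```

For case A above, negate_cmp_op(">=") gives "<", which corresponds to the 'if r1 < 10 goto L1' form that keeps 'bar += foo' on the fallthrough path.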
2024-01-03selftests/bpf: Attempt to build BPF programs with -Wsign-compareAlexei Starovoitov
GCC's -Wall includes -Wsign-compare while clang does not. Since BPF programs are built with clang we need to add this flag explicitly to catch problematic comparisons like:

  int i = -1;
  unsigned int j = 1;
  if (i < j) // this is false.

  long i = -1;
  unsigned int j = 1;
  if (i < j) // this is true.

C standard for reference:

- If either operand is unsigned long the other shall be converted to unsigned long.

- Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.

- Otherwise, if either operand is long, the other shall be converted to long.

- Otherwise, if either operand is unsigned, the other shall be converted to unsigned.

Unfortunately clang's -Wsign-compare is very noisy. It complains about (s32)a == (u32)b which is safe and doesn't have surprising behavior.

This patch fixes some of the issues. It needs a follow-up to fix the rest.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-2-alexei.starovoitov@gmail.com
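The two surprising comparisons can be reproduced in ordinary C. The sketch below is not taken from the patch; the function names are made up for illustration, and the long case assumes a platform where long is wider than unsigned int (such as LP64 Linux). Compiling with -Wsign-compare is expected to warn on the two mixed comparisons.

```c
#include <stdbool.h>

/* i is converted to unsigned int, so -1 becomes UINT_MAX before comparing. */
static bool int_less_than_uint(int i, unsigned int j)
{
	return i < j;
}

/* When long can represent every unsigned int value (e.g. LP64), j is
 * converted to long and the comparison is signed. */
static bool long_less_than_uint(long i, unsigned int j)
{
	return i < j;
}

/* One way to get the mathematically expected answer for int vs unsigned:
 * handle negative values first, then compare as unsigned. */
static bool int_less_than_uint_checked(int i, unsigned int j)
{
	return i < 0 || (unsigned int)i < j;
}
```

Calling int_less_than_uint(-1, 1) returns false while int_less_than_uint_checked(-1, 1) returns true, which is the kind of mismatch the warning is meant to surface.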
2024-01-03bpf: Add a possibly-zero-sized read testAndrei Matei
This patch adds a test for the condition that the previous patch mucked with - illegal zero-sized helper memory access. As opposed to existing tests, this new one uses a size whose lower bound is zero, as opposed to a known-zero one. Signed-off-by: Andrei Matei <andreimatei1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231221232225.568730-3-andreimatei1@gmail.com
2024-01-03bpf: Simplify checking size of helper accessesAndrei Matei
This patch simplifies the verification of size arguments associated to pointer arguments to helpers and kfuncs. Many helpers take a pointer argument followed by the size of the memory access to be performed through that pointer. Before this patch, the handling of the size argument in check_mem_size_reg() was confusing and wasteful: if the size register's lower bound was 0, then the verification was done twice: once considering the size of the access to be the lower-bound of the respective argument, and once considering the upper bound (even if the two are the same). The upper bound checking is a super-set of the lower-bound checking(*), except: the only point of the lower-bound check is to handle the case where zero-sized-accesses are explicitly not allowed and the lower-bound is zero. This static condition is now checked explicitly, replacing a much more complex, expensive and confusing verification call to check_helper_mem_access().

Error messages change in this patch. Before, messages about illegal zero-size accesses depended on the type of the pointer and on other conditions, and sometimes the message was plain wrong: in some tests that changed you'll see that the old message was something like "R1 min value is outside of the allowed memory range", where R1 is the pointer register; the error was wrongly claiming that the pointer was bad instead of the size being bad. Other times the information that the size came from a register with a possible range of values was wrong, and the error presented the size as a fixed zero. Now the errors refer to the right register. However, the old error messages did contain useful information about the pointer register which is now lost; recovering this information was deemed not important enough.

(*) Besides standing to reason that the checks for a bigger size access are a super-set of the checks for a smaller size access, I have also mechanically verified this by reading the code for all types of pointers.
I could convince myself that it's true for all but PTR_TO_BTF_ID (check_ptr_to_btf_access). There, simply looking line-by-line does not immediately prove what we want. If anyone has any qualms, let me know. Signed-off-by: Andrei Matei <andreimatei1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231221232225.568730-2-andreimatei1@gmail.com
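The simplified logic described in this message (an explicit zero-size check, then a single check against the upper bound) can be sketched in plain C. This is a model of the idea only: struct size_bounds, enum size_verdict and check_size_sketch() are hypothetical names, and the bounds check against a flat region length stands in for the kernel's check_helper_mem_access(), which is far more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the tracked range of the size register. */
struct size_bounds {
	uint64_t umin;
	uint64_t umax;
};

enum size_verdict {
	SIZE_OK,
	SIZE_ZERO_NOT_ALLOWED,
	SIZE_OUT_OF_RANGE,
};

/* Sketch: reject a possibly-zero size up front when zero-sized accesses are
 * not allowed, then validate the access once, using only the upper bound. */
static enum size_verdict check_size_sketch(struct size_bounds sz,
					   bool zero_size_allowed,
					   uint64_t region_len)
{
	if (sz.umin == 0 && !zero_size_allowed)
		return SIZE_ZERO_NOT_ALLOWED;
	if (sz.umax > region_len)
		return SIZE_OUT_OF_RANGE;
	return SIZE_OK;
}
```

A size with bounds [0, 8] is rejected for the zero lower bound when zero sizes are disallowed, regardless of the pointer, which mirrors the point that the error now refers to the size register rather than the pointer register.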
2024-01-02net/sched: Remove uapi support for CBQ qdiscJamal Hadi Salim
Commit 051d44209842 ("net/sched: Retire CBQ qdisc") retired the CBQ qdisc. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Remove uapi support for ATM qdiscJamal Hadi Salim
Commit fb38306ceb9e ("net/sched: Retire ATM qdisc") retired the ATM qdisc. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Remove uapi support for dsmark qdiscJamal Hadi Salim
Commit bbe77c14ee61 ("net/sched: Retire dsmark qdisc") retired the dsmark qdisc. Remove UAPI support for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Remove uapi support for tcindex classifierJamal Hadi Salim
Commit 8c710f75256b ("net/sched: Retire tcindex classifier") retired the TC tcindex classifier. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Remove uapi support for rsvp classifierJamal Hadi Salim
Commit 265b4da82dbf ("net/sched: Retire rsvp classifier") retired the TC RSVP classifier. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftests: bonding: do not set port down when adding to bondHangbin Liu
Similar to commit be809424659c ("selftests: bonding: do not set port down before adding to bond"). The bond-arp-interval-causes-panic test failed after commit a4abfa627c38 ("net: rtnetlink: Enslave device before bringing it up") as the kernel will set the port down _after_ adding to bond if setting port down specifically. Fix it by removing the link down operation when adding to bond. Fixes: 2ffd57327ff1 ("selftests: bonding: cause oops in bond_rr_gen_slave_id") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftests: mptcp: diag: check CURRESTAB countersGeliang Tang
This patch adds a new helper chk_msk_cestab() to check the current established connections counter MIB_CURRESTAB in diag.sh. Invoke it to check the counter during the connection after every chk_msk_inuse(). Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftests: mptcp: join: check CURRESTAB countersGeliang Tang
This patch adds a new helper chk_cestab_nr() to check the current established connections counter MIB_CURRESTAB. Set the newly added variables cestab_ns1 and cestab_ns2 to indicate how many connections are expected in ns1 or ns2. Invoke check_cestab() to check the counter during the connection in do_transfer() and invoke chk_cestab_nr() to re-check it when the connection closed. These checks are embedded in add_tests(). Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftest/tcp-ao: Work on namespace-ified sysctl_optmem_maxDmitry Safonov
Since commit f5769faeec36 ("net: Namespace-ify sysctl_optmem_max") optmem_max is per-netns, so switching to the root namespace is no longer needed. It seems trivial to keep the old logic working, so it is kept for a while (at least until a kernel with netns-optmem_max is released). Currently, there is a test that checks that the optmem_max limit applies to TCP-AO keys and a little benchmark that measures linked-list TCP-AO keys scaling; both are fixed by this. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftest/tcp-ao: Set routes in a proper VRF table idDmitry Safonov
In unsigned-md5 selftests ip_route_add() is not needed in client_add_ip(): the route was pre-setup in __test_init() => link_init() for subnet, rather than a specific ip-address. Currently, __ip_route_add() mistakenly always sets VRF table to RT_TABLE_MAIN - this seems to have sneaked in during unsigned-md5 tests debugging. That also explains, why ip_route_add_vrf() ignored EEXIST, returned by fib6. Yet, keep EEXIST ignoring in bench-lookups selftests as it's expected that those selftests may add the same (duplicate) routes. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Retire ipt actionJamal Hadi Salim
The tc ipt action was intended to run all netfilter/iptables targets. Unfortunately it has not benefitted over the years from proper updates when netfilter changes, and for that reason it has remained rudimentary. Pinging a bunch of people that I was aware were using this indicates that removing it won't affect them. Retire it to reduce maintenance efforts. Buh-bye. Reviewed-by: Victor Noguiera <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01Merge tag 'nf-next-23-12-22' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
netfilter pull request 23-12-22

The following patchset contains Netfilter updates for net-next:

1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a race scenario with two concurrent processes running a dump-and-reset which exposes negative counters to userspace, from Phil Sutter.

2) Use GFP_KERNEL in pipapo GC, from Florian Westphal.

3) Reorder nf_flowtable struct members, place the read-mostly parts accessed by the datapath first. From Florian Westphal.

4) Set on dead flag for NFT_MSG_NEWSET in abort path, from Florian Westphal.

5) Support filtering zone in ctnetlink, from Felix Huettner.

6) Bail out if user tries to redefine an existing chain with different type in nf_tables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01Merge tag 'for-netdev' of ↵David S. Miller
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next-for-netdev

The following pull-request contains BPF updates for your *net-next* tree.

We've added 22 non-merge commits during the last 3 day(s) which contain a total of 23 files changed, 652 insertions(+), 431 deletions(-).

The main changes are:

1) Add verifier support for annotating user's global BPF subprogram arguments with few commonly requested annotations for a better developer experience, from Andrii Nakryiko.

   These tags are:
     - Ability to annotate a special PTR_TO_CTX argument
     - Ability to annotate a generic PTR_TO_MEM as non-NULL

2) Support BPF verifier tracking of BPF_JNE which helps cases when the compiler transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like, from Menglong Dong.

3) Fix a warning in bpf_mem_cache's check_obj_size() as reported by LKP, from Hou Tao.

4) Re-support uid/gid options when mounting bpffs which had to be reverted with the prior token series revert to avoid conflicts, from Daniel Borkmann.

5) Fix a libbpf NULL pointer dereference in bpf_object__collect_prog_relos() found from fuzzing the library with malformed ELF files, from Mingyi Zhang.

6) Skip DWARF sections in libbpf's linker sanity check given compiler options to generate compressed debug sections can trigger a rejection due to misalignment, from Alyssa Ross.

7) Fix an unnecessary use of the comma operator in BPF verifier, from Simon Horman.

8) Fix format specifier for unsigned long values in cpustat sample, from Colin Ian King.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27Merge tag 'mm-hotfixes-stable-2023-12-27-15-00' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues or are not considered backporting material"

* tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add an old address for Naoya Horiguchi
  mm/memory-failure: cast index to loff_t before shifting it
  mm/memory-failure: check the mapcount of the precise page
  mm/memory-failure: pass the folio and the page to collect_procs()
  selftests: secretmem: floor the memory size to the multiple of page_size
  mm: migrate high-order folios in swap cache correctly
  maple_tree: do not preallocate nodes for slot stores
  mm/filemap: avoid buffered read/write race to read inconsistent data
  kunit: kasan_test: disable fortify string checker on kmalloc_oob_memset
  kexec: select CRYPTO from KEXEC_FILE instead of depending on it
  kexec: fix KEXEC_FILE dependencies
2023-12-26selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGEMaxim Galaganov
Since previous commit, MPTCP has support for IP_BIND_ADDRESS_NO_PORT and IP_LOCAL_PORT_RANGE sockopts. Add ip4_mptcp and ip6_mptcp fixture variants to ip_local_port_range selftest to provide selftest coverage for these sockopts. Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Maxim Galaganov <max@internet.ru> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
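For readers unfamiliar with the sockopt under test: as documented in ip(7), IP_LOCAL_PORT_RANGE takes a 32-bit value with the lower port bound in the low 16 bits and the upper bound in the high 16 bits. The sketch below shows only that packing; the helper names are illustrative assumptions rather than code from this commit, and no socket calls are made.

```c
#include <stdint.h>

/* Pack an inclusive [lo, hi] port range into the 32-bit option value:
 * low 16 bits hold the lower bound, high 16 bits hold the upper bound. */
static uint32_t pack_port_range(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | (uint32_t)lo;
}

/* Extract the lower bound from a packed value. */
static uint16_t port_range_lo(uint32_t range)
{
	return (uint16_t)(range & 0xffff);
}

/* Extract the upper bound from a packed value. */
static uint16_t port_range_hi(uint32_t range)
{
	return (uint16_t)(range >> 16);
}
```

A value built this way would then be passed to setsockopt() with level SOL_IP and option IP_LOCAL_PORT_RANGE on a kernel and libc that define the option.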
2023-12-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini:
 "RISC-V:

   - Fix a race condition in updating external interrupt for trap-n-emulated IMSIC swfile

   - Fix print_reg defaults in get-reg-list selftest

  ARM:

   - Ensure a vCPU's redistributor is unregistered from the MMIO bus if vCPU creation fails

   - Fix building KVM selftests for arm64 from the top-level Makefile

  x86:

   - Fix breakage for SEV-ES guests that use XSAVES

  Selftests:

   - Fix bad use of strcat(), by not using strcat() at all"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
  KVM: selftests: Fix dynamic generation of configuration names
  RISCV: KVM: update external interrupt atomically for IMSIC swfile
  KVM: riscv: selftests: Fix get-reg-list print_reg defaults
  KVM: selftests: Ensure sysreg-defs.h is generated at the expected path
  KVM: Convert comment into an assertion in kvm_io_bus_register_dev()
  KVM: arm64: vgic: Ensure that slots_lock is held in vgic_register_all_redist_iodevs()
  KVM: arm64: vgic: Force vcpu vgic teardown on vcpu destroy
  KVM: arm64: vgic: Add a non-locking primitive for kvm_vgic_vcpu_destroy()
  KVM: arm64: vgic: Simplify kvm_vgic_destroy()