git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-04-06	Merge tag 'net-6.3-rc6-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless and can. Current release - regressions: - wifi: mac80211: - fix potential null pointer dereference - fix receiving mesh packets in forwarding=0 networks - fix mesh forwarding Current release - new code bugs: - virtio/vsock: fix leaks due to missing skb owner Previous releases - regressions: - raw: fix NULL deref in raw_get_next(). - sctp: check send stream number after wait_for_sndbuf - qrtr: - fix a refcount bug in qrtr_recvmsg() - do not do DEL_SERVER broadcast after DEL_CLIENT - wifi: brcmfmac: fix SDIO suspend/resume regression - wifi: mt76: fix use-after-free in fw features query. - can: fix race between isotp_sendsmg() and isotp_release() - eth: mtk_eth_soc: fix remaining throughput regression - eth: ice: reset FDIR counter in FDIR init stage Previous releases - always broken: - core: don't let netpoll invoke NAPI if in xmit context - icmp: guard against too small mtu - ipv6: fix an uninit variable access bug in __ip6_make_skb() - wifi: mac80211: fix the size calculation of ieee80211_ie_len_eht_cap() - can: fix poll() to not report false EPOLLOUT events - eth: gve: secure enough bytes in the first TX desc for all TCP pkts" * tag 'net-6.3-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net: stmmac: check fwnode for phy device before scanning for phy net: stmmac: Add queue reset into stmmac_xdp_open() function selftests: net: rps_default_mask.sh: delete veth link specifically net: fec: make use of MDIO C45 quirk can: isotp: fix race between isotp_sendsmg() and isotp_release() can: isotp: isotp_ops: fix poll() to not report false EPOLLOUT events can: isotp: isotp_recvmsg(): use sock_recv_cmsgs() to get SOCK_RXQ_OVFL infos can: j1939: j1939_tp_tx_dat_new(): fix out-of-bounds memory access gve: Secure enough bytes in the first TX desc for all TCP pkts netlink: annotate lockless accesses to nlk->max_recvmsg_len ethtool: reset #lanes when lanes is omitted ping: Fix potentail NULL deref for /proc/net/icmp. raw: Fix NULL deref in raw_get_next(). ice: Reset FDIR counter in FDIR init stage ice: fix wrong fallback logic for FDIR net: stmmac: fix up RX flow hash indirection table when setting channels net: ethernet: ti: am65-cpsw: Fix mdio cleanup in probe wifi: mt76: ignore key disable commands wifi: ath11k: reduce the MHI timeout to 20s ipv6: Fix an uninit variable access bug in __ip6_make_skb() ...
2023-04-06	Merge tag 'linux-kselftest-fixes-6.3-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "One single fix to mount_setattr_test build failure" * tag 'linux-kselftest-fixes-6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests mount: Fix mount_setattr_test builds failed
2023-04-06	selftests: xsk: Add test UNALIGNED_INV_DESC_4K1_FRAME_SIZE	Kal Conley
	Add unaligned descriptor test for frame size of 4001. Using an odd frame size ensures that the end of the UMEM is not near a page boundary. This allows testing descriptors that staddle the end of the UMEM but not a page. This test used to fail without the previous commit ("xsk: Fix unaligned descriptor validation"). Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230405235920.7305-3-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-06	selftests/bpf: fix xdp_redirect xdp-features selftest for veth driver	Lorenzo Bianconi
	xdp-features supported by veth driver are no more static, but they depends on veth configuration (e.g. if GRO is enabled/disabled or TX/RX queue configuration). Take it into account in xdp_redirect xdp-features selftest for veth driver. Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/bc35455cfbb1d4f7f52536955ded81ad47d8dc54.1680777371.git.lorenzo@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-06	selftests/net: fix typo in tcp_mmap	Eric Dumazet
	kernel test robot reported the following warning: All warnings (new ones prefixed by >>): tcp_mmap.c: In function 'child_thread': >> tcp_mmap.c:211:61: warning: 'lu' may be used uninitialized in this function [-Wmaybe-uninitialized] 211 \| zc.length = min(chunk_size, FILE_SZ - lu); We want to read FILE_SZ bytes, so the correct expression should be (FILE_SZ - total) Fixes: 5c5945dc695c ("selftests/net: Add SHA256 computation over data sent in tcp_mmap") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304042104.UFIuevBp-lkp@intel.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Xiaoyan Li <lixiaoyan@google.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230405071556.1019623-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-05	bpftool: Clean up _bpftool_once_attr() calls in bash completion	Quentin Monnet
	In bpftool's bash completion file, function _bpftool_once_attr() is able to process multiple arguments. There are a few locations where this function is called multiple times in a row, each time for a single argument; let's pass all arguments instead to minimize the number of function calls required for the completion. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-8-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	bpftool: Support printing opcodes and source file references in CFG	Quentin Monnet
	Add support for displaying opcodes or/and file references (filepath, line and column numbers) when dumping the control flow graphs of loaded BPF programs with bpftool. The filepaths in the records are absolute. To avoid blocks on the graph to get too wide, we truncate them when they get too long (but we always keep the entire file name). In the unlikely case where the resulting file name is ambiguous, it remains possible to get the full path with a regular dump (no CFG). Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-7-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	bpftool: Support "opcodes", "linum", "visual" simultaneously	Quentin Monnet
	When dumping a program, the keywords "opcodes" (for printing the raw opcodes), "linum" (for displaying the filename, line number, column number along with the source code), and "visual" (for generating the control flow graph for translated programs) are mutually exclusive. But there's no reason why they should be. Let's make it possible to pass several of them at once. The "file FILE" option, which makes bpftool output a binary image to a file, remains incompatible with the others. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-6-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	bpftool: Return an error on prog dumps if both CFG and JSON are required	Quentin Monnet
	We do not support JSON output for control flow graphs of programs with bpftool. So far, requiring both the CFG and JSON output would result in producing a null JSON object. It makes more sense to raise an error directly when parsing command line arguments and options, so that users know they won't get any output they might expect. If JSON is required for the graph, we leave it to Graphviz instead: # bpftool prog dump xlated <REF> visual \| dot -Tjson Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-5-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	bpftool: Support inline annotations when dumping the CFG of a program	Quentin Monnet
	We support dumping the control flow graph of loaded programs to the DOT format with bpftool, but so far this feature wouldn't display the source code lines available through BTF along with the eBPF bytecode. Let's add support for these annotations, to make it easier to read the graph. In prog.c, we move the call to dump_xlated_cfg() in order to pass and use the full struct dump_data, instead of creating a minimal one in draw_bb_node(). We pass the pointer to this struct down to dump_xlated_for_graph() in xlated_dumper.c, where most of the logics is added. We deal with BTF mostly like we do for plain or JSON output, except that we cannot use a "nr_skip" value to skip a given number of linfo records (we don't process the BPF instructions linearly, and apart from the root of the graph we don't know how many records we should skip, so we just store the last linfo and make sure the new one we find is different before printing it). When printing the source instructions to the label of a DOT graph node, there are a few subtleties to address. We want some special newline markers, and there are some characters that we must escape. To deal with them, we introduce a new dedicated function btf_dump_linfo_dotlabel() in btf_dumper.c. We'll reuse this function in a later commit to format the filepath, line, and column references as well. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-4-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	bpftool: Fix bug for long instructions in program CFG dumps	Quentin Monnet
	When dumping the control flow graphs for programs using the 16-byte long load instruction, we need to skip the second part of this instruction when looking for the next instruction to process. Otherwise, we end up printing "BUG_ld_00" from the kernel disassembler in the CFG. Fixes: efcef17a6d65 ("tools: bpftool: generate .dot graph from CFG information") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-3-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	bpftool: Fix documentation about line info display for prog dumps	Quentin Monnet
	The documentation states that when line_info is available when dumping a program, the source line will be displayed "by default". There is no notion of "default" here: the line is always displayed if available, there is no way currently to turn it off. In the next sentence, the documentation states that if "linum" is used on the command line, the relevant filename, line, and column will be displayed "on top of the source line". This is incorrect, as they are currently displayed on the right side of the source line (or on top of the eBPF instruction, not the source). This commit fixes the documentation to address these points. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-2-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05	selftests: net: rps_default_mask.sh: delete veth link specifically	Hangbin Liu
	When deleting the netns and recreating a new one while re-adding the veth interface, there is a small window of time during which the old veth interface has not yet been removed. This can cause the new addition to fail. To resolve this issue, we can either wait for a short while to ensure that the old veth interface is deleted, or we can specifically remove the veth interface. Before this patch: # ./rps_default_mask.sh empty rps_default_mask [ ok ] changing rps_default_mask dont affect existing devices [ ok ] changing rps_default_mask dont affect existing netns [ ok ] changing rps_default_mask affect newly created devices [ ok ] changing rps_default_mask don't affect newly child netns[II][ ok ] rps_default_mask is 0 by default in child netns [ ok ] RTNETLINK answers: File exists changing rps_default_mask in child ns don't affect the main one[ ok ] cat: /sys/class/net/vethC11an1/queues/rx-0/rps_cpus: No such file or directory changing rps_default_mask in child ns affects new childns devices./rps_default_mask.sh: line 36: [: -eq: unary operator expected [fail] expected 1 found changing rps_default_mask in child ns don't affect existing devices[ ok ] After this patch: # ./rps_default_mask.sh empty rps_default_mask [ ok ] changing rps_default_mask dont affect existing devices [ ok ] changing rps_default_mask dont affect existing netns [ ok ] changing rps_default_mask affect newly created devices [ ok ] changing rps_default_mask don't affect newly child netns[II][ ok ] rps_default_mask is 0 by default in child netns [ ok ] changing rps_default_mask in child ns don't affect the main one[ ok ] changing rps_default_mask in child ns affects new childns devices[ ok ] changing rps_default_mask in child ns don't affect existing devices[ ok ] Fixes: 3a7d84eae03b ("self-tests: more rps self tests") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20230404072411.879476-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-05	maple_tree: fix write memory barrier of nodes once dead for RCU mode	Liam R. Howlett
	During the development of the maple tree, the strategy of freeing multiple nodes changed and, in the process, the pivots were reused to store pointers to dead nodes. To ensure the readers see accurate pivots, the writers need to mark the nodes as dead and call smp_wmb() to ensure any readers can identify the node as dead before using the pivot values. There were two places where the old method of marking the node as dead without smp_wmb() were being used, which resulted in RCU readers seeing the wrong pivot value before seeing the node was dead. Fix this race condition by using mte_set_node_dead() which has the smp_wmb() call to ensure the race is closed. Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed are marked as dead to ensure there are no other call paths besides the two updated paths. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-6-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05	selftests/bpf: Wait for receive in cg_storage_multi test	YiFei Zhu
	In some cases the loopback latency might be large enough, causing the assertion on invocations to be run before ingress prog getting executed. The assertion would fail and the test would flake. This can be reliably reproduced by arbitrarily increasing the loopback latency (thanks to [1]): tc qdisc add dev lo root handle 1: htb default 12 tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps tc qdisc add dev lo parent 1:12 netem delay 100ms Fix this by waiting on the receive end, instead of instantly returning to the assert. The call to read() will wait for the default SO_RCVTIMEO timeout of 3 seconds provided by start_server(). [1] https://gist.github.com/kstevens715/4598301 Reported-by: Martin KaFai Lau <martin.lau@linux.dev> Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/ Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress") Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/20230405193354.1956209-1-zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Deflakify STATS_RX_DROPPED test	Kal Conley
	Fix flaky STATS_RX_DROPPED test. The receiver calls getsockopt after receiving the last (valid) packet which is not the final packet sent in the test (valid and invalid packets are sent in alternating fashion with the final packet being invalid). Since the last packet may or may not have been dropped already, both outcomes must be allowed. This issue could also be fixed by making sure the last packet sent is valid. This alternative is left as an exercise to the reader (or the benevolent maintainers of this file). This problem was quite visible on certain setups. On one machine this failure was observed 50% of the time. Also, remove a redundant assignment of pkt_stream->nb_pkts. This field is already initialized by __pkt_stream_alloc. Fixes: 27e934bec35b ("selftests: xsk: make stat tests not spin on getsockopt") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403120400.31018-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Disable IPv6 on VETH1	Kal Conley
	This change fixes flakiness in the BIDIRECTIONAL test: # [is_pkt_valid] expected length [60], got length [90] not ok 1 FAIL: SKB BUSY-POLL BIDIRECTIONAL When IPv6 is enabled, the interface will periodically send MLDv1 and MLDv2 packets. These packets can cause the BIDIRECTIONAL test to fail since it uses VETH0 for RX. For other tests, this was not a problem since they only receive on VETH1 and IPv6 was already disabled on VETH0. Fixes: a89052572ebb ("selftests/bpf: Xsk selftests framework") Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230405082905.6303-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Add test case for packets at end of UMEM	Kal Conley
	Add test case to testapp_invalid_desc for valid packets at the end of the UMEM. Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403145047.33065-3-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Use correct UMEM size in testapp_invalid_desc	Kal Conley
	Avoid UMEM_SIZE macro in testapp_invalid_desc which is incorrect when the frame size is not XSK_UMEM__DEFAULT_FRAME_SIZE. Also remove the macro since it's no longer being used. Fixes: 909f0e28207c ("selftests: xsk: Add tests for 2K frame size") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403145047.33065-2-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05	selftests: xsk: Add xskxceiver.h dependency to Makefile	Kal Conley
	xskxceiver depends on xskxceiver.h so tell make about it. Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230403130151.31195-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-04	selftests/bpf: Add tracing tests for walking skb and req.	Alexei Starovoitov
	Add tracing tests for walking skb->sk and req->sk. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-9-alexei.starovoitov@gmail.com
2023-04-04	selftests/bpf: Add RESOLVE_BTFIDS dependency to bpf_testmod.ko	Ilya Leoshkevich
	bpf_testmod.ko sometimes fails to build from a clean checkout: BTF [M] linux/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.ko /bin/sh: 1: linux-build//tools/build/resolve_btfids/resolve_btfids: not found The reason is that RESOLVE_BTFIDS may not yet be built. Fix by adding a dependency. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230403172935.1553022-1-iii@linux.ibm.com
2023-04-04	tools/virtio: fix typo in README instructions	Ross Zwisler
	We need to have a unique chardev for each data path, else the chardevs will collide and qemu will die with this message: qemu-system-x86_64: -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0, id=channel1,name=trace-path-cpu0: Property 'virtserialport.chardev' can't take value 'charchannel0': Device 'charchannel0' is in use Signed-off-by: Ross Zwisler <zwisler@google.com> Message-Id: <20230215223350.2658616-7-zwisler@google.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-04	vsock/test: update expected return values	Arseniy Krasnov
	This updates expected return values for invalid buffer test. Now such values are returned from transport, not from af_vsock.c. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-01	bpf: Remove now-defunct task kfuncs	David Vernet
	In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the 'refcount_t rcu_users' field was extracted out of a union with the 'struct rcu_head rcu' field. This allows us to safely perform a refcount_inc_not_zero() on task->rcu_users when acquiring a reference on a task struct. A prior patch leveraged this by making struct task_struct an RCU-protected object in the verifier, and by bpf_task_acquire() to use the task->rcu_users field for synchronization. Now that we can use RCU to protect tasks, we no longer need bpf_task_kptr_get(), or bpf_task_acquire_not_zero(). bpf_task_kptr_get() is truly completely unnecessary, as we can just use RCU to get the object. bpf_task_acquire_not_zero() is now equivalent to bpf_task_acquire(). In addition to these changes, this patch also updates the associated selftests to no longer use these kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230331195733.699708-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	bpf: Make struct task_struct an RCU-safe type	David Vernet
	struct task_struct objects are a bit interesting in terms of how their lifetime is protected by refcounts. task structs have two refcount fields: 1. refcount_t usage: Protects the memory backing the task struct. When this refcount drops to 0, the task is immediately freed, without waiting for an RCU grace period to elapse. This is the field that most callers in the kernel currently use to ensure that a task remains valid while it's being referenced, and is what's currently tracked with bpf_task_acquire() and bpf_task_release(). 2. refcount_t rcu_users: A refcount field which, when it drops to 0, schedules an RCU callback that drops a reference held on the 'usage' field above (which is acquired when the task is first created). This field therefore provides a form of RCU protection on the task by ensuring that at least one 'usage' refcount will be held until an RCU grace period has elapsed. The qualifier "a form of" is important here, as a task can remain valid after task->rcu_users has dropped to 0 and the subsequent RCU gp has elapsed. In terms of BPF, we want to use task->rcu_users to protect tasks that function as referenced kptrs, and to allow tasks stored as referenced kptrs in maps to be accessed with RCU protection. Let's first determine whether we can safely use task->rcu_users to protect tasks stored in maps. All of the bpf_task* kfuncs can only be called from tracepoint, struct_ops, or BPF_PROG_TYPE_SCHED_CLS, program types. For tracepoint and struct_ops programs, the struct task_struct passed to a program handler will always be trusted, so it will always be safe to call bpf_task_acquire() with any task passed to a program. Note, however, that we must update bpf_task_acquire() to be KF_RET_NULL, as it is possible that the task has exited by the time the program is invoked, even if the pointer is still currently valid because the main kernel holds a task->usage refcount. For BPF_PROG_TYPE_SCHED_CLS, tasks should never be passed as an argument to the any program handlers, so it should not be relevant. The second question is whether it's safe to use RCU to access a task that was acquired with bpf_task_acquire(), and stored in a map. Because bpf_task_acquire() now uses task->rcu_users, it follows that if the task is present in the map, that it must have had at least one task->rcu_users refcount by the time the current RCU cs was started. Therefore, it's safe to access that task until the end of the current RCU cs. With all that said, this patch makes struct task_struct is an RCU-protected object. In doing so, we also change bpf_task_acquire() to be KF_ACQUIRE \| KF_RCU \| KF_RET_NULL, and adjust any selftests as necessary. A subsequent patch will remove bpf_task_kptr_get(), and bpf_task_acquire_not_zero() respectively. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230331195733.699708-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: small fixed found in -O2 mode	Andrii Nakryiko
	Fix few potentially unitialized variables uses, found while building veristat.c in release (-O2) mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: avoid using kernel-internal headers	Andrii Nakryiko
	Drop linux/compiler.h include, which seems to be needed for ARRAY_SIZE macro only. Redefine own version of ARRAY_SIZE instead. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: improve version reporting	Andrii Nakryiko
	For packaging version of the tool is important, so add a simple way to specify veristat version for upstream mirror at Github. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01	veristat: relicense veristat.c as dual GPL-2.0-only or BSD-2-Clause licensed	Andrii Nakryiko
	Dual-license veristat.c to dual GPL-2.0-only or BSD-2-Clause license. This is needed to mirror it to Github to make it convenient for distro packagers to package veristat as a separate package. Veristat grew into a useful tool by itself, and there are already a bunch of users relying on veristat as generic BPF loading and verification helper tool. So making it easy to packagers by providing Github mirror just like we do for bpftool and libbpf is the next step to get veristat into the hands of users. Apart from few typo fixes, I'm the sole contributor to veristat.c so far, so no extra Acks should be needed for relicensing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31	selftests/bpf: Fix conflicts with built-in functions in ↵	James Hilliard
	bench_local_storage_create The fork function in gcc is considered a built in function due to being used by libgcov when building with gnu extensions. Rename fork to sched_process_fork to prevent this conflict. See details: https://github.com/gcc-mirror/gcc/commit/d1c38823924506d389ca58d02926ace21bdf82fa https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82457 Fixes the following error: In file included from progs/bench_local_storage_create.c:6: progs/bench_local_storage_create.c:43:14: error: conflicting types for built-in function 'fork'; expected 'int(void)' [-Werror=builtin-declaration-mismatch] 43 \| int BPF_PROG(fork, struct task_struct parent, struct task_struct child) \| ^~~~ Fixes: cbe9d93d58b1 ("selftests/bpf: Add bench for task storage creation") Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230331075848.1642814-1-james.hilliard1@gmail.com
2023-03-31	selftests/bpf: Replace extract_build_id with read_build_id	Jiri Olsa
	Replacing extract_build_id with read_build_id that parses out build id directly from elf without using readelf tool. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230331093157.1749137-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31	selftests/bpf: Add read_build_id function	Jiri Olsa
	Adding read_build_id function that parses out build id from specified binary. It will replace extract_build_id and also be used in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230331093157.1749137-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31	selftests/bpf: Add err.h header	Jiri Olsa
	Moving error macros from profiler.inc.h to new err.h header. It will be used in following changes. Also adding PTR_ERR macro that will be used in following changes. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230331093157.1749137-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31	selftests mount: Fix mount_setattr_test builds failed	Anh Tuan Phan
	When compiling selftests with target mount_setattr I encountered some errors with the below messages: mount_setattr_test.c: In function ‘mount_setattr_thread’: mount_setattr_test.c:343:16: error: variable ‘attr’ has initializer but incomplete type 343 \| struct mount_attr attr = { \| ^~~~~~~~~~ These errors might be because of linux/mount.h is not included. This patch resolves that issue. Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-03-30	tools: ynl: ethtool testing tool	Stanislav Fomichev
	This is what I've been using to see whether the spec makes sense. A small subset of getters (mostly the unprivileged ones) is implemented. Some setters (channels) also work. Setters for messages with bitmasks are not implemented. Initially I was trying to make this tool look 1:1 like real ethtool, but eventually gave up :-) Sample output: $ ./tools/net/ynl/ethtool enp0s31f6 Settings for enp0s31f6: Supported ports: [ TP ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Supported pause frame use: no Supports auto-negotiation: yes Supported FEC modes: Not reported Speed: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Twisted Pair PHYAD: 2 Transceiver: Internal MDI-X: Unknown (auto) Current message level: drv probe link Link detected: no Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	tools: ynl: replace print with NlError	Stanislav Fomichev
	Instead of dumping the error on the stdout, make the callee and opportunity to decide what to do with it. This is mostly for the ethtool testing. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	tools: ynl: support byte-order in cli	Stanislav Fomichev
	Used by ethtool spec. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	selftests: forwarding: add tunnel_key "nofrag" test case	Davide Caratti
	Add a selftest that configures metadata tunnel encapsulation using the TC "tunnel_key" action: it includes a test case for setting "nofrag" flag. Example output: # selftests: net/forwarding: tc_tunnel_key.sh # TEST: tunnel_key nofrag (skip_hw) [ OK ] # INFO: Could not test offloaded functionality ok 1 selftests: net/forwarding: tc_tunnel_key.sh Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	selftests: tc-testing: add tunnel_key "nofrag" test case	Davide Caratti
	# ./tdc.py -e 6bda -l 6bda: (actions, tunnel_key) Add tunnel_key action with nofrag option Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	selftests: tc-testing: add "depends_on" property to skip tests	Davide Caratti
	currently, users can skip individual test cases by means of writing "skip": "yes" in the scenario file. Extend this functionality, introducing 'dependsOn': it's optional property like "skip", but the value contains a command (for example, a probe on iproute2 to check if it supports a specific feature). If such property is present, tdc executes that command and skips the test when the return value is non-zero. Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	selftests: rtnetlink: Fix do_test_address_proto()	Petr Machata
	This selftest was introduced recently in the commit cited below. It misses several check_err() invocations to actually verify that the previous command succeeded. When these are added, the first one fails, because besides the addresses added by hand, there can be a link-local address added by the kernel. Adjust the check to expect at least three addresses instead of exactly three, and add the missing check_err's. Furthermore, the explanatory comments assume that the address with no protocol is $addr2, when in fact it is $addr3. Update the comments. Fixes: 6a414fd77f61 ("selftests: rtnetlink: Add an address proto test") Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/53a579bc883e1bf2fe490d58427cf22c2d1aa21f.1680102695.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Conflicts: drivers/net/ethernet/mediatek/mtk_ppe.c 3fbe4d8c0e53 ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting") 924531326e2d ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30	selftests/bpf: Add testcases for ptr_*_or_null_ in bpf_kptr_xchg	David Vernet
	The second argument of the bpf_kptr_xchg() helper function is ARG_PTR_TO_BTF_ID_OR_NULL. A recent patch fixed a bug whereby the verifier would fail with an internal error message if a program invoked the helper with a PTR_TO_BTF_ID \| PTR_MAYBE_NULL register. This testcase adds some testcases to ensure that it fails gracefully moving forward. Before the fix, these testcases would have failed an error resembling the following: ; p = bpf_kfunc_call_test_acquire(&(unsigned long){0}); 99: (7b) (u64 )(r10 -16) = r7 ; frame1: ... 100: (bf) r1 = r10 ; frame1: ... 101: (07) r1 += -16 ; frame1: ... ; p = bpf_kfunc_call_test_acquire(&(unsigned long){0}); 102: (85) call bpf_kfunc_call_test_acquire#13908 ; frame1: R0_w=ptr_or_null_prog_test_ref_kfunc... ; p = bpf_kptr_xchg(&v->ref_ptr, p); 103: (bf) r1 = r6 ; frame1: ... 104: (bf) r2 = r0 ; frame1: R0_w=ptr_or_null_prog_test_ref_kfunc... 105: (85) call bpf_kptr_xchg#194 verifier internal error: invalid PTR_TO_BTF_ID register for type match Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230330145203.80506-2-void@manifault.com
2023-03-30	Merge tag 'net-6.3-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN and WPAN. Still quite a few bugs from this release. This pull is a bit smaller because major subtrees went into the previous one. Or maybe people took spring break off? Current release - regressions: - phy: micrel: correct KSZ9131RNX EEE capabilities and advertisement Current release - new code bugs: - eth: wangxun: fix vector length of interrupt cause - vsock/loopback: consistently protect the packet queue with sk_buff_head.lock - virtio/vsock: fix header length on skb merging - wpan: ca8210: fix unsigned mac_len comparison with zero Previous releases - regressions: - eth: stmmac: don't reject VLANs when IFF_PROMISC is set - eth: smsc911x: avoid PHY being resumed when interface is not up - eth: mtk_eth_soc: fix tx throughput regression with direct 1G links - eth: bnx2x: use the right build_skb() helper after core rework - wwan: iosm: fix 7560 modem crash on use on unsupported channel Previous releases - always broken: - eth: sfc: don't overwrite offload features at NIC reset - eth: r8169: fix RTL8168H and RTL8107E rx crc error - can: j1939: prevent deadlock by moving j1939_sk_errqueue() - virt: vmxnet3: use GRO callback when UPT is enabled - virt: xen: don't do grant copy across page boundary - phy: dp83869: fix default value for tx-/rx-internal-delay - dsa: ksz8: fix multiple issues with ksz8_fdb_dump - eth: mvpp2: fix classification/RSS of VLAN and fragmented packets - eth: mtk_eth_soc: fix flow block refcounting logic Misc: - constify fwnode pointers in SFP handling" * tag 'net-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits) net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow net: ethernet: mtk_eth_soc: fix L2 offloading with DSA untag offload net: ethernet: mtk_eth_soc: fix flow block refcounting logic net: mvneta: fix potential double-frees in mvneta_txq_sw_deinit() net: dsa: sync unicast and multicast addresses for VLAN filters too net: dsa: mv88e6xxx: Enable IGMP snooping on user ports only xen/netback: use same error messages for same errors test/vsock: new skbuff appending test virtio/vsock: WARN_ONCE() for invalid state of socket virtio/vsock: fix header length on skb merging bnxt_en: Add missing 200G link speed reporting bnxt_en: Fix typo in PCI id to device description string mapping bnxt_en: Fix reporting of test result in ethtool selftest i40e: fix registers dump after run ethtool adapter self test bnx2x: use the right build_skb() helper net: ipa: compute DMA pool size properly net: wwan: iosm: fixes 7560 modem crash net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg() ice: add profile conflict check for AVF FDIR ...
2023-03-30	veristat: change guess for __sk_buff from CGROUP_SKB to SCHED_CLS	Andrii Nakryiko
	SCHED_CLS seems to be a better option as a default guess for freplace programs that have __sk_buff as a context type. Reported-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230330190115.3942962-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-30	test/vsock: new skbuff appending test	Arseniy Krasnov
	This adds test which checks case when data of newly received skbuff is appended to the last skbuff in the socket's queue. It looks like simple test with 'send()' and 'recv()', but internally it triggers logic which appends one received skbuff to another. Test checks that this feature works correctly. This test is actual only for virtio transport. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-29	selftests/bpf: Rewrite two infinite loops in bound check cases	Xu Kuohai
	The two infinite loops in bound check cases added by commit 1a3148fc171f ("selftests/bpf: Check when bounds are not in the 32-bit range") increased the execution time of test_verifier from about 6 seconds to about 9 seconds. Rewrite these two infinite loops to finite loops to get rid of this extra time cost. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20230329011048.1721937-1-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-29	veristat: guess and substitue underlying program type for freplace (EXT) progs	Andrii Nakryiko
	SEC("freplace") (i.e., BPF_PROG_TYPE_EXT) programs are not loadable as is through veristat, as kernel expects actual program's FD during BPF_PROG_LOAD time, which veristat has no way of knowing. Unfortunately, freplace programs are a pretty important class of programs, especially when dealing with XDP chaining solutions, which rely on EXT programs. So let's do our best and teach veristat to try to guess the original program type, based on program's context argument type. And if guessing process succeeds, we manually override freplace/EXT with guessed program type using bpf_program__set_type() setter to increase chances of proper BPF verification. We rely on BTF and maintain a simple lookup table. This process is obviously not 100% bulletproof, as valid program might not use context and thus wouldn't have to specify correct type. Also, __sk_buff is very ambiguous and is the context type across many different program types. We pick BPF_PROG_TYPE_CGROUP_SKB for now, which seems to work fine in practice so far. Similarly, some program types require specifying attach type, and so we pick one out of possible few variants. Best effort at its best. But this makes veristat even more widely applicable. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230327185202.1929145-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-29	veristat: add -d debug mode option to see debug libbpf log	Andrii Nakryiko
	Add -d option to allow requesting libbpf debug logs from veristat. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230327185202.1929145-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>