summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-04-25bpf: Tag argument to be released in bpf_func_protoKumar Kartikeya Dwivedi
Add a new type flag for bpf_arg_type that when set tells verifier that for a release function, that argument's register will be the one for which meta.ref_obj_id will be set, and which will then be released using release_reference. To capture the regno, introduce a new field release_regno in bpf_call_arg_meta. This would be required in the next patch, where we may either pass NULL or a refcounted pointer as an argument to the release function bpf_kptr_xchg. Just releasing only when meta.ref_obj_id is set is not enough, as there is a case where the type of argument needed matches, but the ref_obj_id is set to 0. Hence, we must enforce that whenever meta.ref_obj_id is zero, the register that is to be released can only be NULL for a release function. Since we now indicate whether an argument is to be released in bpf_func_proto itself, is_release_function helper has lost its utitlity, hence refactor code to work without it, and just rely on meta.release_regno to know when to release state for a ref_obj_id. Still, the restriction of one release argument and only one ref_obj_id passed to BPF helper or kfunc remains. This may be lifted in the future. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-3-memxor@gmail.com
2022-04-25bpf: Allow storing unreferenced kptr in mapKumar Kartikeya Dwivedi
This commit introduces a new pointer type 'kptr' which can be embedded in a map value to hold a PTR_TO_BTF_ID stored by a BPF program during its invocation. When storing such a kptr, BPF program's PTR_TO_BTF_ID register must have the same type as in the map value's BTF, and loading a kptr marks the destination register as PTR_TO_BTF_ID with the correct kernel BTF and BTF ID. Such kptr are unreferenced, i.e. by the time another invocation of the BPF program loads this pointer, the object which the pointer points to may not longer exist. Since PTR_TO_BTF_ID loads (using BPF_LDX) are patched to PROBE_MEM loads by the verifier, it would safe to allow user to still access such invalid pointer, but passing such pointers into BPF helpers and kfuncs should not be permitted. A future patch in this series will close this gap. The flexibility offered by allowing programs to dereference such invalid pointers while being safe at runtime frees the verifier from doing complex lifetime tracking. As long as the user may ensure that the object remains valid, it can ensure data read by it from the kernel object is valid. The user indicates that a certain pointer must be treated as kptr capable of accepting stores of PTR_TO_BTF_ID of a certain type, by using a BTF type tag 'kptr' on the pointed to type of the pointer. Then, this information is recorded in the object BTF which will be passed into the kernel by way of map's BTF information. The name and kind from the map value BTF is used to look up the in-kernel type, and the actual BTF and BTF ID is recorded in the map struct in a new kptr_off_tab member. For now, only storing pointers to structs is permitted. An example of this specification is shown below: #define __kptr __attribute__((btf_type_tag("kptr"))) struct map_value { ... struct task_struct __kptr *task; ... }; Then, in a BPF program, user may store PTR_TO_BTF_ID with the type task_struct into the map, and then load it later. Note that the destination register is marked PTR_TO_BTF_ID_OR_NULL, as the verifier cannot know whether the value is NULL or not statically, it must treat all potential loads at that map value offset as loading a possibly NULL pointer. Only BPF_LDX, BPF_STX, and BPF_ST (with insn->imm = 0 to denote NULL) are allowed instructions that can access such a pointer. On BPF_LDX, the destination register is updated to be a PTR_TO_BTF_ID, and on BPF_STX, it is checked whether the source register type is a PTR_TO_BTF_ID with same BTF type as specified in the map BTF. The access size must always be BPF_DW. For the map in map support, the kptr_off_tab for outer map is copied from the inner map's kptr_off_tab. It was chosen to do a deep copy instead of introducing a refcount to kptr_off_tab, because the copy only needs to be done when paramterizing using inner_map_fd in the map in map case, hence would be unnecessary for all other users. It is not permitted to use MAP_FREEZE command and mmap for BPF map having kptrs, similar to the bpf_timer case. A kptr also requires that BPF program has both read and write access to the map (hence both BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG are disallowed). Note that check_map_access must be called from both check_helper_mem_access and for the BPF instructions, hence the kptr check must distinguish between ACCESS_DIRECT and ACCESS_HELPER, and reject ACCESS_HELPER cases. We rename stack_access_src to bpf_access_src and reuse it for this purpose. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-2-memxor@gmail.com
2022-04-25bpf: Use bpf_prog_run_array_cg_flags everywhereStanislav Fomichev
Rename bpf_prog_run_array_cg_flags to bpf_prog_run_array_cg and use it everywhere. check_return_code already enforces sane return ranges for all cgroup types. (only egress and bind hooks have uncanonical return ranges, the rest is using [0, 1]) No functional changes. v2: - 'func_ret & 1' under explicit test (Andrii & Martin) Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220425220448.3669032-1-sdf@google.com
2022-04-25bpftool, musl compat: Replace sys/fcntl.h by fcntl.hDominique Martinet
musl does not like including sys/fcntl.h directly: [...] 1 | #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [...] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220424051022.2619648-5-asmadeus@codewreck.org
2022-04-25bpftool, musl compat: Replace nftw with FTW_ACTIONRETVALDominique Martinet
musl nftw implementation does not support FTW_ACTIONRETVAL. There have been multiple attempts at pushing the feature in musl upstream, but it has been refused or ignored all the times: https://www.openwall.com/lists/musl/2021/03/26/1 https://www.openwall.com/lists/musl/2022/01/22/1 In this case we only care about /proc/<pid>/fd/<fd>, so it's not too difficult to reimplement directly instead, and the new implementation makes 'bpftool perf' slightly faster because it doesn't needlessly stat/readdir unneeded directories (54ms -> 13ms on my machine). Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220424051022.2619648-4-asmadeus@codewreck.org
2022-04-25wwan_hwsim: Avoid flush_scheduled_work() usageTetsuo Handa
Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_wq with local wwan_wq. While we are at it, make err_clean_devs: label of wwan_hwsim_init() behave like wwan_hwsim_exit(), for it is theoretically possible to call wwan_hwsim_debugfs_devcreate_write()/wwan_hwsim_debugfs_devdestroy_write() by the moment wwan_hwsim_init_devs() returns. Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Link: https://lore.kernel.org/r/7390d51f-60e2-3cee-5277-b819a55ceabe@I-love.SAKURA.ne.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-25libbpf: Remove unnecessary type castYuntao Wang
The link variable is already of type 'struct bpf_link *', casting it to 'struct bpf_link *' is redundant, drop it. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com
2022-04-25net: dsa: remove unused headersMarcin Wojtas
Reduce a number of included headers to a necessary minimum. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25arp: fix unused variable warnning when CONFIG_PROC_FS=nYajun Deng
net/ipv4/arp.c:1412:36: warning: unused variable 'arp_seq_ops' [-Wunused-const-variable] Add #ifdef CONFIG_PROC_FS for 'arp_seq_ops'. Fixes: e968b1b3e9b8 ("arp: Remove #ifdef CONFIG_PROC_FS") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25net: ipa: compute proper aggregation limitAlex Elder
The aggregation byte limit for an endpoint is currently computed based on the endpoint's receive buffer size. However, some bytes at the front of each receive buffer are reserved on the assumption that--as with SKBs--it might be useful to insert data (such as headers) before what lands in the buffer. The aggregation byte limit currently doesn't take into account that reserved space, and as a result, aggregation could require space past that which is available in the buffer. Fix this by reducing the size used to compute the aggregation byte limit by the NET_SKB_PAD offset reserved for each receive buffer. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25net: ethernet: mtk_eth_soc: add check for allocation failureDan Carpenter
Check if the kzalloc() failed. Fixes: 804775dfc288 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25ethernet: broadcom/sb1250-mac: remove BUG_ON in sbmac_probe()Yang Yingliang
Replace the BUG_ON() with returning error code to handle the fault more gracefully. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25net: mscc: ocelot: Remove useless codeHaowen Bai
payload only memset but no use at all, so we drop them. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25Merge branch 'mlxsw-line-card-model'David S. Miller
Ido Schimmel says: ==================== mlxsw: extend line card model by devices and info Jiri says: This patchset is extending the line card model by three items: 1) line card devices 2) line card info 3) line card device info First three patches are introducing the necessary changes in devlink core. Then, all three extensions are implemented in mlxsw alongside with selftest. Examples: $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G devices: device 0 device 1 device 2 device 3 $ devlink lc info pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 versions: fixed: hw.revision 0 running: ini.version 4 devices: device 0 versions: running: fw 19.2010.1310 device 1 versions: running: fw 19.2010.1310 device 2 versions: running: fw 19.2010.1310 device 3 versions: running: fw 19.2010.1310 Note that device FW flashing is going to be implemented in the follow-up patchset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25selftests: mlxsw: Check device info on activated line cardJiri Pirko
Once line card is activated, check the device FW version is exposed. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25mlxsw: core_linecards: Expose device FW version over device infoJiri Pirko
Extend MDDQ to obtain FW version of line card device and implement device_info_get() op to fill up the info with that. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25mlxsw: reg: Extend MDDQ device_info by FW version fieldsJiri Pirko
Add FW version fields to MDDQ device_info. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25selftests: mlxsw: Check line card info on provisioned line cardJiri Pirko
Once line card is provisioned, check if HW revision and INI version are exposed. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25mlxsw: core_linecards: Expose HW revision and INI versionJiri Pirko
Implement info_get() to expose HW revision of a linecard and loaded INI version. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25selftests: mlxsw: Check devices on provisioned line cardJiri Pirko
Once line card is provisioned, check the count of devices on it and print them out. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25mlxsw: core_linecards: Probe provisioned line cards for devices and attach themJiri Pirko
In case the line card is provisioned, go over all possible existing devices (gearboxes) on it and attach them, so devlink core is aware of them. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25mlxsw: reg: Extend MDDQ by device_infoJiri Pirko
Extend existing MDDQ register by possibility to query information about devices residing on a line card. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25devlink: introduce line card device info infrastructureJiri Pirko
Extend the line card info message with information (e.g., FW version) about devices found on the line card. Example: $ devlink lc info pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 versions: fixed: hw.revision 0 running: ini.version 4 devices: device 0 versions: running: fw 19.2010.1310 device 1 versions: running: fw 19.2010.1310 device 2 versions: running: fw 19.2010.1310 device 3 versions: running: fw 19.2010.1310 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25devlink: introduce line card info get messageJiri Pirko
Allow the driver to provide per line card info get op to fill-up info, similar to the "devlink dev info". Example: $ devlink lc info pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 versions: fixed: hw.revision 0 running: ini.version 4 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25devlink: introduce line card devices supportJiri Pirko
Line card can contain one or more devices that makes sense to make visible to the user. For example, this can be a gearbox with flash memory, which could be updated. Provide the driver possibility to attach such devices to a line card and expose those to user. Example: $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G devices: device 0 device 1 device 2 device 3 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23Merge branch 'dsa-selftests'David S. Miller
Vladimir Oltean says: ==================== DSA selftests When working on complex new features or reworks it becomes increasingly difficult to ensure there aren't regressions being introduced, and therefore it would be nice if we could go over the functionality we already have and write some tests for it. Verbally I know from Tobias Waldekranz that he has been working on some selftests for DSA, yet I have never seen them, so here I am adding some tests I have written which have been useful for me. The list is by no means complete (it only covers elementary functionality), but it's still good to have as a starting point. I also borrowed some refactoring changes from Joachim Wiberg that he submitted for his "net: bridge: forwarding of unknown IPv4/IPv6/MAC BUM traffic" series, but not the entirety of his selftests. I now think that his selftests have some overlap with bridge_vlan_unaware.sh and bridge_vlan_aware.sh and they should be more tightly integrated with each other - yet I didn't do that either :). Another issue I had with his selftests was that they jumped straight ahead to configure brport flags on br0 (a radical new idea still at RFC status) while we have bigger problems, and we don't have nearly enough coverage for the *existing* functionality. One idea introduced here which I haven't seen before is the symlinking of relevant forwarding selftests to the selftests/drivers/net/<my-driver>/ folder, plus a forwarding.config file. I think there's some value in having things structured this way, since the forwarding dir has so many selftests that aren't relevant to DSA that it is a bit difficult to find the ones that are. While searching for applications that I could use for multicast testing (not my domain of interest/knowledge really), I found Joachim Wiberg's mtools, mcjoin and omping, and I tried them all with various degrees of success. In particular, I was going to use mcjoin, but I faced some issues getting IPv6 multicast traffic to work in a VRF, and I bothered David Ahern about it here: https://lore.kernel.org/netdev/97eaffb8-2125-834e-641f-c99c097b6ee2@gmail.com/t/ It seems that the problem is that this application should use SO_BINDTODEVICE, yet it doesn't. So I ended up patching the bare-bones mtools (msend, mreceive) forked by Joachim from the University of Virginia's Multimedia Networks Group to include IPv6 support, and to use SO_BINDTODEVICE. This is what I'm using now for IPv6. Note that mausezahn doesn't appear to do a particularly good job of supporting IPv6 really, and I needed a program to emit the actual IP_ADD_MEMBERSHIP calls, for dev_mc_add(), so I could test RX filtering. Crafting the IGMP/MLD reports by hand doesn't really do the trick. While extremely bare-bones, the mreceive application now seems to do what I need it to. Feedback appreciated, it is very likely that I could have done things in a better way. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: drivers: dsa: add a subset of forwarding selftestsVladimir Oltean
This adds an initial subset of forwarding selftests which I considered to be relevant for DSA drivers, along with a forwarding.config that makes it easier to run them (disables veth pair creation, makes sure MAC addresses are unique and stable). The intention is to request driver writers to run these selftests during review and make sure that the tests pass, or at least that the problems are known. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: add a test for local_termination.shVladimir Oltean
This tests the capability of switch ports to filter out undesired traffic. Different drivers are expected to have different capabilities here (so some may fail and some may pass), yet the test still has some value, for example to check for regressions. There are 2 kinds of failures, one is when a packet which should have been accepted isn't (and that should be fixed), and the other "failure" (as reported by the test) is when a packet could have been filtered out (for being unnecessary) yet it was received. The bridge driver fares particularly badly at this test: TEST: br0: Unicast IPv4 to primary MAC address [ OK ] TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] reception succeeded, but should have failed TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] reception succeeded, but should have failed TEST: br0: Multicast IPv4 to joined group [ OK ] TEST: br0: Multicast IPv4 to unknown group [FAIL] reception succeeded, but should have failed TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] TEST: br0: Multicast IPv6 to joined group [ OK ] TEST: br0: Multicast IPv6 to unknown group [FAIL] reception succeeded, but should have failed TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] mainly because it does not implement IFF_UNICAST_FLT. Yet I still think having the test (with the failures) is useful in case somebody wants to tackle that problem in the future, to make an easy before-and-after comparison. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: add a no_forwarding.sh testVladimir Oltean
Bombard a standalone switch port with various kinds of traffic to ensure it is really standalone and doesn't leak packets to other switch ports. Also check for switch ports in different bridges, and switch ports in a VLAN-aware bridge but having different pvids. No forwarding should take place in either case. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: add helper for retrieving IPv6 link-local address of ↵Vladimir Oltean
interface Pinging an IPv6 link-local multicast address selects the link-local unicast address of the interface as source, and we'd like to monitor for that in tcpdump. Add a helper to the forwarding library which retrieves the link-local IPv6 address of an interface, to make that task easier. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: add helpers for IP multicast group joins/leavesVladimir Oltean
Extend the forwarding library with calls to some small C programs which join an IP multicast group and send some packets to it. Both IPv4 and IPv6 groups are supported. Use cases range from testing IGMP/MLD snooping, to RX filtering, to multicast routing. Testing multicast traffic using msend/mreceive is intended to be done using tcpdump. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: multiple instances in tcpdump helperJoachim Wiberg
Extend tcpdump_start() & C:o to handle multiple instances. Useful when observing bridge operation, e.g., unicast learning/flooding, and any case of multicast distribution (to these ports but not that one ...). This means the interface argument is now a mandatory argument to all tcpdump_*() functions, hence the changes to the ocelot flower test. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: add TCPDUMP_EXTRA_FLAGS to lib.shJoachim Wiberg
For some use-cases we may want to change the tcpdump flags used in tcpdump_start(). For instance, observing interfaces without the PROMISC flag, e.g. to see what's really being forwarded to the bridge interface. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: forwarding: add option to run tests with stable MAC addressesVladimir Oltean
By default, DSA switch ports inherit their MAC address from the DSA master. This works well for practical situations, but some selftests like bridge_vlan_unaware.sh loop back 2 standalone DSA ports with 2 bridged DSA ports, and require the bridge to forward packets between the standalone ports. Due to the bridge seeing that the MAC DA it needs to forward is present as a local FDB entry (it coincides with the MAC address of the bridge ports), the test packets are not forwarded, but terminated locally on br0. In turn, this makes the ping and ping6 tests fail. Address this by introducing an option to have stable MAC addresses. When mac_addr_prepare is called, the current addresses of the netifs are saved and replaced with 00:01:02:03:04:${netif number}. Then when mac_addr_restore is called at the end of the test, the original MAC addresses are restored. This ensures that the MAC addresses are unique, which makes the test pass even for DSA ports. The usage model is for the behavior to be opt-in via STABLE_MAC_ADDRS, which DSA should set to true, all others behave as before. By hooking the calls to mac_addr_prepare and mac_addr_restore within the forwarding lib itself, we do not need to patch each individual selftest, the only requirement is that pre_cleanup is called. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23Merge branch 'mptcp-tcp-fallback'David S. Miller
Mat Martineau says: ==================== mptcp: TCP fallback for established connections RFC 8684 allows some MPTCP connections to fall back to regular TCP when the MPTCP DSS checksum detects middlebox interference, there is only a single subflow, and there is no unacknowledged out-of-sequence data. When this condition is detected, the stack sends a MPTCP DSS option with an "infinite mapping" to signal that a fallback is happening, and the peers will stop sending MPTCP options in their TCP headers. The Linux MPTCP stack has not yet supported this type of fallback, instead closing the connection when the MPTCP checksum fails. This series adds support for fallback to regular TCP in a more limited scenario, for only MPTCP connections that have never connected additional subflows or transmitted out-of-sequence data. The selftests are also updated to check new MIBs that track infinite mappings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23selftests: mptcp: add infinite map mibs checkGeliang Tang
This patch adds a function chk_infi_nr() to check the mibs for the infinite mapping. Invoke it in chk_join_nr() when validate_checksum is set. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: dump infinite_map field in mptcp_dump_mpextGeliang Tang
In trace event class mptcp_dump_mpext, dump the newly added infinite_map field of struct mptcp_dump_mpext too. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: add mib for infinite map sendingGeliang Tang
This patch adds a new mib named MPTCP_MIB_INFINITEMAPTX, increase it when a infinite mapping has been sent out. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: infinite mapping receivingGeliang Tang
This patch adds the infinite mapping receiving logic. When the infinite mapping is received, set the map_data_len of the subflow to 0. In subflow_check_data_avail(), only reset the subflow when the map_data_len of the subflow is non-zero. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: infinite mapping sendingGeliang Tang
This patch adds the infinite mapping sending logic. Add a new flag send_infinite_map in struct mptcp_subflow_context. Set it true when a single contiguous subflow is in use and the allow_infinite_fallback flag is true in mptcp_pm_mp_fail_received(). In mptcp_sendmsg_frag(), if this flag is true, call the new function mptcp_update_infinite_map() to set the infinite mapping. Add a new flag infinite_map in struct mptcp_ext, set it true in mptcp_update_infinite_map(), and check this flag in a new helper mptcp_check_infinite_map(). In mptcp_update_infinite_map(), set data_len to 0, and clear the send_infinite_map flag, then do fallback. In mptcp_established_options(), use the helper mptcp_check_infinite_map() to let the infinite mapping DSS can be sent out in the fallback mode. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: track and update contiguous data statusGeliang Tang
This patch adds a new member allow_infinite_fallback in mptcp_sock, which is initialized to 'true' when the connection begins and is set to 'false' on any retransmit or successful MP_JOIN. Only do infinite mapping fallback if there is a single subflow AND there have been no retransmissions AND there have never been any MP_JOINs. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: add the fallback checkGeliang Tang
This patch adds the fallback check in subflow_check_data_avail(). Only do the fallback when the msk hasn't fallen back yet. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: don't send RST for single subflowGeliang Tang
When a bad checksum is detected and a single subflow is in use, don't send RST + MP_FAIL, send data_ack + MP_FAIL instead. So invoke tcp_send_active_reset() only when mptcp_has_another_subflow() is true. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-22net: hns3: Fix spelling mistake "actvie" -> "active"Colin Ian King
There is a spelling mistake in a netdev_info message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220421085546.321792-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22tsnep: Remove useless null check before call of_node_put()Haowen Bai
No need to add null check before call of_node_put(), since the implementation of of_node_put() has done it. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Link: https://lore.kernel.org/r/1650509283-26168-1-git-send-email-baihaowen@meizu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22Merge branch 'add-ethtool-sqi-support-for-lan87xx-t1-phy'Jakub Kicinski
Arun Ramadoss says: ==================== add ethtool SQI support for LAN87xx T1 Phy This patch series add the Signal Quality Index measurement for the LAN87xx and LAN937x T1 phy. Updated the maintainers file for microchip_t1.c. ==================== Link: https://lore.kernel.org/r/20220420152016.9680-1-arun.ramadoss@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22MAINTAINERS: Add maintainers for Microchip T1 Phy driverArun Ramadoss
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22net: phy: LAN87xx: add ethtool SQI supportArun Ramadoss
This patch add the support for measuring Signal Quality Index for LAN87xx and LAN937x T1 Phy. It uses the SQI Method 5 for obtaining the values. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22mlxsw: core_linecards: Fix size of array element during ini_files allocationJiri Pirko
types_info->ini_files is an array of pointers to struct mlxsw_linecard_ini_file. Fix the kmalloc_array() argument to be of a size of a pointer. Addresses-Coverity: ("Incorrect expression (SIZEOF_MISMATCH)") Fixes: b217127e5e4e ("mlxsw: core_linecards: Add line card objects and implement provisioning") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220420142007.3041173-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-23selftests/bpf: Switch fexit_stress to bpf_link_create() APIAndrii Nakryiko
Use bpf_link_create() API in fexit_stress test to attach FEXIT programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Kui-Feng Lee <kuifeng@fb.com> Link: https://lore.kernel.org/bpf/20220421033945.3602803-4-andrii@kernel.org