linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-12-15	ipmr: support IP_PKTINFO on cache report IGMP msg	Leone Fernando
	In order to support IP_PKTINFO on those packets, we need to call ipv4_pktinfo_prepare. When sending mrouted/pimd daemons a cache report IGMP msg, it is unnecessary to set dst on the newly created skb. It used to be necessary on older versions until commit d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") which changed the way IP_PKTINFO struct is been retrieved. Changes from v1: 1. Undo changes in ipv4_pktinfo_prepare function. use it directly and copy the control block. Fixes: d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") Signed-off-by: Leone Fernando <leone4fernando@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	net: mana: add msix index sharing between EQs	Konstantin Taranov
	This patch allows to assign and poll more than one EQ on the same msix index. It is achieved by introducing a list of attached EQs in each IRQ context. It also removes the existing msix_index map that tried to ensure that there is only one EQ at each msix_index. This patch exports symbols for creating EQs from other MANA kernel modules. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	octeontx2-af: Fix multicast/mirror group lock/unlock issue	Suman Ghosh
	As per the existing implementation, there exists a race between finding a multicast/mirror group entry and deleting that entry. The group lock was taken and released independently by rvu_nix_mcast_find_grp_elem() function. Which is incorrect and group lock should be taken during the entire operation of group updation/deletion. This patch fixes the same. Fixes: 51b2804c19cd ("octeontx2-af: Add new mbox to support multicast/mirror offload") Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	Merge tag 'mlx5-updates-2023-12-13' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-12-13 Preparation for mlx5e socket direct feature. Socket direct will allow multiple PF devices attached to different NUMA nodes but sharing the same physical port. The following series is a small refactoring series in preparation to support socket direct in the following submission. Highlights: - Define required device registers and bits related to socket direct - Flow steering re-arrangements - Generalize TX objects (TISs) and store them in a common object, will be useful in the next series for per function object management. - Decouple raw CQ objects from their parent netdev priv - Prepare devcom for Socket Direct device group discovery. Please see the individual patches for more information. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	Merge branch 'net-phy-rust'	David S. Miller
	FUJITA Tomonori says: ==================== Rust abstractions for network PHY drivers No functional change since v10; only comment and commit log updates. This patchset adds Rust abstractions for phylib. It doesn't fully cover the C APIs yet but I think that it's already useful. I implement two PHY drivers (Asix AX88772A PHYs and Realtek Generic FE-GE). Seems they work well with real hardware. The first patch introduces Rust bindings for phylib. The second patch adds a macro to declare a kernel module for PHYs drivers. The third adds the Rust ETHERNET PHY LIBRARY entry to MAINTAINERS file; adds the binding file and me as a maintainer (as Andrew Lunn suggested) with Trevor Gross as a reviewer. The last patch introduces the Rust version of Asix PHY driver, drivers/net/phy/ax88796b.c. The features are equivalent to the C version. You can choose C (by default) or Rust version on kernel configuration. v11: - adds Andrew, Alice, and Trevor's Reviewed-by - comment update v10: https://lore.kernel.org/netdev/20231210234924.1453917-1-fujita.tomonori@gmail.com/T/ - adds Trevor's SoB to the third patch - adds Benno's Reviewed-by to the second patch v9: https://lore.kernel.org/netdev/20231205.124531.842372711631366729.fujita.tomonori@gmail.com/T/ - adds a workaround to access to a bit field in phy_device - fixes a comment typo v8: https://lore.kernel.org/netdev/20231123050412.1012252-1-fujita.tomonori@gmail.com/ - updates the safety comments on Device and its related code - uses _phy_start_aneg instead of phy_start_aneg - drops the patch for enum synchronization - moves Sync From Registration to DriverVTable - fixes doctest errors - minor cleanups v7: https://lore.kernel.org/netdev/20231026001050.1720612-1-fujita.tomonori@gmail.com/T/ - renames get_link() to is_link_up() - improves the macro format - improves the commit log in the third patch - improves comments v6: https://lore.kernel.org/netdev/20231025.090243.1437967503809186729.fujita.tomonori@gmail.com/T/ - improves comments - makes the requirement of phy_drivers_register clear - fixes Makefile of the third patch v5: https://lore.kernel.org/all/20231019.094147.1808345526469629486.fujita.tomonori@gmail.com/T/ - drops the rustified-enum option, writes match by hand; no risk of UB - adds Miguel's patch for enum checking - moves CONFIG_RUST_PHYLIB_ABSTRACTIONS to drivers/net/phy/Kconfig - adds a new entry for this abstractions in MAINTAINERS - changes some of Device's methods to take &mut self - comment improvment v4: https://lore.kernel.org/netdev/20231012125349.2702474-1-fujita.tomonori@gmail.com/T/ - split the core patch - making Device::from_raw() private - comment improvement with code update - commit message improvement - avoiding using bindings::phy_driver in public functions - using an anonymous constant in module_phy_driver macro v3: https://lore.kernel.org/netdev/20231011.231607.1747074555988728415.fujita.tomonori@gmail.com/T/ - changes the base tree to net-next from rust-next - makes this feature optional; only enabled with CONFIG_RUST_PHYLIB_BINDINGS=y - cosmetic code and comment improvement - adds copyright v2: https://lore.kernel.org/netdev/20231006094911.3305152-2-fujita.tomonori@gmail.com/T/ - build failure fix - function renaming v1: https://lore.kernel.org/netdev/20231002085302.2274260-3-fujita.tomonori@gmail.com/T/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	net: phy: add Rust Asix PHY driver	FUJITA Tomonori
	This is the Rust implementation of drivers/net/phy/ax88796b.c. The features are equivalent. You can choose C or Rust version kernel configuration. Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Trevor Gross <tmgross@umich.edu> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	MAINTAINERS: add Rust PHY abstractions for ETHERNET PHY LIBRARY	FUJITA Tomonori
	Adds me as a maintainer and Trevor as a reviewer. The files are placed at rust/kernel/ directory for now but the files are likely to be moved to net/ directory once a new Rust build system is implemented. Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Signed-off-by: Trevor Gross <tmgross@umich.edu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	rust: net::phy add module_phy_driver macro	FUJITA Tomonori
	This macro creates an array of kernel's `struct phy_driver` and registers it. This also corresponds to the kernel's `MODULE_DEVICE_TABLE` macro, which embeds the information for module loading into the module binary file. A PHY driver should use this macro. Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Trevor Gross <tmgross@umich.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	rust: core abstractions for network PHY drivers	FUJITA Tomonori
	This patch adds abstractions to implement network PHY drivers; the driver registration and bindings for some of callback functions in struct phy_driver and many genphy_ functions. This feature is enabled with CONFIG_RUST_PHYLIB_ABSTRACTIONS=y. This patch enables unstable const_maybe_uninit_zeroed feature for kernel crate to enable unsafe code to handle a constant value with uninitialized data. With the feature, the abstractions can initialize a phy_driver structure with zero easily; instead of initializing all the members by hand. It's supposed to be stable in the not so distant future. Link: https://github.com/rust-lang/rust/pull/116218 Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15	net: stmmac: don't create a MDIO bus if unnecessary	Andrew Halaney
	Currently a MDIO bus is created if the devicetree description is either: 1. Not fixed-link 2. fixed-link but contains a MDIO bus as well The "1" case above isn't always accurate. If there's a phy-handle, it could be referencing a phy on another MDIO controller's bus[1]. In this case, where the MDIO bus is not described at all, currently stmmac will make a MDIO bus and scan its address space to discover phys (of which there are none). This process takes time scanning a bus that is known to be empty, delaying time to complete probe. There are also a lot of upstream devicetrees[2] that expect a MDIO bus to be created, scanned for phys, and the first one found connected to the MAC. This case can be inferred from the platform description by not having a phy-handle && not being fixed-link. This hits case "1" in the current driver's logic, and must be handled in any logic change here since it is a valid legacy dt-binding. Let's improve the logic to create a MDIO bus if either: - Devicetree contains a MDIO bus - !fixed-link && !phy-handle (legacy handling) This way the case where no MDIO bus should be made is handled, as well as retaining backwards compatibility with the valid cases. Below devicetree snippets can be found that explain some of the cases above more concretely. Here's[0] a devicetree example where the MAC is both fixed-link and driving a switch on MDIO (case "2" above). This needs a MDIO bus to be created: &fec1 { phy-mode = "rmii"; fixed-link { speed = <100>; full-duplex; }; mdio1: mdio { switch0: switch0@0 { compatible = "marvell,mv88e6190"; pinctrl-0 = <&pinctrl_gpio_switch0>; }; }; }; Here's[1] an example where there is no MDIO bus or fixed-link for the ethernet1 MAC, so no MDIO bus should be created since ethernet0 is the MDIO master for ethernet1's phy: &ethernet0 { phy-mode = "sgmii"; phy-handle = <&sgmii_phy0>; mdio { compatible = "snps,dwmac-mdio"; sgmii_phy0: phy@8 { compatible = "ethernet-phy-id0141.0dd4"; reg = <0x8>; device_type = "ethernet-phy"; }; sgmii_phy1: phy@a { compatible = "ethernet-phy-id0141.0dd4"; reg = <0xa>; device_type = "ethernet-phy"; }; }; }; &ethernet1 { phy-mode = "sgmii"; phy-handle = <&sgmii_phy1>; }; Finally there's descriptions like this[2] which don't describe the MDIO bus but expect it to be created and the whole address space scanned for a phy since there's no phy-handle or fixed-link described: &gmac { phy-supply = <&vcc_lan>; phy-mode = "rmii"; snps,reset-gpio = <&gpio3 RK_PB4 GPIO_ACTIVE_HIGH>; snps,reset-active-low; snps,reset-delays-us = <0 10000 1000000>; }; [0] https://elixir.bootlin.com/linux/v6.5-rc5/source/arch/arm/boot/dts/nxp/vf/vf610-zii-ssmb-dtu.dts [1] https://elixir.bootlin.com/linux/v6.6-rc5/source/arch/arm64/boot/dts/qcom/sa8775p-ride.dts [2] https://elixir.bootlin.com/linux/v6.6-rc5/source/arch/arm64/boot/dts/rockchip/rk3368-r88.dts#L164 Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Co-developed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-14	bpf: xdp: Register generic_kfunc_set with XDP programs	Daniel Xu
	Registering generic_kfunc_set with XDP programs enables some of the newer BPF features inside XDP -- namely tree based data structures and BPF exceptions. The current motivation for this commit is to enable assertions inside XDP bpf progs. Assertions are a standard and useful tool to encode intent. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/d07d4614b81ca6aada44fcb89bb6b618fb66e4ca.1702594357.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	i40e: remove fake support of rx-frames-irq	Jason Xing
	Since we never support this feature for I40E driver, we don't have to display the value when using 'ethtool -c eth0'. Before this patch applied, the rx-frames-irq is 256 which is consistent with tx-frames-irq. Apparently it could mislead users. Signed-off-by: Jason Xing <kernelxing@tencent.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20231213184406.1306602-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge branch 'mdio-mux-cleanup'	Jakub Kicinski
	Vladimir Oltean says: ==================== MDIO mux cleanup This small patch set resolves some technical debt in the MDIO mux driver which was discovered during the investigation for commit 1f9f2143f24e ("net: mdio-mux: fix C45 access returning -EIO after API change"). The patches have been sitting for 2 months in the NXP SDK kernel and haven't caused issues. ==================== Link: https://lore.kernel.org/r/20231213152712.320842-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	net: mdio-mux: be compatible with parent buses which only support C45	Vladimir Oltean
	After the mii_bus API conversion to a split read() / read_c45(), there might be MDIO parent buses which only populate the read_c45() and write_c45() function pointers but not the C22 variants. We haven't seen these in the wild paired with MDIO multiplexers, but Andrew points out we should treat the corner case. Link: https://lore.kernel.org/netdev/4ccd7dc9-b611-48aa-865f-68d3a1327ce8@lunn.ch/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213152712.320842-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	net: mdio-mux: show errors on probe failure	Vladimir Oltean
	Showing the precise error symbols can help debugging probe issues, such as the recent -EIO error in of_mdiobus_register() caused by the lack of bus->read_c45() and bus->write_c45() methods. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213152712.320842-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	net: atlantic: eliminate double free in error handling logic	Igor Russkikh
	Driver has a logic leak in ring data allocation/free, where aq_ring_free could be called multiple times on same ring, if system is under stress and got memory allocation error. Ring pointer was used as an indicator of failure, but this is not correct since only ring data is allocated/deallocated. Ring itself is an array member. Changing ring allocation functions to return error code directly. This simplifies error handling and eliminates aq_ring_free on higher layer. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Link: https://lore.kernel.org/r/20231213095044.23146-1-irusskikh@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge branch 'convert-net-selftests-to-run-in-unique-namespace-part-3'	Jakub Kicinski
	Hangbin Liu says: ==================== Convert net selftests to run in unique namespace (Part 3) Here is the 3rd part of converting net selftests to run in unique namespace. This part converts all srv6 and fib tests. Note that patch 06 is a fix for testing fib_nexthop_multiprefix. Here is the part 1 link: https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com And part 2 link: https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com ==================== Link: https://lore.kernel.org/r/20231213060856.4030084-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fdb_flush.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. # ./fdb_flush.sh TEST: vx10: Expected 5 FDB entries, got 5 [ OK ] TEST: vx20: Expected 5 FDB entries, got 5 [ OK ] ... TEST: vx10: Expected 5 FDB entries, got 5 [ OK ] TEST: Test entries with dst 192.0.2.1 [ OK ] Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20231213060856.4030084-14-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fib_tests.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. # ./fib_tests.sh Single path route test Start point TEST: IPv4 fibmatch [ OK ] ... Fib6 garbage collection test TEST: ipv6 route garbage collection [ OK ] IPv4 multipath list receive tests TEST: Multipath route hit ratio (1.00) [ OK ] IPv6 multipath list receive tests TEST: Multipath route hit ratio (1.00) [ OK ] Tests passed: 225 Tests failed: 0 Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20231213060856.4030084-13-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fib_rule_tests.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. ]# ./fib_rule_tests.sh TEST: rule6 check: oif redirect to table [ OK ] ... TEST: rule4 dsfield tcp connect (dsfield 0x07) [ OK ] Tests passed: 66 Tests failed: 0 Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-12-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fib-onlink-tests.sh to run it in unique namespace	Hangbin Liu
	Remove PEER_CMD, which is not used in this test Here is the test result after conversion. ]# ./fib-onlink-tests.sh Error: ipv4: FIB table does not exist. Flush terminated Error: ipv6: FIB table does not exist. Flush terminated ######################################## Configuring interfaces ... TEST: Gateway resolves to wrong nexthop device - VRF [ OK ] Tests passed: 38 Tests failed: 0 Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-11-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fib_nexthops.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. ]# ./fib_nexthops.sh Basic functional tests ---------------------- TEST: List with nothing defined [ OK ] TEST: Nexthop get on non-existent id [ OK ] ... TEST: IPv6 resilient nexthop group torture test [ OK ] Tests passed: 234 Tests failed: 0 Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20231213060856.4030084-10-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fib_nexthop_nongw.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. ]# ./fib_nexthop_nongw.sh TEST: nexthop: get route with nexthop without gw [ OK ] TEST: nexthop: ping through nexthop without gw [ OK ] Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-9-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fib_nexthop_multiprefix to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. ]# ./fib_nexthop_multiprefix.sh TEST: IPv4: host 0 to host 1, mtu 1300 [ OK ] TEST: IPv6: host 0 to host 1, mtu 1300 [ OK ] TEST: IPv4: host 0 to host 2, mtu 1350 [ OK ] TEST: IPv6: host 0 to host 2, mtu 1350 [ OK ] TEST: IPv4: host 0 to host 3, mtu 1400 [ OK ] TEST: IPv6: host 0 to host 3, mtu 1400 [ OK ] TEST: IPv4: host 0 to host 1, mtu 1300 [ OK ] TEST: IPv6: host 0 to host 1, mtu 1300 [ OK ] TEST: IPv4: host 0 to host 2, mtu 1350 [ OK ] TEST: IPv6: host 0 to host 2, mtu 1350 [ OK ] TEST: IPv4: host 0 to host 3, mtu 1400 [ OK ] TEST: IPv6: host 0 to host 3, mtu 1400 [ OK ] Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-8-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: fix grep checking for fib_nexthop_multiprefix	Hangbin Liu
	When running fib_nexthop_multiprefix test I saw all IPv6 test failed. e.g. ]# ./fib_nexthop_multiprefix.sh TEST: IPv4: host 0 to host 1, mtu 1300 [ OK ] TEST: IPv6: host 0 to host 1, mtu 1300 [FAIL] With -v it shows COMMAND: ip netns exec h0 /usr/sbin/ping6 -s 1350 -c5 -w5 2001:db8:101::1 PING 2001:db8:101::1(2001:db8:101::1) 1350 data bytes From 2001:db8:100::64 icmp_seq=1 Packet too big: mtu=1300 --- 2001:db8:101::1 ping statistics --- 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms Route get 2001:db8:101::1 via 2001:db8:100::64 dev eth0 src 2001:db8:100::1 metric 1024 expires 599sec mtu 1300 pref medium Searching for: 2001:db8:101::1 from :: via 2001:db8:100::64 dev eth0 src 2001:db8:100::1 .* mtu 1300 The reason is when CONFIG_IPV6_SUBTREES is not enabled, rt6_fill_node() will not put RTA_SRC info. After fix: ]# ./fib_nexthop_multiprefix.sh TEST: IPv4: host 0 to host 1, mtu 1300 [ OK ] TEST: IPv6: host 0 to host 1, mtu 1300 [ OK ] Fixes: 735ab2f65dce ("selftests: Add test with multiple prefixes using single nexthop") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-7-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert fcnal-test.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. There are some failures, but it also exists on my system without this patch. So it's not affectec by this patch and I will check the reason later. ]# time ./fcnal-test.sh /usr/bin/which: no nettest in (/root/.local/bin:/root/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin) ########################################################################### IPv4 ping ########################################################################### ################################################################# No VRF SYSCTL: net.ipv4.raw_l3mdev_accept=0 TEST: ping out - ns-B IP [ OK ] TEST: ping out, device bind - ns-B IP [ OK ] TEST: ping out, address bind - ns-B IP [ OK ] ... ################################################################# SNAT on VRF TEST: IPv4 TCP connection over VRF with SNAT [ OK ] TEST: IPv6 TCP connection over VRF with SNAT [ OK ] Tests passed: 893 Tests failed: 21 real 52m48.178s user 0m34.158s sys 1m42.976s BTW, this test needs a really long time. So expand the timeout to 1h. Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-6-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert srv6_end_dt6_l3vpn_test.sh to run it in unique namespace	Hangbin Liu
	As the name \${rt-${rt}} may make reader confuse, convert the variable hs/rt in setup_rt/hs to hid, rid. Here is the test result after conversion. ]# ./srv6_end_dt6_l3vpn_test.sh ################################################################################ TEST SECTION: IPv6 routers connectivity test ################################################################################ TEST: Routers connectivity: rt-1 -> rt-2 [ OK ] TEST: Routers connectivity: rt-2 -> rt-1 [ OK ] ... TEST: Hosts isolation: hs-t200-4 -X-> hs-t100-2 [ OK ] Tests passed: 18 Tests failed: 0 Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-5-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert srv6_end_dt4_l3vpn_test.sh to run it in unique namespace	Hangbin Liu
	As the name \${rt-${rt}} may make reader confuse, convert the variable hs/rt in setup_rt/hs to hid, rid. Here is the test result after conversion. ]# ./srv6_end_dt4_l3vpn_test.sh ################################################################################ TEST SECTION: IPv6 routers connectivity test ################################################################################ TEST: Routers connectivity: rt-1 -> rt-2 [ OK ] TEST: Routers connectivity: rt-2 -> rt-1 [ OK ] ... TEST: Hosts isolation: hs-t200-4 -X-> hs-t100-2 [ OK ] Tests passed: 18 Tests failed: 0 Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-4-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: convert srv6_end_dt46_l3vpn_test.sh to run it in unique namespace	Hangbin Liu
	As the name \${rt-${rt}} may make reader confuse, convert the variable hs/rt in setup_rt/hs to hid, rid. Here is the test result after conversion. ]# ./srv6_end_dt46_l3vpn_test.sh ################################################################################ TEST SECTION: IPv6 routers connectivity test ################################################################################ TEST: Routers connectivity: rt-1 -> rt-2 [ OK ] TEST: Routers connectivity: rt-2 -> rt-1 [ OK ] ... TEST: IPv4 Hosts isolation: hs-t200-4 -X-> hs-t100-2 [ OK ] Tests passed: 34 Tests failed: 0 Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-3-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	selftests/net: add variable NS_LIST for lib.sh	Hangbin Liu
	Add a global variable NS_LIST to store all the namespaces that setup_ns created, so the caller could call cleanup_all_ns() instead of remember all the netns names when using cleanup_ns(). Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20231213060856.4030084-2-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	page_pool: fix typos and punctuation	Randy Dunlap
	Correct spelling (s/and/any) and a run-on sentence. Spell out "multi". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20231213043650.12672-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	net: skbuff: fix spelling errors	Randy Dunlap
	Correct spelling as reported by codespell. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231213043511.10357-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge branch 'net-mdio-mdio-bcm-unimac-optimizations-and-clean-up'	Jakub Kicinski
	Justin Chen says: ==================== net: mdio: mdio-bcm-unimac: optimizations and clean up Clean up mdio poll to use read_poll_timeout() and reduce the potential poll time. ==================== Link: https://lore.kernel.org/r/20231213222744.2891184-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	net: mdio: mdio-bcm-unimac: Use read_poll_timeout	Justin Chen
	Simplify the code by using read_poll_timeout(). Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213222744.2891184-3-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	net: mdio: mdio-bcm-unimac: Delay before first poll	Justin Chen
	With a clock interval of 400 nsec and a 64 bit transactions (32 bit preamble & 16 bit control & 16 bit data), it is reasonable to assume the mdio transaction will take 25.6 usec. Add a 30 usec delay before the first poll to reduce the chance of a 1000-2000 usec sleep. Reduce the timeout from 1000ms to 100ms as it is unlikely for the bus to take this long. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213222744.2891184-2-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge branch 'tools-ynl-gen-fill-in-the-gaps-in-support-of-legacy-families'	Jakub Kicinski
	Jakub Kicinski says: ==================== tools: ynl-gen: fill in the gaps in support of legacy families Fill in the gaps in YNL C code gen so that we can generate user space code for all genetlink families for which we have specs. The two major changes we need are support for fixed headers and support for recursive nests. For fixed header support - place the struct for the fixed header directly in the request struct (and don't bother generating access helpers). The member of a fixed header can't be too complex, and also are by definition not optional so the user has to fill them in. The YNL core needs a bit of a tweak to understand that the attrs may now start at a fixed offset, which is not necessarily equal to sizeof(struct genlmsghdr). Dealing with nested sets is much harder. Previously we'd gen the nested structs as: struct outer { struct inner inner; }; If structs are recursive (e.g. inner contains outer again) we must break this chain and allocate one of the structs dynamically (store a pointer rather than full struct). ==================== Link: https://lore.kernel.org/r/20231213231432.2944749-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: print prototypes for recursive stuff	Jakub Kicinski
	We avoid printing forward declarations and prototypes for most types by sorting things topologically. But if structs nest we do need the forward declarations, there's no other way. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: store recursive nests by a pointer	Jakub Kicinski
	To avoid infinite nesting store recursive structs by pointer. If recursive struct is placed in the op directly - the first instance can be stored by value. That makes the code much less of a pain for majority of practical uses. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: re-sort ignoring recursive nests	Jakub Kicinski
	We try to keep the structures and helpers "topologically sorted", to avoid forward declarations. When recursive nests are at play we need to sort twice, because structs which end up being marked as recursive will get a full set of forward declarations, so we should ignore them for the purpose of sorting. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: record information about recursive nests	Jakub Kicinski
	Track which nests are recursive. Non-recursive nesting gets rendered in C as directly nested structs. For recursive ones we need to put a pointer in, rather than full struct. Track this information, no change to generated code, yet. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: fill in implementations for TypeUnused	Jakub Kicinski
	Fill in more empty handlers for TypeUnused. When 'unused' attr gets specified in a nested set we have to cleanly skip it during code generation. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: support fixed headers in genetlink	Jakub Kicinski
	Support genetlink families using simple fixed headers. Assume fixed header is identical for all ops of the family for now. Fixed headers are added to the request and reply structs as a _hdr member, and copied to/from netlink messages appropriately. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: use enum user type for members and args	Jakub Kicinski
	Commit 30c902001534 ("tools: ynl-gen: use enum name from the spec") added pre-cooked user type for enums. Use it to fix ignoring enum-name provided in the spec. This changes a type in struct ethtool_tunnel_udp_entry but is generally inconsequential for current families. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	tools: ynl-gen: add missing request free helpers for dumps	Jakub Kicinski
	The code gen generates a prototype for dump request free in the header, but no implementation in the source. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231213231432.2944749-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge branch 'bpf-fs-mount-options-parsing-follow-ups'	Alexei Starovoitov
	Andrii Nakryiko says: ==================== BPF FS mount options parsing follow ups Original BPF token patch set ([0]) added delegate_xxx mount options which supported only special "any" value and hexadecimal bitmask. This patch set attempts to make specifying and inspecting these mount options more human-friendly by supporting string constants matching corresponding bpf_cmd, bpf_map_type, bpf_prog_type, and bpf_attach_type enumerators. This implementation relies on BTF information to find all supported symbolic names. If kernel wasn't built with BTF, BPF FS will still support "any" and hex-based mask. [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=805707&state=* v1->v2: - strip BPF_, BPF_MAP_TYPE_, and BPF_PROG_TYPE_ prefixes, do case-insensitive comparison, normalize to lower case (Alexei). ==================== Link: https://lore.kernel.org/r/20231214225016.1209867-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	selftests/bpf: utilize string values for delegate_xxx mount options	Andrii Nakryiko
	Use both hex-based and string-based way to specify delegate mount options for BPF FS. Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231214225016.1209867-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	bpf: support symbolic BPF FS delegation mount options	Andrii Nakryiko
	Besides already supported special "any" value and hex bit mask, support string-based parsing of delegation masks based on exact enumerator names. Utilize BTF information of `enum bpf_cmd`, `enum bpf_map_type`, `enum bpf_prog_type`, and `enum bpf_attach_type` types to find supported symbolic names (ignoring __MAX_xxx guard values and stripping repetitive prefixes like BPF_ for cmd and attach types, BPF_MAP_TYPE_ for maps, and BPF_PROG_TYPE_ for prog types). The case doesn't matter, but it is normalized to lower case in mount option output. So "PROG_LOAD", "prog_load", and "MAP_create" are all valid values to specify for delegate_cmds options, "array" is among supported for map types, etc. Besides supporting string values, we also support multiple values specified at the same time, using colon (':') separator. There are corresponding changes on bpf_show_options side to use known values to print them in human-readable format, falling back to hex mask printing, if there are any unrecognized bits. This shouldn't be necessary when enum BTF information is present, but in general we should always be able to fall back to this even if kernel was built without BTF. As mentioned, emitted symbolic names are normalized to be all lower case. Example below shows various ways to specify delegate_cmds options through mount command and how mount options are printed back: 12/14 14:39:07.604 vmuser@archvm:~/local/linux/tools/testing/selftests/bpf $ mount \| rg token $ sudo mkdir -p /sys/fs/bpf/token $ sudo mount -t bpf bpffs /sys/fs/bpf/token \ -o delegate_cmds=prog_load:MAP_CREATE \ -o delegate_progs=kprobe \ -o delegate_attachs=xdp $ mount \| grep token bpffs on /sys/fs/bpf/token type bpf (rw,relatime,delegate_cmds=map_create:prog_load,delegate_progs=kprobe,delegate_attachs=xdp) Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231214225016.1209867-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/iavf/iavf_ethtool.c 3a0b5a2929fd ("iavf: Introduce new state machines for flow director") 95260816b489 ("iavf: use iavf_schedule_aq_request() helper") https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/ drivers/net/ethernet/broadcom/bnxt/bnxt.c c13e268c0768 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic") c2f8063309da ("bnxt_en: Refactor RX VLAN acceleration logic.") a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7") 1c7fd6ee2fe4 ("bnxt_en: Rename some macros for the P5 chips") https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c bd6781c18cb5 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()") 84793a499578 ("bnxt_en: Skip nic close/open when configuring tstamp filters") https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/ drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c 3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled") cecf44ea1a1f ("net/mlx5: Allow sync reset flow when BF MGT interface device is present") https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge branch 'add-bpf_xdp_get_xfrm_state-kfunc'	Alexei Starovoitov
	Daniel Xu says: ==================== Add bpf_xdp_get_xfrm_state() kfunc This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and xfrm_state_put(). The intent is to support software RSS (via XDP) for the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed on (hopefully) reproducible AWS testbeds indicate that single tunnel pcpu ipsec can reach line rate on 100G ENA nics. Note this patchset only tests/shows generic xfrm_state access. The "secret sauce" (if you can really even call it that) involves accessing a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is available here [1]. [0]: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/03/ [1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce2863ecb27/xdp-bench/xdp_redirect_cpumap.bpf.c#L385-L406 Changes from v5: * Improve kfunc doc comments * Remove extraneous replay-window setting on selftest reverse path * Squash two kfunc commits into one * Rebase to bpf-next to pick up bitfield write patches * Remove testing of opts.error in selftest prog Changes from v4: * Fixup commit message for selftest * Set opts->error -ENOENT for !x * Revert single file xfrm + bpf Changes from v3: * Place all xfrm bpf integrations in xfrm_bpf.c * Avoid using nval as a temporary * Rebase to bpf-next * Remove extraneous __failure_unpriv annotation for verifier tests Changes from v2: * Fix/simplify BPF_CORE_WRITE_BITFIELD() algorithm * Added verifier tests for bitfield writes * Fix state leakage across test_tunnel subtests Changes from v1: * Move xfrm tunnel tests to test_progs * Fix writing to opts->error when opts is invalid * Use __bpf_kfunc_start_defs() * Remove unused vxlanhdr definition * Add and use BPF_CORE_WRITE_BITFIELD() macro * Make series bisect clean Changes from RFCv2: * Rebased to ipsec-next * Fix netns leak Changes from RFCv1: * Add Antony's commit tags * Add KF_ACQUIRE and KF_RELEASE semantics ==================== Reviewed-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/cover.1702593901.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()	Daniel Xu
	This commit extends test_tunnel selftest to test the new XDP xfrm state lookup kfunc. Co-developed-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/e704e9a4332e3eac7b458e4bfdec8fcc6984cdb6.1702593901.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>