Age  Commit message  Author
2023-08-16inet: move inet->defer_connect to inet->inet_flagsEric Dumazet
Make room in struct inet_sock by removing this bit field, using one available bit in inet_flags instead. Also move local_port_range to fill the resulting hole, saving 8 bytes on 64bit arches. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->bind_address_no_port to inet->inet_flagsEric Dumazet
IP_BIND_ADDRESS_NO_PORT socket option can now be set/read without locking the socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->nodefrag to inet->inet_flagsEric Dumazet
IP_NODEFRAG socket option can now be set/read without locking the socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->is_icsk to inet->inet_flagsEric Dumazet
We move single bit fields to inet->inet_flags to avoid races. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->transparent to inet->inet_flagsEric Dumazet
IP_TRANSPARENT socket option can now be set/read without locking the socket. v2: removed unused issk variable in mptcp_setsockopt_sol_ip_set_transparent() v4: rebased after commit 3f326a821b99 ("mptcp: change the mpc check helper to return a sk") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->mc_all to inet->inet_flagsEric Dumazet
IP_MULTICAST_ALL socket option can now be set/read without locking the socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->mc_loop to inet->inet_flagsEric Dumazet
IP_MULTICAST_LOOP socket option can now be set/read without locking the socket. v3: fix build bot error reported in ipvs set_mcast_loop() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->hdrincl to inet->inet_flagsEric Dumazet
IP_HDRINCL socket option can now be set/read without locking the socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->freebind to inet->inet_flagsEric Dumazet
IP_FREEBIND socket option can now be set/read without locking the socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->recverr_rfc4884 to inet->inet_flagsEric Dumazet
IP_RECVERR_RFC4884 socket option can now be set/read without locking the socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: move inet->recverr to inet->inet_flagsEric Dumazet
IP_RECVERR socket option can now be set/get without locking the socket. This patch potentially avoids data races around inet->recverr. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: set/get simple options locklesslyEric Dumazet
Now that we have inet->inet_flags, we can set the following options without having to hold the socket lock: IP_PKTINFO, IP_RECVTTL, IP_RECVTOS, IP_RECVOPTS, IP_RETOPTS, IP_PASSSEC, IP_RECVORIGDSTADDR, IP_RECVFRAGSIZE. ip_sock_set_pktinfo() no longer holds the socket lock. Similarly we can get the following options without holding the socket lock: IP_PKTINFO, IP_RECVTTL, IP_RECVTOS, IP_RECVOPTS, IP_RETOPTS, IP_PASSSEC, IP_RECVORIGDSTADDR, IP_CHECKSUM, IP_RECVFRAGSIZE. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16inet: introduce inet->inet_flagsEric Dumazet
Various inet fields are currently racy. do_ip_setsockopt() and do_ip_getsockopt() are mostly holding the socket lock, but some (fast) paths do not. Use a new inet->inet_flags to hold atomic bits in the series. Remove inet->cmsg_flags, and use instead 9 bits from inet_flags. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
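The idea of folding several one-bit fields into a single flags word can be illustrated with a small Python sketch. This is not kernel code: the bit names and positions are hypothetical, and the lock only stands in for the atomic bit operations that a flags word makes possible.

```python
import threading

# Hypothetical bit positions, chosen only for this illustration.
FLAG_PKTINFO = 0
FLAG_RECVERR = 1
FLAG_FREEBIND = 2


class FlagsWord:
    """One integer holding many boolean options, updated one bit at a time."""

    def __init__(self):
        self._word = 0
        # Stand-in for an atomic read-modify-write on the flags word.
        self._lock = threading.Lock()

    def assign_bit(self, bit, value):
        """Set or clear a single bit without disturbing the others."""
        with self._lock:
            if value:
                self._word |= 1 << bit
            else:
                self._word &= ~(1 << bit)

    def test_bit(self, bit):
        """Read a single bit; no lock is needed for a plain read."""
        return bool(self._word & (1 << bit))
```

Because each option owns exactly one bit, changing one option cannot clobber a neighbouring one, which is the property the series relies on when it drops the socket lock for these options.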
2023-08-16ipv6: fix indentation of a config attributePrasad Pandit
Fix indentation of a type attribute of IPV6_VTI config entry. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16Merge branch 'redundant-of_match_ptr'David S. Miller
Ruan Jinjie says: ==================== net: Remove redundant of_match_ptr() macro Since these net drivers depend on CONFIG_OF, there is no need to wrap the macro of_match_ptr() here. Changes in v3: - Collect responses from v1 and v2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16wlcore: spi: Remove redundant of_match_ptr()Ruan Jinjie
The driver depends on CONFIG_OF, it is not necessary to use of_match_ptr() here. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: qualcomm: Remove redundant of_match_ptr()Ruan Jinjie
The driver depends on CONFIG_OF, it is not necessary to use of_match_ptr() here. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: gemini: Remove redundant of_match_ptr()Ruan Jinjie
The driver depends on CONFIG_OF, it is not necessary to use of_match_ptr() here. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: dsa: rzn1-a5psw: Remove redundant of_match_ptr()Ruan Jinjie
The driver depends on CONFIG_OF, it is not necessary to use of_match_ptr() here. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: dsa: realtek: Remove redundant of_match_ptr()Ruan Jinjie
The driver depends on CONFIG_OF, it is not necessary to use of_match_ptr() here. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16nfc: virtual_ncidev: Use module_misc_device macro to simplify the codeLi Zetao
Use the module_misc_device macro to simplify the code, which is the same as declaring with module_init() and module_exit(). Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16mailmap: add entries for Simon HormanSimon Horman
Retire some of my email addresses from Kernel activities. Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16Merge tag 'ipsec-2023-08-15' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== 1) Fix a slab-out-of-bounds read in xfrm_address_filter. From Lin Ma. 2) Fix the pfkey sadb_x_filter validation. From Lin Ma. 3) Use the correct nla_policy structure for XFRMA_SEC_CTX. From Lin Ma. 4) Fix warnings triggerable by bad packets in the encap functions. From Herbert Xu. 5) Fix some slab-use-after-free in decode_session6. From Zhengchao Shao. 6) Fix a possible NULL pointer dereference in xfrm_update_ae_params. From Lin Ma. 7) Add a forgotten nla_policy for XFRMA_MTIMER_THRESH. From Lin Ma. 8) Don't leak offloaded policies. From Leon Romanovsky. 9) Delete also the offloading part of an acquire state. From Leon Romanovsky. Please pull or let me know if there are problems.
2023-08-16Merge branch 'hns3-ethtool'David S. Miller
Jijie Shao says: ==================== hns3: refactor registers information for ethtool -d refactor registers information for ethtool -d ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: hns3: fix wrong rpu tln reg issueJijie Shao
In the original RPU query command, the status register values of multiple RPU tunnels are accumulated by default, which is unreasonable. This patch fixes it by querying the specified tunnel ID. The tunnel number of the device can be obtained from firmware during initialization. Fixes: ddb54554fa51 ("net: hns3: add DFX registers information for ethtool -d") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: hns3: Support tlv in regs data for HNS3 VF driverJijie Shao
The dump register function is being refactored. The third step in refactoring is to support tlv info in regs data for the HNS3 VF driver. Currently, if we use "ethtool -d" to dump regs value, the output is as follows: offset1: 00 01 02 03 04 05 ... offset2:10 11 12 13 14 15 ... ...... We can't get the value of a register directly. This patch deletes the original separator information and adds tag_len_value information in regs data. ethtool can parse register data in key-value format by -d command. A patch will be added to ethtool to parse regs data in the following format: reg1 : value1 reg2 : value2 ...... Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: hns3: Support tlv in regs data for HNS3 PF driverJijie Shao
The dump register function is being refactored. The second step in refactoring is to support tlv info in regs data for the HNS3 PF driver. Currently, if we use "ethtool -d" to dump regs value, the output is as follows: offset1: 00 01 02 03 04 05 ... offset2:10 11 12 13 14 15 ... ...... We can't get the value of a register directly. This patch deletes the original separator information and adds tag_len_value information in regs data. ethtool can parse register data in key-value format by -d command. A patch will be added to ethtool to parse regs data in the following format: reg1 : value1 reg2 : value2 ...... Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
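A tag-length-value layout of the kind described above can be sketched in a few lines of Python. The record layout used here (16-bit tag, 16-bit payload length in bytes, then 32-bit little-endian register values) is an assumption made for illustration only; it is not taken from the hns3 driver or from ethtool.

```python
import struct

# Assumed header layout: 16-bit tag, 16-bit payload length in bytes.
HEADER = "<HH"
HEADER_SIZE = struct.calcsize(HEADER)


def encode_tlv(tag, values):
    """Pack one record: header followed by 32-bit register values."""
    payload = struct.pack(f"<{len(values)}I", *values)
    return struct.pack(HEADER, tag, len(payload)) + payload


def decode_tlvs(blob):
    """Walk a buffer of back-to-back records and return (tag, values) pairs."""
    records = []
    offset = 0
    while offset < len(blob):
        tag, length = struct.unpack_from(HEADER, blob, offset)
        offset += HEADER_SIZE
        values = struct.unpack_from(f"<{length // 4}I", blob, offset)
        offset += length
        records.append((tag, list(values)))
    return records
```

With a self-describing format like this, a parser can attribute each value to a named register group instead of guessing from raw offsets, which is the motivation the commit message gives.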
2023-08-16net: hns3: move dump regs function to a separate fileJijie Shao
The dump register function is being refactored. The first step in refactoring is to put the dump regs function into a separate file. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16Merge branch 'fec-XDP_TX'David S. Miller
Wei Fang says: ==================== net: fec: add XDP_TX feature support This patch set adds support for the XDP_TX feature to the FEC driver. The first patch adds initial XDP_TX support, and the second patch improves the performance of XDP_TX by not using xdp_convert_buff_to_frame(). Please refer to the commit message of each patch for more details. ==================== Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16net: fec: improve XDP_TX performanceWei Fang
As suggested by Jesper and Alexander, we can avoid converting xdp_buff to xdp_frame in case of XDP_TX to save a bunch of CPU cycles, so that we can further improve the XDP_TX performance. Before this patch on i.MX8MP-EVK board, the performance shows as follows. root@imx8mpevk:~# ./xdp2 eth0 proto 17: 353918 pkt/s proto 17: 352923 pkt/s proto 17: 353900 pkt/s proto 17: 352672 pkt/s proto 17: 353912 pkt/s proto 17: 354219 pkt/s After applying this patch, the performance is improved. root@imx8mpevk:~# ./xdp2 eth0 proto 17: 369261 pkt/s proto 17: 369267 pkt/s proto 17: 369206 pkt/s proto 17: 369214 pkt/s proto 17: 369126 pkt/s proto 17: 369272 pkt/s Signed-off-by: Wei Fang <wei.fang@nxp.com> Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Suggested-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
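The size of the improvement can be worked out from the two sets of samples quoted above. The short Python calculation below uses only those numbers; the helper name is ours.

```python
from statistics import mean

# Samples copied from the commit message, in packets per second.
before = [353918, 352923, 353900, 352672, 353912, 354219]
after = [369261, 369267, 369206, 369214, 369126, 369272]


def improvement_percent(old, new):
    """Relative change of the mean rate, expressed in percent."""
    return (mean(new) / mean(old) - 1.0) * 100.0


gain = improvement_percent(before, after)
print(f"{mean(before):.0f} -> {mean(after):.0f} pkt/s ({gain:.1f}% faster)")
```

The mean rate rises from roughly 353,600 to roughly 369,200 pkt/s, a gain of about 4.4%.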
2023-08-16net: fec: add XDP_TX feature supportWei Fang
The XDP_TX feature was not supported before, and all the frames which are deemed to do XDP_TX action actually do the XDP_DROP action. So this patch adds the XDP_TX support to FEC driver. I tested the performance of XDP_TX in XDP_DRV mode and XDP_SKB mode respectively on i.MX8MP-EVK platform, and as suggested by Jesper, I also tested the performance of XDP_REDIRECT on the same platform. And the test steps and results are as follows. XDP_TX test: Step 1: One board is used as generator and connects to the switch, and the FEC port of DUT also connects to the switch. Both boards have flow control off. Then the generator runs the pktgen_sample03_burst_single_flow.sh script to generate and send burst traffic to DUT. Note that the size of packet was set to 64 bytes and the protocol of packet was UDP in my test scenario. In addition, the SMAC of the packet needs to be different from the MAC of the generator, because the xdp2 program will swap the DMAC and SMAC of the packet and send it back to the generator. If the SMAC of the generated packet is the MAC of the generator, the generator will receive the returned traffic, which increases the CPU load and significantly degrades the transmit speed of the generator, and finally it affects the test of XDP_TX performance. Step 2: The DUT runs the xdp2 program to transmit received UDP packets back out on the same port where they were received. root@imx8mpevk:~# ./xdp2 eth0 proto 17: 353918 pkt/s proto 17: 352923 pkt/s proto 17: 353900 pkt/s proto 17: 352672 pkt/s proto 17: 353912 pkt/s proto 17: 354219 pkt/s root@imx8mpevk:~# ./xdp2 -S eth0 proto 17: 160604 pkt/s proto 17: 160708 pkt/s proto 17: 160564 pkt/s proto 17: 160684 pkt/s proto 17: 160640 pkt/s proto 17: 160720 pkt/s The above results show that the XDP_TX performance of XDP_DRV mode is much better than XDP_SKB mode, more than twice that of XDP_SKB mode, which is in line with our expectation. 
XDP_REDIRECT test: Step 1: Both the generator and the FEC port of the DUT connect to the switch port. All the ports have flow control off; then the generator runs the pktgen script to generate and send burst traffic to DUT. Note that the size of packet was set to 64 bytes and the protocol of packet was UDP in my test scenario. Step 2: The DUT runs the xdp_redirect program to redirect the traffic from the FEC port to the FEC port itself. root@imx8mpevk:~# ./xdp_redirect eth0 eth0 Redirecting from eth0 (ifindex 2; driver fec) to eth0 (ifindex 2; driver fec) Summary 232,302 rx/s 0 err,drop/s 232,344 xmit/s Summary 234,579 rx/s 0 err,drop/s 234,577 xmit/s Summary 235,548 rx/s 0 err,drop/s 235,549 xmit/s Summary 234,704 rx/s 0 err,drop/s 234,703 xmit/s Summary 235,504 rx/s 0 err,drop/s 235,504 xmit/s Summary 235,223 rx/s 0 err,drop/s 235,224 xmit/s Summary 234,509 rx/s 0 err,drop/s 234,507 xmit/s Summary 235,481 rx/s 0 err,drop/s 235,482 xmit/s Summary 234,684 rx/s 0 err,drop/s 234,683 xmit/s Summary 235,520 rx/s 0 err,drop/s 235,520 xmit/s Summary 235,461 rx/s 0 err,drop/s 235,461 xmit/s Summary 234,627 rx/s 0 err,drop/s 234,627 xmit/s Summary 235,611 rx/s 0 err,drop/s 235,611 xmit/s Packets received : 3,053,753 Average packets/s : 234,904 Packets transmitted : 3,053,792 Average transmit/s : 234,907 Comparing XDP_TX with XDP_REDIRECT, XDP_TX also performs much better than XDP_REDIRECT. This is also in line with our expectation. Signed-off-by: Wei Fang <wei.fang@nxp.com> Suggested-by: Jesper Dangaard Brouer <hawk@kernel.org> Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16broadcom: b44: Use b44_writephy() return valueArtem Chernyshev
Return result of b44_writephy() instead of zero to deal with possible error. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-16selftests: bonding: remove redundant delete action of device link1_1Zhengchao Shao
When the command "ip netns delete client" is run, device link1_1 has already been deleted, so there is no need to delete link1_1 again. Remove the redundant delete. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-15Merge tag 'mlx5-updates-2023-08-14' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-08-14 1) Handle PTP out of order CQEs issue 2) Check FW status before determining reset successful 3) Expose maximum supported SFs via devlink resource 4) MISC cleanups * tag 'mlx5-updates-2023-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Don't query MAX caps twice net/mlx5: Remove unused MAX HCA capabilities net/mlx5: Remove unused CAPs net/mlx5: Fix error message in mlx5_sf_dev_state_change_handler() net/mlx5: Remove redundant check of mlx5_vhca_event_supported() net/mlx5: Use mlx5_sf_start_function_id() helper instead of directly calling MLX5_CAP_GEN() net/mlx5: Remove redundant SF supported check from mlx5_sf_hw_table_init() net/mlx5: Use auxiliary_device_uninit() instead of device_put() net/mlx5: E-switch, Add checking for flow rule destinations net/mlx5: Check with FW that sync reset completed successfully net/mlx5: Expose max possible SFs via devlink resource net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs net/mlx5: Consolidate devlink documentation in devlink/mlx5.rst ==================== Link: https://lore.kernel.org/r/20230814214144.159464-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15Merge branch 'net-warn-about-attempts-to-register-negative-ifindex'Jakub Kicinski
Jakub Kicinski says: ==================== net: warn about attempts to register negative ifindex Follow up to the recently posted fix for OvS lacking input validation: https://lore.kernel.org/all/20230814203840.2908710-1-kuba@kernel.org/ Warn about negative ifindex more explicitly and misc YNL updates. ==================== Link: https://lore.kernel.org/r/20230814205627.2914583-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15tools: ynl: add more info to KeyErrors on missing attrsJakub Kicinski
When developing specs it's useful to know which attr space YNL was trying to find an attribute in on key error. Instead of printing: KeyError: 0 add info about the space: Exception: Space 'vport' has no attribute with value '0' Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20230814205627.2914583-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
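The pattern described here, turning a bare KeyError into a message that names the attribute space, might look roughly like the following. The class and attribute names are hypothetical and are not taken from the YNL sources; only the message text follows the commit.

```python
class AttrSpace:
    """Hypothetical stand-in for an attribute space keyed by numeric value."""

    def __init__(self, name, attrs_by_val):
        self.name = name
        self.attrs_by_val = attrs_by_val

    def lookup(self, value):
        """Return the attribute for `value`, naming the space on failure."""
        try:
            return self.attrs_by_val[value]
        except KeyError:
            # Replace the uninformative "KeyError: 0" with context.
            raise Exception(
                f"Space '{self.name}' has no attribute with value '{value}'"
            ) from None
```

Raising with `from None` suppresses the chained KeyError traceback, so the user sees only the more informative message.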
2023-08-15netlink: specs: add ovs_vport new commandJakub Kicinski
Add NEW to the spec, it was useful testing the fix for OvS input validation. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20230814205627.2914583-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15net: warn about attempts to register negative ifindexJakub Kicinski
Since the xarray changes we mix returning valid ifindex and negative errno in a single int returned from dev_index_reserve(). This depends on the fact that ifindexes can't be negative. Otherwise we may insert into the xarray and return a very large negative value. This in turn may break ERR_PTR(). OvS is susceptible to this problem and lacking validation (fix posted separately for net). Reject negative ifindex explicitly. Add a warning because the input validation is better handled by the caller. Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230814205627.2914583-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15net: openvswitch: reject negative ifindexJakub Kicinski
Recent changes in net-next (commit 759ab1edb56c ("net: store netdevs in an xarray")) refactored the handling of pre-assigned ifindexes and let syzbot surface a latent problem in ovs. ovs does not validate ifindex, making it possible to create netdev ports with negative ifindex values. It's easy to repro with YNL: $ ./cli.py --spec netlink/specs/ovs_datapath.yaml \ --do new \ --json '{"upcall-pid": 1, "name":"my-dp"}' $ ./cli.py --spec netlink/specs/ovs_vport.yaml \ --do new \ --json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}' $ ip link show -65536: some-port0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 7a:48:21:ad:0b:fb brd ff:ff:ff:ff:ff:ff ... Validate the inputs. Now the second command correctly returns: $ ./cli.py --spec netlink/specs/ovs_vport.yaml \ --do new \ --json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}' lib.ynl.NlError: Netlink error: Numerical result out of range nl_len = 108 (92) nl_flags = 0x300 nl_type = 2 error: -34 extack: {'msg': 'integer out of range', 'unknown': [[type:4 len:36] b'\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x03\x00\xff\xff\xff\x7f\x00\x00\x00\x00\x08\x00\x01\x00\x08\x00\x00\x00'], 'bad-attr': '.ifindex'} Accept 0 since it used to be silently ignored. Fixes: 54c4ef34c4b6 ("openvswitch: allow specifying ifindex of new interfaces") Reported-by: syzbot+7456b5dcf65111553320@syzkaller.appspotmail.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20230814203840.2908710-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
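The reproducer passes 4294901760 and `ip link show` reports -65536 because the same 32 bits are read first as unsigned and then as signed. A short Python sketch makes the conversion explicit; the range check is only an illustration of "reject negative, accept 0" as described above, not the kernel's policy code.

```python
import struct

INT32_MAX = 0x7FFFFFFF


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def validate_ifindex(value):
    """Illustrative check: accept 0..INT32_MAX, reject everything else."""
    if not 0 <= value <= INT32_MAX:
        raise ValueError("integer out of range")
    return value


print(as_signed_32(4294901760))  # 0xFFFF0000 read as signed: -65536
```

Any unsigned value above 0x7FFFFFFF has its top bit set and therefore reads as negative once stored in a signed int, which is why the upper bound of the check is INT32_MAX.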
2023-08-15eth: r8152: try to use a normal budgetJakub Kicinski
Mario reports that loading r8152 on his system leads to a: netif_napi_add_weight() called with weight 256 warning getting printed. We don't have any solid data on why such a high budget was chosen, and it may cause stalls in processing other softirqs and rt threads. So try to switch back to the default (64) weight. If this slows down someone's system we should investigate which part of stopping and starting the NAPI poll in this driver is expensive. Reported-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/all/0bfd445a-81f7-f702-08b0-bd5a72095e49@amd.com/ Acked-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20230814153521.2697982-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15net: e1000e: Remove unused declarationsYue Haibing
Commit bdfe2da6aefd ("e1000e: cosmetic move of function prototypes to the new mac.h") declared but never implemented them. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230814135821.4808-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15qed: remove unused 'resp_size' calculationArnd Bergmann
Newer versions of clang warn about this variable being assigned but never used: drivers/net/ethernet/qlogic/qed/qed_vf.c:63:67: error: parameter 'resp_size' set but not used [-Werror,-Wunused-but-set-parameter] There is no indication in the git history on how this was ever meant to be used, so just remove the entire calculation and argument passing for it to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230814074512.1067715-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slavesZiyang Xuan
Similar to commit 01f4fd270870 ("bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves"), we can trigger BUG_ON(!vlan_info) in unregister_vlan_dev() with the following testcase: # ip netns add ns1 # ip netns exec ns1 ip link add team1 type team # ip netns exec ns1 ip link add team_slave type veth peer veth2 # ip netns exec ns1 ip link set team_slave master team1 # ip netns exec ns1 ip link add link team_slave name team_slave.10 type vlan id 10 protocol 802.1ad # ip netns exec ns1 ip link add link team1 name team1.10 type vlan id 10 protocol 802.1ad # ip netns exec ns1 ip link set team_slave nomaster # ip netns del ns1 Add S-VLAN tag related features support to team driver. So the team driver will always propagate the VLAN info to its slaves. Fixes: 8ad227ff89a7 ("net: vlan: add 802.1ad support") Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230814032301.2804971-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15net: phy: mediatek-ge-soc: support PHY LEDsDaniel Golle
Implement netdev trigger and primitive blinking offloading as well as simple set_brightness function for both PHY LEDs of the in-SoC PHYs found in MT7981 and MT7988. For MT7988, read boottrap register and apply LED polarities accordingly to get uniform behavior from all LEDs on MT7988. This requires syscon phandle 'mediatek,pio' present in parenting MDIO bus which should point to the syscon holding the boottrap register. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/dc324d48c00cd7350f3a506eaa785324cae97372.1691977904.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15Merge branch 'nexthop-various-cleanups'Jakub Kicinski
Ido Schimmel says: ==================== nexthop: Various cleanups Benefit from recent bug fixes and simplify the nexthop dump code. No regressions in existing tests: # ./fib_nexthops.sh [...] Tests passed: 234 Tests failed: 0 ==================== Link: https://lore.kernel.org/r/20230813164856.2379822-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15nexthop: Do not increment dump sentinel at the end of the dumpIdo Schimmel
The nexthop and nexthop bucket dump callbacks previously returned a positive return code even when the dump was complete, prompting the core netlink code to invoke the callback again, until returning zero. Zero was only returned by these callbacks when no information was filled in the provided skb, which was achieved by incrementing the dump sentinel at the end of the dump beyond the ID of the last nexthop. This is no longer necessary as when the dump is complete these callbacks return zero. Remove the unnecessary increment. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230813164856.2379822-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15nexthop: Simplify nexthop bucket dumpIdo Schimmel
Before commit f10d3d9df49d ("nexthop: Make nexthop bucket dump more efficient"), rtm_dump_nexthop_bucket_nh() returned a non-zero return code for each resilient nexthop group whose buckets it dumped, regardless if it encountered an error or not. This meant that the sentinel ('dd->ctx->nh.idx') used by the function that walked the different nexthops could not be used as a sentinel for the bucket dump, as otherwise buckets from the same group would be dumped over and over again. This was dealt with by adding another sentinel ('dd->ctx->done_nh_idx') that was incremented by rtm_dump_nexthop_bucket_nh() after successfully dumping all the buckets from a given group. After the previously mentioned commit this sentinel is no longer necessary since the function no longer returns a non-zero return code when successfully dumping all the buckets from a given group. Remove this sentinel and simplify the code. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230813164856.2379822-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
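The role of a dump sentinel can be shown with a simplified model of a resumable dump. This is only a sketch: the function names, the `room` parameter and the return convention (0 when complete, 1 when the caller should call again) are assumptions for illustration and do not mirror the kernel's netlink callback signature.

```python
def dump_nexthops(nexthop_ids, ctx, room):
    """Emit ids starting at the sentinel ctx['idx'] into a buffer of `room` slots.

    Returns (buffer, rc): rc is 0 once everything has been dumped and 1
    when the buffer filled up and the dump must be resumed.
    """
    if room < 1:
        raise ValueError("room must be at least 1")
    buf = []
    for nh_id in sorted(nexthop_ids):
        if nh_id < ctx["idx"]:
            continue  # already dumped in an earlier call
        if len(buf) == room:
            ctx["idx"] = nh_id  # sentinel: resume from this id next time
            return buf, 1
        buf.append(nh_id)
    # Complete: return 0 directly, no need to bump the sentinel past the end.
    return buf, 0


def dump_all(nexthop_ids, room):
    """Drive the callback the way a dump core would, until it reports 0."""
    ctx = {"idx": 0}
    out = []
    while True:
        buf, rc = dump_nexthops(nexthop_ids, ctx, room)
        out.extend(buf)
        if rc == 0:
            return out
```

Because completion is signalled by the return code alone, a single sentinel is enough to resume a partial dump, and nothing has to be incremented once the last entry has been emitted.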
2023-08-15net: phy: broadcom: stub c45 read/write for 54810Justin Chen
The 54810 does not support c45. The mmd_phy_indirect accesses return arbitrary values leading to odd behavior like saying it supports EEE when it doesn't. We also see that reading/writing these non-existent MMD registers leads to phy instability in some cases. Fixes: b14995ac2527 ("net: phy: broadcom: Add BCM54810 PHY entry") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/1691901708-28650-1-git-send-email-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15Merge branch 'seg6-add-next-c-sid-support-for-srv6-end-x-behavior'Jakub Kicinski
Andrea Mayer says: ==================== seg6: add NEXT-C-SID support for SRv6 End.X behavior In the Segment Routing (SR) architecture a list of instructions, called segments, can be added to the packet headers to influence the forwarding and processing of the packets in an SR enabled network. Considering the Segment Routing over IPv6 data plane (SRv6) [1], the segment identifiers (SIDs) are IPv6 addresses (128 bits) and the segment list (SID List) is carried in the Segment Routing Header (SRH). A segment may correspond to a "behavior" that is executed by a node when the packet is received. The Linux kernel currently supports a large subset of the behaviors described in [2] (e.g., End, End.X, End.T and so on). In some SRv6 scenarios, the number of segments carried by the SID List may increase dramatically, reducing the MTU (Maximum Transmission Unit) size and/or limiting the processing power of legacy hardware devices (due to longer IPv6 headers). The NEXT-C-SID mechanism [3] extends the SRv6 architecture by providing several ways to efficiently represent the SID List. By leveraging the NEXT-C-SID, it is possible to encode several SRv6 segments within a single 128-bit SID address (also referenced as Compressed SID Container). In this way, the length of the SID List can be drastically reduced. The NEXT-C-SID mechanism is built upon the "flavors" framework defined in [2]. This framework is already supported by the Linux SRv6 subsystem and is used to modify and/or extend a subset of existing behaviors. In this patchset, we extend the SRv6 End.X behavior in order to support the NEXT-C-SID mechanism. In detail, the patchset is made of: - patch 1/2: add NEXT-C-SID support for SRv6 End.X behavior; - patch 2/2: add selftest for NEXT-C-SID in SRv6 End.X behavior. From the user space perspective, we do not need to change the iproute2 code to support the NEXT-C-SID flavor for the SRv6 End.X behavior. 
However, we will update the man page considering the NEXT-C-SID flavor applied to the SRv6 End.X behavior in a separate patch. [1] - https://datatracker.ietf.org/doc/html/rfc8754 [2] - https://datatracker.ietf.org/doc/html/rfc8986 [3] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression ==================== Link: https://lore.kernel.org/r/20230812180926.16689-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
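How several segments fit inside one 128-bit container can be sketched as a shift operation on the IPv6 destination address. The sketch below assumes a 32-bit locator block and 16-bit C-SIDs, and the example addresses are made up; it covers only the "advance to the next C-SID" step and ignores what happens once the container is exhausted.

```python
import ipaddress


def next_csid(dst, lblen=32, csid_len=16):
    """Advance a C-SID container: keep the locator block, shift the rest left.

    `lblen` and `csid_len` are assumed lengths in bits, not values read
    from any configuration.
    """
    addr = int(ipaddress.IPv6Address(dst))
    rest_bits = 128 - lblen
    rest_mask = (1 << rest_bits) - 1
    block = addr & ~rest_mask          # locator block stays in place
    rest = addr & rest_mask            # packed list of C-SIDs
    rest = (rest << csid_len) & rest_mask  # drop the active C-SID, zero-fill
    return str(ipaddress.IPv6Address(block | rest))


# Container fcbb:bb00 | 0100 | 0200 | 0300 after consuming C-SID 0100:
print(next_csid("fcbb:bb00:100:200:300::"))  # fcbb:bb00:200:300::
```

Each hop consumes one 16-bit C-SID instead of one full 128-bit SID, which is how the SID List shrinks.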
2023-08-15selftests: seg6: add selftest for NEXT-C-SID flavor in SRv6 End.X behaviorPaolo Lungaroni
This selftest is designed for testing the support of NEXT-C-SID flavor for SRv6 End.X behavior. It instantiates a virtual network composed of several nodes: hosts and SRv6 routers. Each node is realized using a network namespace that is properly interconnected to others through veth pairs, according to the topology depicted in the selftest script file. The test considers SRv6 routers implementing IPv4/IPv6 L3 VPNs leveraged by hosts for communicating with each other. Such routers i) apply different SRv6 Policies to the traffic received from connected hosts, considering the IPv4 or IPv6 protocols; ii) use the NEXT-C-SID compression mechanism for encoding several SRv6 segments within a single 128-bit SID address, referred to as a Compressed SID (C-SID) container. The NEXT-C-SID is provided as a "flavor" of the SRv6 End.X behavior, enabling it to properly process the C-SID containers. The correct execution of the enabled NEXT-C-SID SRv6 End.X behavior is verified through reachability tests carried out between hosts belonging to the same VPN. Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it> Co-developed-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230812180926.16689-3-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>