linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-03-26	Merge tag 'net-next-6.15' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Continue Netlink conversions to per-namespace RTNL lock (IPv4 routing, routing rules, routing next hops, ARP ioctls) - Continue extending the use of netdev instance locks. As a driver opt-in protect queue operations and (in due course) ethtool operations with the instance lock and not RTNL lock. - Support collecting TCP timestamps (data submitted, sent, acked) in BPF, allowing for transparent (to the application) and lower overhead tracking of TCP RPC performance. - Tweak existing networking Rx zero-copy infra to support zero-copy Rx via io_uring. - Optimize MPTCP performance in single subflow mode by 29%. - Enable GRO on packets which went thru XDP CPU redirect (were queued for processing on a different CPU). Improving TCP stream performance up to 2x. - Improve performance of contended connect() by 200% by searching for an available 4-tuple under RCU rather than a spin lock. Bring an additional 229% improvement by tweaking hash distribution. - Avoid unconditionally touching sk_tsflags on RX, improving performance under UDP flood by as much as 10%. - Avoid skb_clone() dance in ping_rcv() to improve performance under ping flood. - Avoid FIB lookup in netfilter if socket is available, 20% perf win. - Rework network device creation (in-kernel) API to more clearly identify network namespaces and their roles. There are up to 4 namespace roles but we used to have just 2 netns pointer arguments, interpreted differently based on context. - Use sysfs_break_active_protection() instead of trylock to avoid deadlocks between unregistering objects and sysfs access. - Add a new sysctl and sockopt for capping max retransmit timeout in TCP. - Support masking port and DSCP in routing rule matches. - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST. - Support specifying at what time packet should be sent on AF_XDP sockets. - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users. - Add Netlink YAML spec for WiFi (nl80211) and conntrack. - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols which only need to be exported when IPv6 support is built as a module. - Age FDB entries based on Rx not Tx traffic in VxLAN, similar to normal bridging. - Allow users to specify source port range for GENEVE tunnels. - netconsole: allow attaching kernel release, CPU ID and task name to messages as metadata Driver API: - Continue rework / fixing of Energy Efficient Ethernet (EEE) across the SW layers. Delegate the responsibilities to phylink where possible. Improve its handling in phylib. - Support symmetric OR-XOR RSS hashing algorithm. - Support tracking and preserving IRQ affinity by NAPI itself. - Support loopback mode speed selection for interface selftests. Device drivers: - Remove the IBM LCS driver for s390 - Remove the sb1000 cable modem driver - Add support for SFP module access over SMBus - Add MCTP transport driver for MCTP-over-USB - Enable XDP metadata support in multiple drivers - Ethernet high-speed NICs: - Broadcom (bnxt): - add PCIe TLP Processing Hints (TPH) support for new AMD platforms - support dumping RoCE queue state for debug - opt into instance locking - Intel (100G, ice, idpf): - ice: rework MSI-X IRQ management and distribution - ice: support for E830 devices - iavf: add support for Rx timestamping - iavf: opt into instance locking - nVidia/Mellanox: - mlx4: use page pool memory allocator for Rx - mlx5: support for one PTP device per hardware clock - mlx5: support for 200Gbps per-lane link modes - mlx5: move IPSec policy check after decryption - AMD/Solarflare: - support FW flashing via devlink - Cisco (enic): - use page pool memory allocator for Rx - enable 32, 64 byte CQEs - get max rx/tx ring size from the device - Meta (fbnic): - support flow steering and RSS configuration - report queue stats - support TCP segmentation - support IRQ coalescing - support ring size configuration - Marvell/Cavium: - support AF_XDP - Wangxun: - support for PTP clock and timestamping - Huawei (hibmcge): - checksum offload - add more statistics - Ethernet virtual: - VirtIO net: - aggressively suppress Tx completions, improve perf by 96% with 1 CPU and 55% with 2 CPUs - expose NAPI to IRQ mapping and persist NAPI settings - Google (gve): - support XDP in DQO RDA Queue Format - opt into instance locking - Microsoft vNIC: - support BIG TCP - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - cleanup Tx and Tx clock setting and other link-focused cleanups - enable SGMII and 2500BASEX mode switching for Intel platforms - support Sophgo SG2044 - Broadcom switches (b53): - support for BCM53101 - TI: - iep: add perout configuration support - icssg: support XDP - Cadence (macb): - implement BQL - Xilinx (axinet): - support dynamic IRQ moderation and changing coalescing at runtime - implement BQL - report standard stats - MediaTek: - support phylink managed EEE - Intel: - igc: don't restart the interface on every XDP program change - RealTek (r8169): - support reading registers of internal PHYs directly - increase max jumbo packet size on RTL8125/RTL8126 - Airoha: - support for RISC-V NPU packet processing unit - enable scatter-gather and support MTU up to 9kB - Tehuti (tn40xx): - support cards with TN4010 MAC and an Aquantia AQR105 PHY - Ethernet PHYs: - support for TJA1102S, TJA1121 - dp83tg720: add randomized polling intervals for link detection - dp83822: support changing the transmit amplitude voltage - support for LEDs on 88q2xxx - CAN: - canxl: support Remote Request Substitution bit access - flexcan: add S32G2/S32G3 SoC - WiFi: - remove cooked monitor support - strict mode for better AP testing - basic EPCS support - OMI RX bandwidth reduction support - batman-adv: add support for jumbo frames - WiFi drivers: - RealTek (rtw88): - support RTL8814AE and RTL8814AU - RealTek (rtw89): - switch using wiphy_lock and wiphy_work - add BB context to manipulate two PHY as preparation of MLO - improve BT-coexistence mechanism to play A2DP smoothly - Intel (iwlwifi): - add new iwlmld sub-driver for latest HW/FW combinations - MediaTek (mt76): - preparation for mt7996 Multi-Link Operation (MLO) support - Qualcomm/Atheros (ath12k): - continued work on MLO - Silabs (wfx): - Wake-on-WLAN support - Bluetooth: - add support for skb TX SND/COMPLETION timestamping - hci_core: enable buffer flow control for SCO/eSCO - coredump: log devcd dumps into the monitor - Bluetooth drivers: - intel: add support to configure TX power - nxp: handle bootloader error during cmd5 and cmd7" * tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits) unix: fix up for "apparmor: add fine grained af_unix mediation" mctp: Fix incorrect tx flow invalidation condition in mctp-i2c net: usb: asix: ax88772: Increase phy_name size net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string net: libwx: fix Tx L4 checksum net: libwx: fix Tx descriptor content for some tunnel packets atm: Fix NULL pointer dereference net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus net: phy: aquantia: add essential functions to aqr105 driver net: phy: aquantia: search for firmware-name in fwnode net: phy: aquantia: add probe function to aqr105 for firmware loading net: phy: Add swnode support to mdiobus_scan gve: add XDP DROP and PASS support for DQ gve: update XDP allocation path support RX buffer posting gve: merge packet buffer size fields gve: update GQ RX to use buf_size gve: introduce config-based allocation for XDP gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics ...
2025-02-21	net: Use link/peer netns in newlink() of rtnl_link_ops	Xiao Liang
	Add two helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net() for netns fallback logic. Peer netns falls back to link netns, and link netns falls back to source netns. Convert the use of params->net in netdevice drivers to one of the helper functions for clarity. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-4-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21	rtnetlink: Pack newlink() params into struct	Xiao Liang
	There are 4 net namespaces involved when creating links: - source netns - where the netlink socket resides, - target netns - where to put the device being created, - link netns - netns associated with the device (backend), - peer netns - netns of peer device. Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes in the request. +------------+-------------------+---------+---------+ \| peer netns \| IFLA_LINK_NETNSID \| src_net \| dev_net \| +------------+-------------------+---------+---------+ \| \| absent \| source \| target \| \| absent +-------------------+---------+---------+ \| \| present \| link \| link \| +------------+-------------------+---------+---------+ \| \| absent \| peer \| target \| \| present +-------------------+---------+---------+ \| \| present \| peer \| link \| +------------+-------------------+---------+---------+ When IFLA_LINK_NETNSID is present, the device is created in link netns first and then moved to target netns. This has some side effects, including extra ifindex allocation, ifname validation and link events. These could be avoided if we create it in target netns from the beginning. On the other hand, the meaning of src_net parameter is ambiguous. It varies depending on how parameters are passed. It is the effective link (or peer netns) by design, but some drivers ignore it and use dev_net instead. To provide more netns context for drivers, this patch packs existing newlink() parameters, along with the source netns, link netns and peer netns, into a struct. The old "src_net" is renamed to "net" to avoid confusion with real source netns, and will be deprecated later. The use of src_net are converted to params->net trivially. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-3-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: qualcomm: rmnet: Switch to use hrtimer_setup()	Nam Cao
	hrtimer_setup() takes the callback function pointer as argument and initializes the timer completely. Replace hrtimer_init() and the open coded initialization of hrtimer::function with the new setup mechanism. Patch was created by using Coccinelle. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/4312885d95fc99053b2c285f74cc83a852c03f4a.1738746872.git.namcao@linutronix.de
2024-09-03	netdev_features: convert NETIF_F_LLTX to dev->lltx	Alexander Lobakin
	NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-07	net: annotate writes on dev->mtu from ndo_change_mtu()	Eric Dumazet
	Simon reported that ndo_change_mtu() methods were never updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted in commit 501a90c94510 ("inet: protect against too small mtu values.") We read dev->mtu without holding RTNL in many places, with READ_ONCE() annotations. It is time to take care of ndo_change_mtu() methods to use corresponding WRITE_ONCE() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Simon Horman <horms@kernel.org> Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/ Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-26	rtnetlink: prepare nla_put_iflink() to run under RCU	Eric Dumazet
	We want to be able to run rtnl_fill_ifinfo() under RCU protection instead of RTNL in the future. This patch prepares dev_get_iflink() and nla_put_iflink() to run either with RTNL or RCU held. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-29	net: fill in MODULE_DESCRIPTION()s for Qualcom drivers	Breno Leitao
	W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to the Qualcom rmnet and emac drivers. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-11	net: qualcomm: rmnet: fix global oob in rmnet_policy	Lin Ma
	The variable rmnet_link_ops assign a bigger maxtype which leads to a global out-of-bounds read when parsing the netlink attributes. See bug trace below: ================================================================== BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:386 [inline] BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600 Read of size 1 at addr ffffffff92c438d0 by task syz-executor.6/84207 CPU: 0 PID: 84207 Comm: syz-executor.6 Tainted: G N 6.1.0 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x172/0x475 mm/kasan/report.c:395 kasan_report+0xbb/0x1c0 mm/kasan/report.c:495 validate_nla lib/nlattr.c:386 [inline] __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600 __nla_parse+0x3e/0x50 lib/nlattr.c:697 nla_parse_nested_deprecated include/net/netlink.h:1248 [inline] __rtnl_newlink+0x50a/0x1880 net/core/rtnetlink.c:3485 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3594 rtnetlink_rcv_msg+0x43c/0xd70 net/core/rtnetlink.c:6091 netlink_rcv_skb+0x14f/0x410 net/netlink/af_netlink.c:2540 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x930/0xe50 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0x154/0x190 net/socket.c:734 ____sys_sendmsg+0x6df/0x840 net/socket.c:2482 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fdcf2072359 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fdcf13e3168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fdcf219ff80 RCX: 00007fdcf2072359 RDX: 0000000000000000 RSI: 0000000020000200 RDI: 0000000000000003 RBP: 00007fdcf20bd493 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffbb8d7bdf R14: 00007fdcf13e3300 R15: 0000000000022000 </TASK> The buggy address belongs to the variable: rmnet_policy+0x30/0xe0 The buggy address belongs to the physical page: page:0000000065bdeb3c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x155243 flags: 0x200000000001000(reserved\|node=0\|zone=2) raw: 0200000000001000 ffffea00055490c8 ffffea00055490c8 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffffff92c43780: f9 f9 f9 f9 00 00 00 02 f9 f9 f9 f9 00 00 00 07 ffffffff92c43800: f9 f9 f9 f9 00 00 00 05 f9 f9 f9 f9 06 f9 f9 f9 >ffffffff92c43880: f9 f9 f9 f9 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 ^ ffffffff92c43900: 00 00 00 00 00 00 00 00 07 f9 f9 f9 f9 f9 f9 f9 ffffffff92c43980: 00 00 00 07 f9 f9 f9 f9 00 00 00 05 f9 f9 f9 f9 According to the comment of `nla_parse_nested_deprecated`, the maxtype should be len(destination array) - 1. Hence use `IFLA_RMNET_MAX` here. Fixes: 14452ca3b5ce ("net: qualcomm: rmnet: Export mux_id and flags to netlink") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240110061400.3356108-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-13	net: qualcomm: rmnet: add ethtool support for configuring tx aggregation	Daniele Palmas
	Add support for ETHTOOL_COALESCE_TX_AGGR for configuring the tx aggregation settings. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-13	net: qualcomm: rmnet: add tx packets aggregation	Daniele Palmas
	Add tx packets aggregation. Bidirectional TCP throughput tests through iperf with low-cat Thread-x based modems revelead performance issues both in tx and rx. The Windows driver does not show this issue: inspecting USB packets revealed that the only notable change is the driver enabling tx packets aggregation. Tx packets aggregation is by default disabled and can be enabled by increasing the value of ETHTOOL_A_COALESCE_TX_MAX_AGGR_FRAMES. The maximum aggregated size is by default set to a reasonably low value in order to support the majority of modems. This implementation is based on patches available in Code Aurora repositories (msm kernel) whose main authors are Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Reviewed-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28	net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).	Thomas Gleixner
	Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net: ethernet: rmnet: Replace zero-length array with DECLARE_FLEX_ARRAY() helper	Gustavo A. R. Silva
	Zero-length arrays are deprecated and we are moving towards adopting C99 flexible-array members, instead. So, replace zero-length arrays declarations in anonymous union with the new DECLARE_FLEX_ARRAY() helper macro. This helper allows for flexible-array members in unions. Link: https://github.com/KSPP/linux/issues/193 Link: https://github.com/KSPP/linux/issues/221 Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11	net: add per-cpu storage and net->core_stats	Eric Dumazet
	Before adding yet another possibly contended atomic_long_t, it is time to add per-cpu storage for existing ones: dev->tx_dropped, dev->rx_dropped, and dev->rx_nohandler Because many devices do not have to increment such counters, allocate the per-cpu storage on demand, so that dev_get_stats() does not have to spend considerable time folding zero counters. Note that some drivers have abused these counters which were supposed to be only used by core networking stack. v4: should use per_cpu_ptr() in dev_get_stats() (Jakub) v3: added a READ_ONCE() in netdev_core_stats_alloc() (Paolo) v2: add a missing include (reported by kernel test robot <lkp@intel.com>) Change in netdev_core_stats_alloc() (Jakub) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: jeffreyji <jeffreyji@google.com> Reviewed-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220311051420.2608812-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-21	net: qualcomm: rmnet: Use skb_put_zero() to simplify code	Christophe JAILLET
	Use skb_put_zero() instead of hand-writing it. This saves a few lines of code and is more readable. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-14	ethernet: make use of eth_hw_addr_random() where appropriate	Jakub Kicinski
	Number of drivers use eth_random_addr(netdev->dev_addr) thus writing to netdev->dev_addr directly, and not setting the address type. Switch them to eth_hw_addr_random(). @@ expression netdev; @@ - eth_random_addr(netdev->dev_addr); + eth_hw_addr_random(netdev); Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-21	net: qualcomm: rmnet: fix two pointer math bugs	Dan Carpenter
	We recently changed these two pointers from void pointers to struct pointers and it breaks the pointer math so now the "txphdr" points beyond the end of the buffer. Fixes: 56a967c4f7e5 ("net: qualcomm: rmnet: Remove some unneeded casts") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Trivial conflicts in net/can/isotp.c and tools/testing/selftests/net/mptcp/mptcp_connect.sh scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c to include/linux/ptp_clock_kernel.h in -next so re-apply the fix there. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-16	net: qualcomm: rmnet: Remove some unneeded casts	Subash Abhinov Kasiviswanathan
	Remove the explicit casts in the checksum complement functions and pass the actual protocol specific headers instead. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16	net: qualcomm: rmnet: Allow partial updates of IFLA_FLAGS	Bjorn Andersson
	The idiomatic way to handle the changelink flags/mask pair seems to be allow partial updates of the driver's link flags. In contrast the rmnet driver masks the incoming flags and then use that as the new flags. Change the rmnet driver to follow the common scheme, before the introduction of IFLA_RMNET_FLAGS handling in iproute2 et al. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14	net: qualcomm: rmnet: always expose a few functions	Alex Elder
	A recent change tidied up some conditional code, avoiding the use of some #ifdefs. Unfortunately, if CONFIG_IPV6 was not enabled, it meant that two functions were referenced but never defined. The easiest fix is to just define stubs for these functions if CONFIG_IPV6 is not defined. This will soon be simplified further by some other development in the works... Reported-by: kernel test robot <lkp@intel.com> Fixes: 75db5b07f8c39 ("net: qualcomm: rmnet: eliminate some ifdefs") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14	net: qualcomm: rmnet: don't over-count statistics	Alex Elder
	The purpose of the loop using u64_stats_fetch_*_irq() is to ensure statistics on a given CPU are collected atomically. If one of the statistics values gets updated within the begin/retry window, the loop will run again. Currently the statistics totals are updated inside that window. This means that if the loop ever retries, the statistics for the CPU will be counted more than once. Fix this by taking a snapshot of a CPU's statistics inside the protected window, and then updating the counters with the snapshot values after exiting the loop. (Also add a newline at the end of this file...) Fixes: 192c4b5d48f2a ("net: qualcomm: rmnet: Add support for 64 bit stats") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: IPv6 payload length is simple	Alex Elder
	We don't support any extension headers for IPv6 packets. Extension headers therefore contribute 0 bytes to the payload length. As a result we can just use the IPv6 payload length as the length used to compute the pseudo header checksum for both UDP and TCP messages. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: drop some unary NOTs	Alex Elder
	We compare a payload checksum with a pseudo checksum value for equality in rmnet_map_ipv4_dl_csum_trailer(). Both of those values are computed with a unary NOT (~) operation. The result of the comparison is the same if we omit that NOT for both values. Remove these operations in rmnet_map_ipv6_dl_csum_trailer() also. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: trailer value is a checksum	Alex Elder
	The csum_value field in the rmnet_map_dl_csum_trailer structure is a "real" Internet checksum. It is a 16 bit value, in big endian format, which represents an inverted ones' complement sum over pairs of bytes. Make that clear by changing its type to __sum16. This makes a typecast in rmnet_map_ipv4_dl_csum_trailer() and another in rmnet_map_ipv6_dl_csum_trailer() unnecessary. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: remove unneeded code	Alex Elder
	The previous patch makes rmnet_map_ipv4_dl_csum_trailer() return early with an error if it is determined that the computed checksum for the IP payload does not match what was expected. If the computed checksum does match the expected value, the IP payload (i.e., the transport message), can be considered good. There is no need to do any further processing of the message. This means a big block of code is unnecessary for validating the transport checksum value, and can be removed. Make comparable changes in rmnet_map_ipv6_dl_csum_trailer(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: return earlier for bad checksum	Alex Elder
	In rmnet_map_ipv4_dl_csum_trailer(), if the sum of the trailer checksum and the pseudo checksum is non-zero, checksum validation has failed. We can return an error as soon as we know that. We can do the same thing in rmnet_map_ipv6_dl_csum_trailer(). Add some comments that explain where we're headed. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: show that an intermediate sum is zero	Alex Elder
	This patch simply demonstrates that a checksum value computed when verifying an offloaded transport checksum value for both IPv4 and IPv6 is (normally) 0. It can be squashed into the next patch. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: rearrange some NOTs	Alex Elder
	With the ones' complement arithmetic, the sum of two negated values is equal to the negation of the sum of the two original values [1]. Rearrange the calculation ip6_payload_sum using this property. [1] https://tools.ietf.org/html/rfc1071 Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	net: qualcomm: rmnet: remove some local variables	Alex Elder
	In rmnet_map_ipv4_dl_csum_trailer(), remove the "csum_temp" and "addend" local variables, and simplify a few lines of code. Remove the "csum_temp", "csum_value", "ip6_hdr_csum", and "addend" local variables in rmnet_map_ipv6_dl_csum_trailer(), and simplify a few lines of code there as well. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: avoid unnecessary IPv6 byte-swapping	Alex Elder
	In the previous patch IPv4 download checksum offload code was updated to avoid unnecessary byte swapping, based on properties of the Internet checksum algorithm. This patch makes comparable changes to the IPv6 download checksum offload handling. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: avoid unnecessary byte-swapping	Alex Elder
	Internet checksums are used for IPv4 header checksum, as well as TCP segment and UDP datagram checksums. Such a checksum represents the negated sum of adjacent pairs of bytes, using ones' complement arithmetic. One property of the Internet checkum is byte order independence [1]. Specifically, the sum of byte-swapped pairs is equal to the result of byte swapping the sum of those same pairs when not byte-swapped. So for example if a, b, c, d, y, and z are hexadecimal digits, and PLUS represents ones' complement addition: If: ab PLUS cd = yz Then: ba PLUS dc = zy For this reason, there is no need to swap the order of bytes in the checksum value held in a message header, nor the one in the QMAPv4 trailer, in order to operate on them. In other words, we can determine whether the hardware-computed checksum matches the one in the message header without any byte swaps. (This patch leaves in place all existing type casts.) [1] https://tools.ietf.org/html/rfc1071 Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: clarify a bit of code	Alex Elder
	In rmnet_map_ipv6_dl_csum_trailer() there is an especially involved line of code that determines the ones' complement sum of the IPv6 packet header (in host byte order). Simplify that by storing the result of computing just the header checksum in a local variable, then using that in the original assignment. Use the size of the IPv6 header structure as the number of bytes to checksum, rather than computing the offset to the transport header. And use ip_fast_csum() rather than ipa_compute_csum(), knowing that the size of an IPv6 header (40 bytes) is a multiple of 4 bytes greater than 16. Add some comments to match rmnet_map_ipv4_dl_csum_trailer(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: IPv4 header has zero checksum	Alex Elder
	In rmnet_map_ipv4_dl_csum_trailer(), an illegal checksum subtraction is done, subtracting hdr_csum (in host byte order) from csum_value (in network byte order). Despite being illegal, it generally works, because it turns out the value subtracted is (or should be) always 0, which has the same representation in either byte order. Doing illegal operations is not good form though, so fix this by verifying the IP header checksum early in that function. If its checksum is non-zero, the packet will be bad, so just return an error. This will cause the packet to passed to the IP layer where it can be dropped. Thereafter, there is no need subtract the IP header checksum from the checksum value in the trailer because we know it is zero. Add a comment explaining this. This type of packet error is different from other types, so add a new statistics counter to track this condition. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: simplify rmnet_map_get_csum_field()	Alex Elder
	The checksum fields of the TCP and UDP header structures already have type __sum16. We don't support any other protocol headers, so we can simplify rmnet_map_get_csum_field(), getting rid of the local variable entirely and just returning the appropriate address. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: get rid of some local variables	Alex Elder
	The value passed as an argument to rmnet_map_ipv4_ul_csum_header() is always an IPv4 header. Rather than using a local variable, just have the type of the argument reflect the proper type. In rmnet_map_ipv6_ul_csum_header() things are defined a little differently, but make the same basic change there. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: eliminate some ifdefs	Alex Elder
	If IPV6 is not enabled in the kernel configuration, the RMNet checksum code indicates a buffer containing an IPv6 packet is not supported. The same thing happens if a buffer contains something other than an IPv4 or IPv6 packet. We can rearrange things a bit in two functions so that some #ifdef calls can simply be eliminated. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: qualcomm: rmnet: use ip_is_fragment()	Alex Elder
	In rmnet_map_ipv4_dl_csum_trailer() use ip_is_fragment() to determine whether a socket buffer contains a packet fragment. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09	net: ethernet: rmnet: Always subtract MAP header	Kristian Evensen
	Commit e1d9a90a9bfd ("net: ethernet: rmnet: Support for ingress MAPv5 checksum offload") broke ingress handling for devices where RMNET_FLAGS_INGRESS_MAP_CKSUMV5 or RMNET_FLAGS_INGRESS_MAP_CKSUMV4 are not set. Unless either of these flags are set, the MAP header is not removed. This commit restores the original logic by ensuring that the MAP header is removed for all MAP packets. Fixes: e1d9a90a9bfd ("net: ethernet: rmnet: Support for ingress MAPv5 checksum offload") Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03	net: ethernet: rmnet: Restructure if checks to avoid uninitialized warning	Nathan Chancellor
	Clang warns that proto in rmnet_map_v5_checksum_uplink_packet() might be used uninitialized: drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c:283:14: warning: variable 'proto' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] } else if (skb->protocol == htons(ETH_P_IPV6)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c:295:36: note: uninitialized use occurs here check = rmnet_map_get_csum_field(proto, trans); ^~~~~ drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c:283:10: note: remove the 'if' if its condition is always true } else if (skb->protocol == htons(ETH_P_IPV6)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c:270:11: note: initialize the variable 'proto' to silence this warning u8 proto; ^ = '\0' 1 warning generated. This is technically a false positive because there is an if statement above this one that checks skb->protocol for not being either ETH_P_IP{,V6}. However, it is more obvious to sink that into the if statement as an else branch, which makes the code clearer and fixes the warning. At the same time, move the "IS_ENABLED(CONFIG_IPV6)" into the else if condition so that the else branch of the preprocessor conditional can be shared, since there is no build failure with CONFIG_IPV6 disabled. Fixes: b6e5d27e32ef ("net: ethernet: rmnet: Add support for MAPv5 egress packets") Link: https://github.com/ClangBuiltLinux/linux/issues/1390 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01	net: ethernet: rmnet: Add support for MAPv5 egress packets	Sharath Chandra Vurukala
	Adding support for MAPv5 egress packets. This involves adding the MAPv5 header and setting the csum_valid_required in the checksum header to request HW compute the checksum. Corresponding stats are incremented based on whether the checksum is computed in software or HW. New stat has been added which represents the count of packets whose checksum is calculated by the HW. Signed-off-by: Sharath Chandra Vurukala <sharathv@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01	net: ethernet: rmnet: Support for ingress MAPv5 checksum offload	Sharath Chandra Vurukala
	Adding support for processing of MAPv5 downlink packets. It involves parsing the Mapv5 packet and checking the csum header to know whether the hardware has validated the checksum and is valid or not. Based on the checksum valid bit the corresponding stats are incremented and skb->ip_summed is marked either CHECKSUM_UNNECESSARY or left as CHEKSUM_NONE to let network stack revalidate the checksum and update the respective snmp stats. Current MAPV1 header has been modified, the reserved field in the Mapv1 header is now used for next header indication. Signed-off-by: Sharath Chandra Vurukala <sharathv@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum header	Alex Elder
	Replace the use of C bit-fields in the rmnet_map_ul_csum_header structure with a single two-byte (big endian) structure member, and use masks to encode or get values within it. The content of these fields can be accessed using simple bitwise AND and OR operations on the (host byte order) value of the new structure member. Previously rmnet_map_ipv4_ul_csum_header() would update C bit-field values in host byte order, then forcibly fix their byte order using a combination of byte swap operations and types. Instead, just compute the value that needs to go into the new structure member and save it with a simple byte-order conversion. Make similar simplifications in rmnet_map_ipv6_ul_csum_header(). Finally, in rmnet_map_checksum_uplink_packet() a set of assignments zeroes every field in the upload checksum header. Replace that with a single memset() operation. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum trailer	Alex Elder
	Replace the use of C bit-fields in the rmnet_map_dl_csum_trailer structure with a single one-byte field, using constant field masks to encode or get at embedded values. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	net: qualcomm: rmnet: use masks instead of C bit-fields	Alex Elder
	The actual layout of bits defined in C bit-fields (e.g. int foo : 3) is implementation-defined. Structures defined in <linux/if_rmnet.h> address this by specifying all bit-fields twice, to cover two possible layouts. I think this pattern is repetitive and noisy, and I find the whole notion of compiler "bitfield endianness" to be non-intuitive. Stop using C bit-fields for the command/data flag and the pad length fields in the rmnet_map structure, and define a single-byte flags field instead. Define a mask for the single-bit "command" flag, and another mask for the encoded pad length. The content of both fields can be accessed using a simple bitwise AND operation. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	net: qualcomm: rmnet: kill RMNET_MAP_GET_*() accessor macros	Alex Elder
	The following macros, defined in "rmnet_map.h", assume a socket buffer is provided as an argument without any real indication this is the case. RMNET_MAP_GET_MUX_ID() RMNET_MAP_GET_CD_BIT() RMNET_MAP_GET_PAD() RMNET_MAP_GET_CMD_START() RMNET_MAP_GET_LENGTH() What they hide is pretty trivial accessing of fields in a structure, and it's much clearer to see this if we do these accesses directly. So rather than using these accessor macros, assign a local variable of the map header pointer type to the socket buffer data pointer, and derereference that pointer variable. In "rmnet_map_data.c", use sizeof(object) rather than sizeof(type) in one spot. Also, there's no need to byte swap 0; it's all zeros irrespective of endianness. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	net: qualcomm: rmnet: simplify some byte order logic	Alex Elder
	In rmnet_map_ipv4_ul_csum_header() and rmnet_map_ipv6_ul_csum_header() the offset within a packet at which checksumming should commence is calculated. This calculation involves byte swapping and a forced type conversion that makes it hard to understand. Simplify this by computing the offset in host byte order, then converting the result when assigning it into the header field. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-06	net: qualcomm: rmnet: Fix rx_handler for non-linear skbs	Loic Poulain
	There is no guarantee that rmnet rx_handler is only fed with linear skbs, but current rmnet implementation does not check that, leading to crash in case of non linear skbs processed as linear ones. Fix that by ensuring skb linearization before processing. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Link: https://lore.kernel.org/r/1612428002-12333-2-git-send-email-loic.poulain@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-10	net: qualcomm: rmnet: Update rmnet device MTU based on real device	Subash Abhinov Kasiviswanathan
	Packets sent by rmnet to the real device have variable MAP header lengths based on the data format configured. This patch adds checks to ensure that the real device MTU is sufficient to transmit the MAP packet comprising of the MAP header and the IP packet. This check is enforced when rmnet devices are created and updated and during MTU updates of both the rmnet and real device. Additionally, rmnet devices now have a default MTU configured which accounts for the real device MTU and the headroom based on the data format. Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Tested-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-11-23	net: don't include ethtool.h from netdevice.h	Jakub Kicinski
	linux/netdevice.h is included in very many places, touching any of its dependecies causes large incremental builds. Drop the linux/ethtool.h include, linux/netdevice.h just needs a forward declaration of struct ethtool_ops. Fix all the places which made use of this implicit include. Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>