summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-08-22ipv4: Unmask upper DSCP bits in input route lookupIdo Schimmel
Unmask the upper DSCP bits in input route lookup so that in the future the lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22ipv4: Unmask upper DSCP bits in fib_compute_spec_dst()Ido Schimmel
As explained in commit 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper."), the function is used - for example - to determine the source address for an ICMP reply. If we are responding to a multicast or broadcast packet, the source address is set to the source address that we would use if we were to send a packet to the unicast source of the original packet. This address is determined by performing a FIB lookup and using the preferred source address of the resulting route. Unmask the upper DSCP bits of the DS field of the packet that triggered the reply so that in the future the FIB lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22ipv4: ipmr: Unmask upper DSCP bits in ipmr_rt_fib_lookup()Ido Schimmel
Unmask the upper DSCP bits when calling ipmr_fib_lookup() so that in the future it could perform the FIB lookup according to the full DSCP value. Note that ipmr_fib_lookup() performs a FIB rule lookup (returning the relevant routing table) and that IPv4 multicast FIB rules do not support matching on TOS / DSCP. However, it is still worth unmasking the upper DSCP bits in case support for DSCP matching is ever added. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22netfilter: nft_fib: Unmask upper DSCP bitsIdo Schimmel
In a similar fashion to the iptables rpfilter match, unmask the upper DSCP bits of the DS field of the currently tested packet so that in the future the FIB lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-6-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22netfilter: rpfilter: Unmask upper DSCP bitsIdo Schimmel
The rpfilter match performs a reverse path filter test on a packet by performing a FIB lookup with the source and destination addresses swapped. Unmask the upper DSCP bits of the DS field of the tested packet so that in the future the FIB lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-5-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22ipv4: Unmask upper DSCP bits when constructing the Record Route optionIdo Schimmel
The Record Route IP option records the addresses of the routers that routed the packet. In the case of forwarded packets, the kernel performs a route lookup via fib_lookup() and fills in the preferred source address of the matched route. Unmask the upper DSCP bits when performing the lookup so that in the future the lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22ipv4: Unmask upper DSCP bits in NETLINK_FIB_LOOKUP familyIdo Schimmel
The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB lookup according to user provided parameters and communicate the result back to user space. Unmask the upper DSCP bits of the user-provided DS field before invoking the IPv4 FIB lookup API so that in the future the lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22bpf: Unmask upper DSCP bits in bpf_fib_lookup() helperIdo Schimmel
The helper performs a FIB lookup according to the parameters in the 'params' argument, one of which is 'tos'. According to the test in test_tc_neigh_fib.c, it seems that BPF programs are expected to initialize the 'tos' field to the full 8 bit DS field from the IPv4 header. Unmask the upper DSCP bits before invoking the IPv4 FIB lookup APIs so that in the future the lookup could be performed according to the full DSCP value. No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22Merge branch 'enhance-network-interface-feature-testing'Jakub Kicinski
Abhinav Jain says: ==================== Enhance network interface feature testing This small series includes fixes for creation of veth pairs for networkless kernels & adds tests for turning the different network interface features on and off in selftests/net/netdevice.sh script. Tested using vng and compiles for network as well as networkless kernel. # selftests: net: netdevice.sh # No valid network device found, creating veth pair # PASS: veth0: set interface up # PASS: veth0: set MAC address # XFAIL: veth0: set IP address unsupported for veth* # PASS: veth0: ethtool list features # PASS: veth0: Turned off feature: rx-checksumming # PASS: veth0: Turned on feature: rx-checksumming # PASS: veth0: Restore feature rx-checksumming to initial state on # Actual changes: # tx-checksum-ip-generic: off ... # PASS: veth0: Turned on feature: rx-udp-gro-forwarding # PASS: veth0: Restore feature rx-udp-gro-forwarding to initial state off # Cannot get register dump: Operation not supported # XFAIL: veth0: ethtool dump not supported # PASS: veth0: ethtool stats # PASS: veth0: stop interface ==================== Link: https://patch.msgid.link/20240821171903.118324-1-jain.abhinav177@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22selftests: net: Use XFAIL for operations not supported by the driverAbhinav Jain
Check if veth pair was created and if yes, xfail on setting IP address logging an informational message. Use XFAIL instead of SKIP for unsupported ethtool APIs. Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821171903.118324-4-jain.abhinav177@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22selftests: net: Add on/off checks for non-fixed features of interfaceAbhinav Jain
Implement on/off testing for all non-fixed features via while loop. Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821171903.118324-3-jain.abhinav177@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22selftests: net: Create veth pair for testing in networkless kernelAbhinav Jain
Check if the netdev list is empty and create veth pair to be used for feature on/off testing. Remove the veth pair after testing is complete. Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821171903.118324-2-jain.abhinav177@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22net: atlantic: Avoid warning about potential string truncationSimon Horman
W=1 builds with GCC 14.2.0 warn that: .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=] 278 | snprintf(tc_string, 8, "TC%d ", tc); | ^~ .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254] 278 | snprintf(tc_string, 8, "TC%d ", tc); | ^~~~~~~ .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8 278 | snprintf(tc_string, 8, "TC%d ", tc); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8, the range is 0 - 255. Further, on inspecting the code, it seems that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so the range is actually 0 - 8. So, it seems that the condition that GCC flags will not occur. But, nonetheless, it would be nice if it didn't emit the warning. It seems that this can be achieved by changing the format specifier from %d to %u, in which case I believe GCC recognises an upper bound on the range of tc of 0 - 255. After some experimentation I think this is due to the combination of the use of %u and the type of cfg->tcs (u8). Empirically, updating the type of the tc variable to unsigned int has the same effect. As both of these changes seem to make sense in relation to what the code is actually doing - iterating over unsigned values - do both. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-23Merge tag 'net-6.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Current release - regressions: - virtio_net: avoid crash on resume - move netdev_tx_reset_queue() call before RX napi enable Current release - new code bugs: - net/mlx5e: fix page leak and incorrect header release w/ HW GRO Previous releases - regressions: - udp: fix receiving fraglist GSO packets - tcp: prevent refcount underflow due to concurrent execution of tcp_sk_exit_batch() Previous releases - always broken: - ipv6: fix possible UAF when incrementing error counters on output - ip6: tunnel: prevent merging of packets with different L2 - mptcp: pm: fix IDs not being reusable - bonding: fix potential crashes in IPsec offload handling - Bluetooth: HCI: - MGMT: add error handling to pair_device() to avoid a crash - invert LE State quirk to be opt-out rather then opt-in - fix LE quote calculation - drv: dsa: VLAN fixes for Ocelot driver - drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings - drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192 Misc: - netpoll: do not export netpoll_poll_[disable|enable]() - MAINTAINERS: update the list of networking headers" * tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) s390/iucv: Fix vargs handling in iucv_alloc_device() net: ovs: fix ovs_drop_reasons error net: xilinx: axienet: Fix dangling multicast addresses net: xilinx: axienet: Always disable promiscuous mode MAINTAINERS: Mark JME Network Driver as Odd Fixes MAINTAINERS: Add header files to NETWORKING sections MAINTAINERS: Add limited globs for Networking headers MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS octeontx2-af: Fix CPT AF register offset calculation net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F net: ngbe: Fix phy mode set to external phy netfilter: flowtable: validate vlan header bnxt_en: Fix double DMA unmapping for XDP_REDIRECT ipv6: prevent possible UAF in ip6_xmit() ipv6: fix possible UAF in ip6_finish_output2() ipv6: prevent UAF in ip6_send_skb() netpoll: do not export netpoll_poll_[disable|enable]() selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path udp: fix receiving fraglist GSO packets ...
2024-08-23Merge tag 'kbuild-fixes-v6.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Eliminate the fdtoverlay command duplication in scripts/Makefile.lib - Fix 'make compile_commands.json' for external modules - Ensure scripts/kconfig/merge_config.sh handles missing newlines - Fix some build errors on macOS * tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: fix typos "prequisites" to "prerequisites" Documentation/llvm: turn make command for ccache into code block kbuild: avoid scripts/kallsyms parsing /dev/null treewide: remove unnecessary <linux/version.h> inclusion scripts: kconfig: merge_config: config files: add a trailing newline Makefile: add $(srctree) to dependency of compile_commands.json target kbuild: clean up code duplication in cmd_fdtoverlay
2024-08-22s390/iucv: Fix vargs handling in iucv_alloc_device()Alexandra Winter
iucv_alloc_device() gets a format string and a varying number of arguments. This is incorrectly forwarded by calling dev_set_name() with the format string and a va_list, while dev_set_name() expects also a varying number of arguments. Symptoms: Corrupted iucv device names, which can result in log messages like: sysfs: cannot create duplicate filename '/devices/iucv/hvc_iucv1827699952' Fixes: 4452e8ef8c36 ("s390/iucv: Provide iucv_alloc_device() / iucv_release_device()") Link: https://bugzilla.suse.com/show_bug.cgi?id=1228425 Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20240821091337.3627068-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22net: ovs: fix ovs_drop_reasons errorMenglong Dong
There is something wrong with ovs_drop_reasons. ovs_drop_reasons[0] is "OVS_DROP_LAST_ACTION", but OVS_DROP_LAST_ACTION == __OVS_DROP_REASON + 1, which means that ovs_drop_reasons[1] should be "OVS_DROP_LAST_ACTION". And as Adrian tested, without the patch, adding flow to drop packets results in: drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97) origin: software input port ifindex: 8 timestamp: Tue Aug 20 10:19:17 2024 859853461 nsec protocol: 0x800 length: 98 original length: 98 drop reason: OVS_DROP_ACTION_ERROR With the patch, the same results in: drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97) origin: software input port ifindex: 8 timestamp: Tue Aug 20 10:16:13 2024 475856608 nsec protocol: 0x800 length: 98 original length: 98 drop reason: OVS_DROP_LAST_ACTION Fix this by initializing ovs_drop_reasons with index. Fixes: 9d802da40b7c ("net: openvswitch: add last-action drop reason") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Tested-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Link: https://patch.msgid.link/20240821123252.186305-1-dongml2@chinatelecom.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22Merge tag 'nf-24-08-22' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patch #1 disable BH when collecting stats via hardware offload to ensure concurrent updates from packet path do not result in losing stats. From Sebastian Andrzej Siewior. Patch #2 uses write seqcount to reset counters serialize against reader. Also from Sebastian Andrzej Siewior. Patch #3 ensures vlan header is in place before accessing its fields, according to KMSAN splat triggered by syzbot. * tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: flowtable: validate vlan header netfilter: nft_counter: Synchronize nft_counter_reset() against reader. netfilter: nft_counter: Disable BH in nft_counter_offload_stats(). ==================== Link: https://patch.msgid.link/20240822101842.4234-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22Merge branch 'net-xilinx-axienet-multicast-fixes-and-improvements'Jakub Kicinski
Sean Anderson says: ==================== net: xilinx: axienet: Multicast fixes and improvements [part] ==================== First two patches of the series which are fixes. Link: https://patch.msgid.link/20240822154059.1066595-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22net: xilinx: axienet: Fix dangling multicast addressesSean Anderson
If a multicast address is removed but there are still some multicast addresses, that address would remain programmed into the frame filter. Fix this by explicitly setting the enable bit for each filter. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22net: xilinx: axienet: Always disable promiscuous modeSean Anderson
If promiscuous mode is disabled when there are fewer than four multicast addresses, then it will not be reflected in the hardware. Fix this by always clearing the promiscuous mode flag even when we program multicast addresses. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-23kbuild: fix typos "prequisites" to "prerequisites"Masahiro Yamada
This typo in scripts/Makefile.build has been present for more than 20 years. It was accidentally copy-pasted to other scripts/Makefile.* files. Fix them all. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2024-08-22Merge branch 'maintainers-networking-updates'Paolo Abeni
Simon Horman says: ==================== MAINTAINERS: Networking updates This series includes Networking-related updates to MAINTAINERS. * Patches 1-4 aim to assign header files with "*net*' and '*skbuff*' in their name to Networking-related sections within Maintainers. There are a few such files left over after this patches. I have to sent separate patches to add them to SCSI SUBSYSTEM and NETWORKING DRIVERS (WIRELESS) sections [1][2]. [1] https://lore.kernel.org/linux-scsi/20240816-scsi-mnt-v1-1-439af8b1c28b@kernel.org/ [2] https://lore.kernel.org/linux-wireless/20240816-wifi-mnt-v1-1-3fb3bf5d44aa@kernel.org/ * Patch 5 updates the status of the JME driver to 'Odd Fixes' ==================== Link: https://patch.msgid.link/20240821-net-mnt-v2-0-59a5af38e69d@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22MAINTAINERS: Mark JME Network Driver as Odd FixesSimon Horman
This driver only appears to have received sporadic clean-ups, typically part of some tree-wide activity, and fixes for quite some time. And according to the maintainer, Guo-Fu Tseng, the device has been EOLed for a long time (see Link). Accordingly, it seems appropriate to mark this driver as odd fixes. Cc: Moon Yeounsu <yyyynoom@gmail.com> Cc: Guo-Fu Tseng <cooldavid@cooldavid.org> Link: https://lore.kernel.org/netdev/20240805003139.M94125@cooldavid.org/ Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22MAINTAINERS: Add header files to NETWORKING sectionsSimon Horman
This is part of an effort to assign a section in MAINTAINERS to header files that relate to Networking. In this case the files with "net" or "skbuff" in their name. This patch adds a number of such files to the NETWORKING DRIVERS and NETWORKING [GENERAL] sections. Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22MAINTAINERS: Add limited globs for Networking headersSimon Horman
This aims to add limited globs to improve the coverage of header files in the NETWORKING DRIVERS and NETWORKING [GENERAL] sections. It is done so in a minimal way to exclude overlap with other sections. And so as not to require "X" entries to exclude files otherwise matched by these new globs. While imperfect, due to it's limited nature, this does extend coverage of header files by these sections. And aims to automatically cover new files that seem very likely belong to these sections. The include/linux/netdev* glob (both sections) + Subsumes the entries for: - include/linux/netdevice.h + Extends the sections to cover - include/linux/netdevice_xmit.h - include/linux/netdev_features.h The include/uapi/linux/netdev* globs: (both sections) + Subsumes the entries for: - include/linux/netdevice.h + Extends the sections to cover - include/linux/netdev.h The include/linux/skbuff* glob (NETWORKING [GENERAL] section only): + Subsumes the entry for: - include/linux/skbuff.h + Extends the section to cover - include/linux/skbuff_ref.h A include/uapi/linux/net_* glob was not added to the NETWORKING [GENERAL] section. Although it would subsume the entry for include/uapi/linux/net_namespace.h, which is fine, it would also extend coverage to: - include/uapi/linux/net_dropmon.h, which belongs to the NETWORK DROP MONITOR section - include/uapi/linux/net_tstamp.h which, as per an earlier patch in this series, belongs to the SOCKET TIMESTAMPING section Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING sectionSimon Horman
This is part of an effort to assign a section in MAINTAINERS to header files that relate to Networking. In this case the files with "net" in their name. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22MAINTAINERS: Add sonet.h to ATM section of MAINTAINERSSimon Horman
This is part of an effort to assign a section in MAINTAINERS to header files that relate to Networking. In this case the files with "net" in their name. It seems that sonet.h is included in ATM related source files, and thus that ATM is the most relevant section for these files. Cc: Chas Williams <3chas3@gmail.com> Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22nfp: bpf: Use kmemdup_array instead of kmemdup for multiple allocationYu Jiaoliang
Let the kememdup_array() take care about multiplication and possible overflows. Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821081447.12430-1-yujiaoliang@vivo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22net: airoha: configure hw mac address according to the port idLorenzo Bianconi
GDM1 port on EN7581 SoC is connected to the lan dsa switch. GDM{2,3,4} can be used as wan port connected to an external phy module. Configure hw mac address registers according to the port id. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20240821-airoha-eth-wan-mac-addr-v2-1-8706d0cd6cd5@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22octeontx2-af: Fix CPT AF register offset calculationBharat Bhushan
Some CPT AF registers are per LF and others are global. Translation of PF/VF local LF slot number to actual LF slot number is required only for accessing perf LF registers. CPT AF global registers access do not require any LF slot number. Also, there is no reason CPT PF/VF to know actual lf's register offset. Without this fix microcode loading will fail, VFs cannot be created and hardware is not usable. Fixes: bc35e28af789 ("octeontx2-af: replace cpt slot with lf id on reg write") Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821070558.1020101-1-bbhushan2@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211FSava Jakovljev
The current implementation incorrectly sets the mode bit of the PHY chip. Bit 15 (RTL8211F_LEDCR_MODE) should not be shifted together with the configuration nibble of a LED- it should be set independently of the index of the LED being configured. As a consequence, the RTL8211F LED control is actually operating in Mode A. Fix the error by or-ing final register value to write with a const-value of RTL8211F_LEDCR_MODE, thus setting Mode bit explicitly. Fixes: 17784801d888 ("net: phy: realtek: Add support for PHY LEDs on RTL8211F") Signed-off-by: Sava Jakovljev <savaj@meyersound.com> Reviewed-by: Marek Vasut <marex@denx.de> Link: https://patch.msgid.link/PAWP192MB21287372F30C4E55B6DF6158C38E2@PAWP192MB2128.EURP192.PROD.OUTLOOK.COM Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22selftests: net: add helper for checking if nettest is availableJakub Kicinski
A few tests check if nettest exists in the $PATH before adding $PWD to $PATH and re-checking. They don't discard stderr on the first check (and nettest is built as part of selftests, so it's pretty normal for it to not be available in system $PATH). This leads to output noise: which: no nettest in (/home/virtme/tools/fs/bin:/home/virtme/tools/fs/sbin:/home/virtme/tools/fs/usr/bin:/home/virtme/tools/fs/usr/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin) Add a common helper for the check which does silence stderr. There is another small functional change hiding here, because pmtu.sh and fib_rule_tests.sh used to return from the test case rather than completely exit. Building nettest is not hard, there should be no need to maintain the ability to selectively skip cases in its absence. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20240821012227.1398769-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22net: ngbe: Fix phy mode set to external phyMengyuan Lou
The MAC only has add the TX delay and it can not be modified. MAC and PHY are both set the TX delay cause transmission problems. So just disable TX delay in PHY, when use rgmii to attach to external phy, set PHY_INTERFACE_MODE_RGMII_RXID to phy drivers. And it is does not matter to internal phy. Fixes: bc2426d74aa3 ("net: ngbe: convert phylib to phylink") Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Cc: stable@vger.kernel.org # 6.3+ Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E6759CF1387CF84C+20240820030425.93003-1-mengyuanlou@net-swift.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22netfilter: flowtable: validate vlan headerPablo Neira Ayuso
Ensure there is sufficient room to access the protocol field of the VLAN header, validate it once before the flowtable lookup. ===================================================== BUG: KMSAN: uninit-value in nf_flow_offload_inet_hook+0x45a/0x5f0 net/netfilter/nf_flow_table_inet.c:32 nf_flow_offload_inet_hook+0x45a/0x5f0 net/netfilter/nf_flow_table_inet.c:32 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626 nf_hook_ingress include/linux/netfilter_netdev.h:34 [inline] nf_ingress net/core/dev.c:5440 [inline] Fixes: 4cd91f7c290f ("netfilter: flowtable: add vlan support") Reported-by: syzbot+8407d9bb88cd4c6bf61a@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-22Merge branch 'net-ipv6-ioam6-introduce-tunsrc'Paolo Abeni
Justin Iurman says: ==================== net: ipv6: ioam6: introduce tunsrc This series introduces a new feature called "tunsrc" (just like seg6 already does). v3: - address Jakub's comments v2: - add links to performance result figures (see patch#2 description) - move the ipv6_addr_any() check out of the datapath ==================== Link: https://patch.msgid.link/20240817131818.11834-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22net: ipv6: ioam6: new feature tunsrcJustin Iurman
This patch provides a new feature (i.e., "tunsrc") for the tunnel (i.e., "encap") mode of ioam6. Just like seg6 already does, except it is attached to a route. The "tunsrc" is optional: when not provided (by default), the automatic resolution is applied. Using "tunsrc" when possible has a benefit: performance. See the comparison: - before (= "encap" mode): https://ibb.co/bNCzvf7 - after (= "encap" mode with "tunsrc"): https://ibb.co/PT8L6yq Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22net: ipv6: ioam6: code alignmentJustin Iurman
This patch prepares the next one by correcting the alignment of some lines. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-21selftest: bpf: Remove mssind boundary check in test_tcp_custom_syncookie.c.Kuniyuki Iwashima
Smatch reported a possible off-by-one in tcp_validate_cookie(). However, it's false positive because the possible range of mssind is limited from 0 to 3 by the preceding calculation. mssind = (cookie & (3 << 6)) >> 6; Now, the verifier does not complain without the boundary check. Let's remove the checks. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/bpf/6ae12487-d3f1-488b-9514-af0dac96608f@stanley.mountain/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240821013425.49316-1-kuniyu@amazon.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-21Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-08-20 (ice) This series contains updates to ice driver only. Maciej fixes issues with Rx data path on architectures with PAGE_SIZE >= 8192; correcting page reuse usage and calculations for last offset and truesize. Michal corrects assignment of devlink port number to use PF id. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: use internal pf id instead of function number ice: fix truesize operations for PAGE_SIZE >= 8192 ice: fix ICE_LAST_OFFSET formula ice: fix page reuse when PAGE_SIZE is over 8k ==================== Link: https://patch.msgid.link/20240820215620.1245310-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21tools: ynl: lift an assumption about spec file namePaolo Abeni
Currently the parsing code generator assumes that the yaml specification file name and the main 'name' attribute carried inside correspond, that is the field is the c-name representation of the file basename. The above assumption held true within the current tree, but will be hopefully broken soon by the upcoming net shaper specification. Additionally, it makes the field 'name' itself useless. Lift the assumption, always computing the generated include file name from the generated c file name. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/24da5a3596d814beeb12bd7139a6b4f89756cc19.1724165948.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21Merge branch 'net-xilinx-axienet-add-statistics-support'Jakub Kicinski
Sean Anderson says: ==================== net: xilinx: axienet: Add statistics support Add support for hardware statistics counters (if they are enabled) in the AXI Ethernet driver. Unfortunately, the implementation is complicated a bit since the hardware might only support 32-bit counters. ==================== Link: https://patch.msgid.link/20240820175343.760389-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21net: xilinx: axienet: Add statistics supportSean Anderson
Add support for reading the statistics counters, if they are enabled. The counters may be 64-bit, but we can't detect this statically as there's no ability bit for it and the counters are read-only. Therefore, we assume the counters are 32-bits by default. To ensure we don't miss an overflow, we read all counters at 13-second intervals. This should be often enough to ensure the bytes counters don't wrap at 2.5 Gbit/s. Another complication is that the counters may be reset when the device is reset (depending on configuration). To ensure the counters persist across link up/down (including suspend/resume), we maintain our own versions along with the last counter value we saw. Because we might wait up to 100 ms for the reset to complete, we use a mutex to protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a seqlock to protect readers. We don't bother disabling the refresh work when we detect 64-bit counters. This is because the reset issue requires us to read hw_stat_base and reset_in_progress anyway, which would still require the seqcount. And I don't think skipping the task is worth the extra bookkeeping. We can't use the byte counters for either get_stats64 or get_eth_mac_stats. This is because the byte counters include everything in the frame (destination address to FCS, inclusive). But rtnl_link_stats64 wants bytes excluding the FCS, and ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and length/type). It might be possible to calculate the byte values Linux expects based on the frame counters, but I think it is simpler to use the existing software counters. get_ethtool_stats is implemented for nonstandard statistics. This includes the aforementioned byte counters, VLAN and PFC frame counters, and user-defined (e.g. with custom RTL) counters. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20240820175343.760389-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21net: xilinx: axienet: Report RxRject as rx_droppedSean Anderson
The Receive Frame Rejected interrupt is asserted whenever there was a receive error (bad FCS, bad length, etc.) or whenever the frame was dropped due to a mismatched address. So this is really a combination of rx_otherhost_dropped, rx_length_errors, rx_frame_errors, and rx_crc_errors. Mismatched addresses are common and aren't really errors at all (much like how fragments are normal on half-duplex links). To avoid confusion, report these events as rx_dropped. This better reflects what's going on: the packet was received by the MAC but dropped before being processed. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://patch.msgid.link/20240820175343.760389-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21net: repack struct netdev_queueJakub Kicinski
Adding the NAPI pointer to struct netdev_queue made it grow into another cacheline, even though there was 44 bytes of padding available. The struct was historically grouped as follows: /* read-mostly stuff (align) */ /* ... random control path fields ... */ /* write-mostly stuff (align) */ /* ... 40 byte hole ... */ /* struct dql (align) */ It seems that people want to add control path fields after the read only fields. struct dql looks pretty innocent but it forces its own alignment and nothing indicates that there is a lot of empty space above it. Move dql above the xmit_lock. This shifts the empty space to the end of the struct rather than in the middle of it. Move two example fields there to set an example. Hopefully people will now add new fields at the end of the struct. A lot of the read-only stuff is also control path-only, but if we move it all we'll have another hole in the middle. Before: /* size: 384, cachelines: 6, members: 16 */ /* sum members: 284, holes: 3, sum holes: 100 */ After: /* size: 320, cachelines: 5, members: 16 */ /* sum members: 284, holes: 1, sum holes: 8 */ Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240820205119.1321322-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21bnxt_en: Fix double DMA unmapping for XDP_REDIRECTSomnath Kotur
Remove the dma_unmap_page_attrs() call in the driver's XDP_REDIRECT code path. This should have been removed when we let the page pool handle the DMA mapping. This bug causes the warning: WARNING: CPU: 7 PID: 59 at drivers/iommu/dma-iommu.c:1198 iommu_dma_unmap_page+0xd5/0x100 CPU: 7 PID: 59 Comm: ksoftirqd/7 Tainted: G W 6.8.0-1010-gcp #11-Ubuntu Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 2.15.2 04/02/2024 RIP: 0010:iommu_dma_unmap_page+0xd5/0x100 Code: 89 ee 48 89 df e8 cb f2 69 ff 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 e9 ab 17 71 00 <0f> 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 RSP: 0018:ffffab1fc0597a48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff99ff838280c8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffab1fc0597a78 R08: 0000000000000002 R09: ffffab1fc0597c1c R10: ffffab1fc0597cd3 R11: ffff99ffe375acd8 R12: 00000000e65b9000 R13: 0000000000000050 R14: 0000000000001000 R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffff9a06efb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000565c34c37210 CR3: 00000005c7e3e000 CR4: 0000000000350ef0 ? show_regs+0x6d/0x80 ? __warn+0x89/0x150 ? iommu_dma_unmap_page+0xd5/0x100 ? report_bug+0x16a/0x190 ? handle_bug+0x51/0xa0 ? exc_invalid_op+0x18/0x80 ? iommu_dma_unmap_page+0xd5/0x100 ? iommu_dma_unmap_page+0x35/0x100 dma_unmap_page_attrs+0x55/0x220 ? bpf_prog_4d7e87c0d30db711_xdp_dispatcher+0x64/0x9f bnxt_rx_xdp+0x237/0x520 [bnxt_en] bnxt_rx_pkt+0x640/0xdd0 [bnxt_en] __bnxt_poll_work+0x1a1/0x3d0 [bnxt_en] bnxt_poll+0xaa/0x1e0 [bnxt_en] __napi_poll+0x33/0x1e0 net_rx_action+0x18a/0x2f0 Fixes: 578fcfd26e2a ("bnxt_en: Let the page pool manage the DMA mapping") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240820203415.168178-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21Merge branch 'ipv6-fix-possible-uaf-in-output-paths'Jakub Kicinski
Eric Dumazet says: ==================== ipv6: fix possible UAF in output paths First patch fixes an issue spotted by syzbot, and the two other patches fix error paths after skb_expand_head() adoption. ==================== Link: https://patch.msgid.link/20240820160859.3786976-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21ipv6: prevent possible UAF in ip6_xmit()Eric Dumazet
If skb_expand_head() returns NULL, skb has been freed and the associated dst/idev could also have been freed. We must use rcu_read_lock() to prevent a possible UAF. Fixes: 0c9f227bee11 ("ipv6: use skb_expand_head in ip6_xmit") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vasily Averin <vasily.averin@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240820160859.3786976-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21ipv6: fix possible UAF in ip6_finish_output2()Eric Dumazet
If skb_expand_head() returns NULL, skb has been freed and associated dst/idev could also have been freed. We need to hold rcu_read_lock() to make sure the dst and associated idev are alive. Fixes: 5796015fa968 ("ipv6: allocate enough headroom in ip6_finish_output2()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vasily Averin <vasily.averin@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240820160859.3786976-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21ipv6: prevent UAF in ip6_send_skb()Eric Dumazet
syzbot reported an UAF in ip6_send_skb() [1] After ip6_local_out() has returned, we no longer can safely dereference rt, unless we hold rcu_read_lock(). A similar issue has been fixed in commit a688caa34beb ("ipv6: take rcu lock in rawv6_send_hdrinc()") Another potential issue in ip6_finish_output2() is handled in a separate patch. [1] BUG: KASAN: slab-use-after-free in ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964 Read of size 8 at addr ffff88806dde4858 by task syz.1.380/6530 CPU: 1 UID: 0 PID: 6530 Comm: syz.1.380 Not tainted 6.11.0-rc3-syzkaller-00306-gdf6cbc62cc9b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964 rawv6_push_pending_frames+0x75c/0x9e0 net/ipv6/raw.c:588 rawv6_sendmsg+0x19c7/0x23c0 net/ipv6/raw.c:926 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 sock_write_iter+0x2dd/0x400 net/socket.c:1160 do_iter_readv_writev+0x60a/0x890 vfs_writev+0x37c/0xbb0 fs/read_write.c:971 do_writev+0x1b1/0x350 fs/read_write.c:1018 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f936bf79e79 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f936cd7f038 EFLAGS: 00000246 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 00007f936c115f80 RCX: 00007f936bf79e79 RDX: 0000000000000001 RSI: 0000000020000040 RDI: 0000000000000004 RBP: 00007f936bfe7916 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f936c115f80 R15: 00007fff2860a7a8 </TASK> Allocated by task 6530: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:312 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3988 [inline] slab_alloc_node mm/slub.c:4037 [inline] kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4044 dst_alloc+0x12b/0x190 net/core/dst.c:89 ip6_blackhole_route+0x59/0x340 net/ipv6/route.c:2670 make_blackhole net/xfrm/xfrm_policy.c:3120 [inline] xfrm_lookup_route+0xd1/0x1c0 net/xfrm/xfrm_policy.c:3313 ip6_dst_lookup_flow+0x13e/0x180 net/ipv6/ip6_output.c:1257 rawv6_sendmsg+0x1283/0x23c0 net/ipv6/raw.c:898 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597 ___sys_sendmsg net/socket.c:2651 [inline] __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 45: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256 kasan_slab_free include/linux/kasan.h:184 [inline] slab_free_hook mm/slub.c:2252 [inline] slab_free mm/slub.c:4473 [inline] kmem_cache_free+0x145/0x350 mm/slub.c:4548 dst_destroy+0x2ac/0x460 net/core/dst.c:124 rcu_do_batch kernel/rcu/tree.c:2569 [inline] rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2843 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637 irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 Last potentially related work creation: kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541 __call_rcu_common kernel/rcu/tree.c:3106 [inline] call_rcu+0x167/0xa70 kernel/rcu/tree.c:3210 refdst_drop include/net/dst.h:263 [inline] skb_dst_drop include/net/dst.h:275 [inline] nf_ct_frag6_queue net/ipv6/netfilter/nf_conntrack_reasm.c:306 [inline] nf_ct_frag6_gather+0xb9a/0x2080 net/ipv6/netfilter/nf_conntrack_reasm.c:485 ipv6_defrag+0x2c8/0x3c0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:67 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] __ip6_local_out+0x6fa/0x800 net/ipv6/output_core.c:143 ip6_local_out+0x26/0x70 net/ipv6/output_core.c:153 ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1959 rawv6_push_pending_frames+0x75c/0x9e0 net/ipv6/raw.c:588 rawv6_sendmsg+0x19c7/0x23c0 net/ipv6/raw.c:926 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 sock_write_iter+0x2dd/0x400 net/socket.c:1160 do_iter_readv_writev+0x60a/0x890 Fixes: 0625491493d9 ("ipv6: ip6_push_pending_frames() should increment IPSTATS_MIB_OUTDISCARDS") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240820160859.3786976-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>