summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-07net: sched: move rtm_tca_policy declaration to include fileEric Dumazet
rtm_tca_policy is used from net/sched/sch_api.c and net/sched/cls_api.c, thus should be declared in an include file. This fixes the following sparse warning: net/sched/sch_api.c:1434:25: warning: symbol 'rtm_tca_policy' was not declared. Should it be static? Fixes: e331473fee3d ("net/sched: cls_api: add missing validation of netlink attributes") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07tcp: fix formatting in sysctl_net_ipv4.cDavid Morley
Fix incorrectly formatted tcp_syn_linear_timeouts sysctl in the ipv4_net_table. Fixes: ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear") Signed-off-by: David Morley <morleyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Tested-by: David Morley <morleyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07ice: make writes to /dev/gnssX synchronousMichal Schmidt
The current ice driver's GNSS write implementation buffers writes and works through them asynchronously in a kthread. That's bad because: - The GNSS write_raw operation is supposed to be synchronous[1][2]. - There is no upper bound on the number of pending writes. Userspace can submit writes much faster than the driver can process, consuming unlimited amounts of kernel memory. A patch that's currently on review[3] ("[v3,net] ice: Write all GNSS buffers instead of first one") would add one more problem: - The possibility of waiting for a very long time to flush the write work when doing rmmod, softlockups. To fix these issues, simplify the implementation: Drop the buffering, the write_work, and make the writes synchronous. I tested this with gpsd and ubxtool. [1] https://events19.linuxfoundation.org/wp-content/uploads/2017/12/The-GNSS-Subsystem-Johan-Hovold-Hovold-Consulting-AB.pdf "User interface" slide. [2] A comment in drivers/gnss/core.c:gnss_write(): /* Ignoring O_NONBLOCK, write_raw() is synchronous. */ [3] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20230217120541.16745-1-karol.kolacinski@intel.com/ Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07net: sched: add rcu annotations around qdisc->qdisc_sleepingEric Dumazet
syzbot reported a race around qdisc->qdisc_sleeping [1] It is time we add proper annotations to reads and writes to/from qdisc->qdisc_sleeping. [1] BUG: KCSAN: data-race in dev_graft_qdisc / qdisc_lookup_rcu read to 0xffff8881286fc618 of 8 bytes by task 6928 on cpu 1: qdisc_lookup_rcu+0x192/0x2c0 net/sched/sch_api.c:331 __tcf_qdisc_find+0x74/0x3c0 net/sched/cls_api.c:1174 tc_get_tfilter+0x18f/0x990 net/sched/cls_api.c:2547 rtnetlink_rcv_msg+0x7af/0x8c0 net/core/rtnetlink.c:6386 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd write to 0xffff8881286fc618 of 8 bytes by task 6912 on cpu 0: dev_graft_qdisc+0x4f/0x80 net/sched/sch_generic.c:1115 qdisc_graft+0x7d0/0xb60 net/sched/sch_api.c:1103 tc_modify_qdisc+0x712/0xf10 net/sched/sch_api.c:1693 rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6395 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 6912 Comm: syz-executor.5 Not tainted 6.4.0-rc3-syzkaller-00190-g0d85b27b0cc6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023 Fixes: 3a7d0d07a386 ("net: sched: extend Qdisc with rcu") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@nvidia.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07net: dsa: ocelot: unlock on error in vsc9959_qos_port_tas_set()Dan Carpenter
This error path needs call mutex_unlock(&ocelot->tas_lock) before returning. Fixes: 2d800bc500fb ("net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07Merge branch 'rfs-lockless-annotate'David S. Miller
Eric Dumazet says: ==================== rfs: annotate lockless accesses rfs runs without locks held, so we should annotate read and writes to shared variables. It should prevent compilers forcing writes in the following situation: if (var != val) var = val; A compiler could indeed simply avoid the conditional: var = val; This matters if var is shared between many cpus. v2: aligns one closing bracket (Simon) adds Fixes: tags (Jakub) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07rfs: annotate lockless accesses to RFS sock flow tableEric Dumazet
Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table. This also prevents a (smart ?) compiler to remove the condition in: if (table->ents[index] != newval) table->ents[index] = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07rfs: annotate lockless accesses to sk->sk_rxhashEric Dumazet
Add READ_ONCE()/WRITE_ONCE() on accesses to sk->sk_rxhash. This also prevents a (smart ?) compiler to remove the condition in: if (sk->sk_rxhash != newval) sk->sk_rxhash = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07Merge branch 'realtek-external-phy-clock'David S. Miller
Detlev Casanova says: ==================== net: phy: realtek: Support external PHY clock Some PHYs can use an external clock that must be enabled before communicating with them. Changes since v3: * Do not call genphy_suspend if WoL is enabled. Changes since v2: * Reword documentation commit message Changes since v1: * Remove the clock name as it is not guaranteed to be identical across different PHYs * Disable/Enable the clock when suspending/resuming ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07net: phy: realtek: Disable clock on suspendDetlev Casanova
For PHYs that call rtl821x_probe() where an external clock can be configured, make sure that the clock is disabled when ->suspend() is called and enabled on resume. The PHY_ALWAYS_CALL_SUSPEND is added to ensure that the suspend function is actually always called. Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07dt-bindings: net: phy: Document support for external PHY clkDetlev Casanova
Ethern PHYs can have external an clock that needs to be activated before communicating with the PHY. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07net: phy: realtek: Add optional external PHY clockDetlev Casanova
In some cases, the PHY can use an external clock source instead of a crystal. Add an optional clock in the phy node to make sure that the clock source is enabled, if specified, before probing. Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07hv_netvsc: Allocate rx indirection table size dynamicallyShradha Gupta
Allocate the size of rx indirection table dynamically in netvsc from the value of size provided by OID_GEN_RECEIVE_SCALE_CAPABILITIES query instead of using a constant value of ITAB_NUM. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Tested-on: Ubuntu22 (azure VM, SKU size: Standard_F72s_v2) Testcases: 1. ethtool -x eth0 output 2. LISA testcase:PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic 3. LISA testcase:PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-06tipc: replace open-code bearer rcu_dereference access in bearer.cXin Long
Replace these open-code bearer rcu_dereference access with bearer_get(), like other places in bearer.c. While at it, also use tipc_net() instead of net_generic(net, tipc_net_id) to get "tn" in bearer.c. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Link: https://lore.kernel.org/r/1072588a8691f970bda950c7e2834d1f2983f58e.1685976044.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06Merge tag 'for-net-2023-06-05' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fixes to debugfs registration - Fix use-after-free in hci_remove_ltk/hci_remove_irk - Fixes to ISO channel support - Fix missing checks for invalid L2CAP DCID - Fix l2cap_disconnect_req deadlock - Add lock to protect HCI_UNREGISTER * tag 'for-net-2023-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Add missing checks for invalid DCID Bluetooth: ISO: use correct CIS order in Set CIG Parameters event Bluetooth: ISO: don't try to remove CIG if there are bound CIS left Bluetooth: Fix l2cap_disconnect_req deadlock Bluetooth: hci_qca: fix debugfs registration Bluetooth: fix debugfs registration Bluetooth: hci_sync: add lock to protect HCI_UNREGISTER Bluetooth: Fix use-after-free in hci_remove_ltk/hci_remove_irk Bluetooth: ISO: Fix CIG auto-allocation to select configurable CIG Bluetooth: ISO: consider right CIS when removing CIG at cleanup ==================== Link: https://lore.kernel.org/r/20230606003454.2392552-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06Merge tag 'nf-23-06-07' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Missing nul-check in basechain hook netlink dump path, from Gavrilov Ilia. 2) Fix bitwise register tracking, from Jeremy Sowden. 3) Null pointer dereference when accessing conntrack helper, from Tijs Van Buggenhout. 4) Add schedule point to ipset's call_ad, from Kuniyuki Iwashima. 5) Incorrect boundary check when building chain blob. * tag 'nf-23-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: out-of-bound check in chain blob netfilter: ipset: Add schedule point in call_ad(). netfilter: conntrack: fix NULL pointer dereference in nf_confirm_cthelper netfilter: nft_bitwise: fix register tracking netfilter: nf_tables: Add null check for nla_nest_start_noflag() in nft_dump_basechain_hook() ==================== Link: https://lore.kernel.org/r/20230606225851.67394-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06Merge tag 'wireless-2023-06-06' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.4 Both rtw88 and rtw89 have a 802.11 powersave fix for a regression introduced in v6.0. mt76 fixes a race and a null pointer dereference. iwlwifi fixes an issue where not enough memory was allocated for a firmware event. And finally the stack has several smaller fixes all over. * tag 'wireless-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: fix locking in regulatory disconnect wifi: cfg80211: fix locking in sched scan stop work wifi: iwlwifi: mvm: Fix -Warray-bounds bug in iwl_mvm_wait_d3_notif() wifi: mac80211: fix switch count in EMA beacons wifi: mac80211: don't translate beacon/presp addrs wifi: mac80211: mlme: fix non-inheritence element wifi: cfg80211: reject bad AP MLD address wifi: mac80211: use correct iftype HE cap wifi: mt76: mt7996: fix possible NULL pointer dereference in mt7996_mac_write_txwi() wifi: rtw89: remove redundant check of entering LPS wifi: rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS wifi: rtw88: correct PS calculation for SUPPORTS_DYNAMIC_PS wifi: mt76: mt7615: fix possible race in mt7615_mac_sta_poll ==================== Link: https://lore.kernel.org/r/20230606150817.EC133C433D2@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06Merge branch 'ipv4-remove-rt_conn_flags-calls-in-flowi4_init_output'Jakub Kicinski
Guillaume Nault says: ==================== ipv4: Remove RT_CONN_FLAGS() calls in flowi4_init_output(). Remove a few RT_CONN_FLAGS() calls used inside flowi4_init_output(). These users can be easily converted to set the scope properly, instead of overloading the tos parameter with scope information as done by RT_CONN_FLAGS(). The objective is to eventually remove RT_CONN_FLAGS() entirely, which will then allow to also remove RTO_ONLINK and to finally convert ->flowi4_tos to dscp_t. ==================== Link: https://lore.kernel.org/r/cover.1685999117.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06tcp: Set route scope properly in cookie_v4_check().Guillaume Nault
RT_CONN_FLAGS(sk) overloads flowi4_tos with the RTO_ONLINK bit when sk has the SOCK_LOCALROUTE flag set. This allows ip_route_output_key_hash() to eventually adjust flowi4_scope. Instead of relying on special handling of the RTO_ONLINK bit, we can just set the route scope correctly. This will eventually allow to avoid special interpretation of tos variables and to convert ->flowi4_tos to dscp_t. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06ipv4: Set correct scope in inet_csk_route_*().Guillaume Nault
RT_CONN_FLAGS(sk) overloads the tos parameter with the RTO_ONLINK bit when sk has the SOCK_LOCALROUTE flag set. This is only useful for ip_route_output_key_hash() to eventually adjust the route scope. Let's drop RTO_ONLINK and set the correct scope directly to avoid this special case in the future and to allow converting ->flowi4_tos to dscp_t. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06virtio_net: use control_buf for coalesce paramsBrett Creeley
Commit 699b045a8e43 ("net: virtio_net: notifications coalescing support") added coalescing command support for virtio_net. However, the coalesce commands are using buffers on the stack, which is causing the device to see DMA errors. There should also be a complaint from check_for_stack() in debug_dma_map_xyz(). Fix this by adding and using coalesce params from the control_buf struct, which aligns with other commands. Cc: stable@vger.kernel.org Fixes: 699b045a8e43 ("net: virtio_net: notifications coalescing support") Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Allen Hubbe <allen.hubbe@amd.com> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20230605195925.51625-1-brett.creeley@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06pds_core: Fix FW recovery detectionBrett Creeley
Commit 523847df1b37 ("pds_core: add devcmd device interfaces") included initial support for FW recovery detection. Unfortunately, the ordering in pdsc_is_fw_good() was incorrect, which was causing FW recovery to be undetected by the driver. Fix this by making sure to update the cached fw_status by calling pdsc_is_fw_running() before setting the local FW gen. Fixes: 523847df1b37 ("pds_core: add devcmd device interfaces") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230605195116.49653-1-brett.creeley@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06Merge branch 'move-ksz9477-errata-handling-to-phy-driver'Jakub Kicinski
Robert Hancock says: ==================== Move KSZ9477 errata handling to PHY driver Patches to move handling for KSZ9477 PHY errata register fixes from the DSA switch driver into the corresponding PHY driver, for more proper layering and ordering. ==================== Link: https://lore.kernel.org/r/20230605153943.1060444-1-robert.hancock@calian.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06net: dsa: microchip: remove KSZ9477 PHY errata handlingRobert Hancock
The KSZ9477 PHY errata handling code has now been moved into the Micrel PHY driver, so it is no longer needed inside the DSA switch driver. Remove it. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06net: phy: micrel: Move KSZ9477 errata fixes to PHY driverRobert Hancock
The ksz9477 DSA switch driver is currently updating some MMD registers on the internal port PHYs to address some chip errata. However, these errata are really a property of the PHY itself, not the switch they are part of, so this is kind of a layering violation. It makes more sense for these writes to be done inside the driver which binds to the PHY and not the driver for the containing device. This also addresses some issues where the ordering of when these writes are done may have been incorrect, causing the link to erratically fail to come up at the proper speed or at all. Doing this in the PHY driver during config_init ensures that they happen before anything else tries to change the state of the PHY on the port. The new code also ensures that autonegotiation is disabled during the register writes and re-enabled afterwards, as indicated by the latest version of the errata documentation from Microchip. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06tcp: gso: really support BIG TCPEric Dumazet
We missed that tcp_gso_segment() was assuming skb->len was smaller than 65535 : oldlen = (u16)~skb->len; This part came with commit 0718bcc09b35 ("[NET]: Fix CHECKSUM_HW GSO problems.") This leads to wrong TCP checksum. Adapt the code to accept arbitrary packet length. v2: - use two csum_add() instead of csum_fold() (Alexander Duyck) - Change delta type to __wsum to reduce casts (Alexander Duyck) Fixes: 09f3d1a3a52c ("ipv6/gso: remove temporary HBH/jumbo header") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230605161647.3624428-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06ipv6: rpl: Fix Route of Death.Kuniyuki Iwashima
A remote DoS vulnerability of RPL Source Routing is assigned CVE-2023-2156. The Source Routing Header (SRH) has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Next Header | Hdr Ext Len | Routing Type | Segments Left | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | CmprI | CmprE | Pad | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | . . . Addresses[1..n] . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The originator of an SRH places the first hop's IPv6 address in the IPv6 header's IPv6 Destination Address and the second hop's IPv6 address as the first address in Addresses[1..n]. The CmprI and CmprE fields indicate the number of prefix octets that are shared with the IPv6 Destination Address. When CmprI or CmprE is not 0, Addresses[1..n] are compressed as follows: 1..n-1 : (16 - CmprI) bytes n : (16 - CmprE) bytes Segments Left indicates the number of route segments remaining. When the value is not zero, the SRH is forwarded to the next hop. Its address is extracted from Addresses[n - Segment Left + 1] and swapped with IPv6 Destination Address. When Segment Left is greater than or equal to 2, the size of SRH is not changed because Addresses[1..n-1] are decompressed and recompressed with CmprI. OTOH, when Segment Left changes from 1 to 0, the new SRH could have a different size because Addresses[1..n-1] are decompressed with CmprI and recompressed with CmprE. Let's say CmprI is 15 and CmprE is 0. When we receive SRH with Segment Left >= 2, Addresses[1..n-1] have 1 byte for each, and Addresses[n] has 16 bytes. When Segment Left is 1, Addresses[1..n-1] is decompressed to 16 bytes and not recompressed. Finally, the new SRH will need more room in the header, and the size is (16 - 1) * (n - 1) bytes. Here the max value of n is 255 as Segment Left is u8, so in the worst case, we have to allocate 3825 bytes in the skb headroom. However, now we only allocate a small fixed buffer that is IPV6_RPL_SRH_WORST_SWAP_SIZE (16 + 7 bytes). If the decompressed size overflows the room, skb_push() hits BUG() below [0]. Instead of allocating the fixed buffer for every packet, let's allocate enough headroom only when we receive SRH with Segment Left 1. [0]: skbuff: skb_under_panic: text:ffffffff81c9f6e2 len:576 put:576 head:ffff8880070b5180 data:ffff8880070b4fb0 tail:0x70 end:0x140 dev:lo kernel BUG at net/core/skbuff.c:200! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 154 Comm: python3 Not tainted 6.4.0-rc4-00190-gc308e9ec0047 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:skb_panic (net/core/skbuff.c:200) Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 80 6e 77 82 e8 ad 8b 60 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90000003da0 EFLAGS: 00000246 RAX: 0000000000000085 RBX: ffff8880058a6600 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88807dc1c540 RDI: ffff88807dc1c540 RBP: ffffc90000003e48 R08: ffffffff82b392c8 R09: 00000000ffffdfff R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888005b1c800 R13: ffff8880070b51b8 R14: ffff888005b1ca18 R15: ffff8880070b5190 FS: 00007f4539f0b740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055670baf3000 CR3: 0000000005b0e000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <IRQ> skb_push (net/core/skbuff.c:210) ipv6_rthdr_rcv (./include/linux/skbuff.h:2880 net/ipv6/exthdrs.c:634 net/ipv6/exthdrs.c:718) ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5)) ip6_input_finish (./include/linux/rcupdate.h:805 net/ipv6/ip6_input.c:483) __netif_receive_skb_one_core (net/core/dev.c:5494) process_backlog (./include/linux/rcupdate.h:805 net/core/dev.c:5934) __napi_poll (net/core/dev.c:6496) net_rx_action (net/core/dev.c:6565 net/core/dev.c:6696) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572) do_softirq (kernel/softirq.c:472 kernel/softirq.c:459) </IRQ> <TASK> __local_bh_enable_ip (kernel/softirq.c:396) __dev_queue_xmit (net/core/dev.c:4272) ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:134) rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914) sock_sendmsg (net/socket.c:724 net/socket.c:747) __sys_sendto (net/socket.c:2144) __x64_sys_sendto (net/socket.c:2156 net/socket.c:2152 net/socket.c:2152) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7f453a138aea Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffcc212a1c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007ffcc212a288 RCX: 00007f453a138aea RDX: 0000000000000060 RSI: 00007f4539084c20 RDI: 0000000000000003 RBP: 00007f4538308e80 R08: 00007ffcc212a300 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f4539712d1b </TASK> Modules linked in: Fixes: 8610c7c6e3bd ("net: ipv6: add support for rpl sr exthdr") Reported-by: Max VA Closes: https://www.interruptlabs.co.uk/articles/linux-ipv6-route-of-death Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230605180617.67284-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06netlink: specs: ethtool: fix random typosJakub Kicinski
Working on the code gen for C reveals typos in the ethtool spec as the compiler tries to find the names in the existing uAPI header. Fix the mistakes. Fixes: a353318ebf24 ("tools: ynl: populate most of the ethtool spec") Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230605233257.843977-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-07netfilter: nf_tables: out-of-bound check in chain blobPablo Neira Ayuso
Add current size of rule expressions to the boundary check. Fixes: 2c865a8a28a1 ("netfilter: nf_tables: add rule blob layout") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-07netfilter: ipset: Add schedule point in call_ad().Kuniyuki Iwashima
syzkaller found a repro that causes Hung Task [0] with ipset. The repro first creates an ipset and then tries to delete a large number of IPs from the ipset concurrently: IPSET_ATTR_IPADDR_IPV4 : 172.20.20.187 IPSET_ATTR_CIDR : 2 The first deleting thread hogs a CPU with nfnl_lock(NFNL_SUBSYS_IPSET) held, and other threads wait for it to be released. Previously, the same issue existed in set->variant->uadt() that could run so long under ip_set_lock(set). Commit 5e29dc36bd5e ("netfilter: ipset: Rework long task execution when adding/deleting entries") tried to fix it, but the issue still exists in the caller with another mutex. While adding/deleting many IPs, we should release the CPU periodically to prevent someone from abusing ipset to hang the system. Note we need to increment the ipset's refcnt to prevent the ipset from being destroyed while rescheduling. [0]: INFO: task syz-executor174:268 blocked for more than 143 seconds. Not tainted 6.4.0-rc1-00145-gba79e9a73284 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor174 state:D stack:0 pid:268 ppid:260 flags:0x0000000d Call trace: __switch_to+0x308/0x714 arch/arm64/kernel/process.c:556 context_switch kernel/sched/core.c:5343 [inline] __schedule+0xd84/0x1648 kernel/sched/core.c:6669 schedule+0xf0/0x214 kernel/sched/core.c:6745 schedule_preempt_disabled+0x58/0xf0 kernel/sched/core.c:6804 __mutex_lock_common kernel/locking/mutex.c:679 [inline] __mutex_lock+0x6fc/0xdb0 kernel/locking/mutex.c:747 __mutex_lock_slowpath+0x14/0x20 kernel/locking/mutex.c:1035 mutex_lock+0x98/0xf0 kernel/locking/mutex.c:286 nfnl_lock net/netfilter/nfnetlink.c:98 [inline] nfnetlink_rcv_msg+0x480/0x70c net/netfilter/nfnetlink.c:295 netlink_rcv_skb+0x1c0/0x350 net/netlink/af_netlink.c:2546 nfnetlink_rcv+0x18c/0x199c net/netfilter/nfnetlink.c:658 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x664/0x8cc net/netlink/af_netlink.c:1365 netlink_sendmsg+0x6d0/0xa4c net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x4b8/0x810 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1f8/0x2a4 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2593 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x84/0x270 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x134/0x24c arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Reported-by: syzkaller <syzkaller@googlegroups.com> Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-07netfilter: conntrack: fix NULL pointer dereference in nf_confirm_cthelperTijs Van Buggenhout
An nf_conntrack_helper from nf_conn_help may become NULL after DNAT. Observed when TCP port 1720 (Q931_PORT), associated with h323 conntrack helper, is DNAT'ed to another destination port (e.g. 1730), while nfqueue is being used for final acceptance (e.g. snort). This happenned after transition from kernel 4.14 to 5.10.161. Workarounds: * keep the same port (1720) in DNAT * disable nfqueue * disable/unload h323 NAT helper $ linux-5.10/scripts/decode_stacktrace.sh vmlinux < /tmp/kernel.log BUG: kernel NULL pointer dereference, address: 0000000000000084 [..] RIP: 0010:nf_conntrack_update (net/netfilter/nf_conntrack_core.c:2080 net/netfilter/nf_conntrack_core.c:2134) nf_conntrack [..] nfqnl_reinject (net/netfilter/nfnetlink_queue.c:237) nfnetlink_queue nfqnl_recv_verdict (net/netfilter/nfnetlink_queue.c:1230) nfnetlink_queue nfnetlink_rcv_msg (net/netfilter/nfnetlink.c:241) nfnetlink [..] Fixes: ee04805ff54a ("netfilter: conntrack: make conntrack userspace helpers work again") Signed-off-by: Tijs Van Buggenhout <tijs.van.buggenhout@axsguard.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-07netfilter: nft_bitwise: fix register trackingJeremy Sowden
At the end of `nft_bitwise_reduce`, there is a loop which is intended to update the bitwise expression associated with each tracked destination register. However, currently, it just updates the first register repeatedly. Fix it. Fixes: 34cc9e52884a ("netfilter: nf_tables: cancel tracking for clobbered destination registers") Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-07netfilter: nf_tables: Add null check for nla_nest_start_noflag() in ↵Gavrilov Ilia
nft_dump_basechain_hook() The nla_nest_start_noflag() function may fail and return NULL; the return value needs to be checked. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d54725cd11a5 ("netfilter: nf_tables: support for multiple devices per netdev hook") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-06Merge branch 'tools-ynl-user-space-c'Jakub Kicinski
Jakub Kicinski says: ==================== tools: ynl: user space C Use the code gen which is already in tree to generate a user space library for a handful of simple families. I find YNL C quite useful in some WIP projects, and I think others may find it useful, too. I was hoping someone will pick this work up and finish it... but it seems that Python YNL has largely stolen the thunder. Python may not be great for selftest, tho, and actually this lib is more fully-featured. The Python script was meant as a quick demo, funny how those things go. v2: https://lore.kernel.org/all/20230604175843.662084-1-kuba@kernel.org/ v1: https://lore.kernel.org/all/20230603052547.631384-1-kuba@kernel.org/ ==================== Link: https://lore.kernel.org/r/20230605190108.809439-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06tools: ynl: add sample for netdevJakub Kicinski
Add a sample application using the C library. My main goal is to make writing selftests easier but until I have some of those ready I think it's useful to show off the functionality and let people poke and tinker. Sample outputs - dump: $ ./netdev Select ifc ($ifindex; or 0 = dump; or -2 ntf check): 0 lo[1] 0: enp1s0[2] 23: basic redirect rx-sg Notifications (watching veth pair getting added and deleted): $ ./netdev Select ifc ($ifindex; or 0 = dump; or -2 ntf check): -2 [53] 0: (ntf: dev-add-ntf) [54] 0: (ntf: dev-add-ntf) [54] 23: basic redirect rx-sg (ntf: dev-change-ntf) [53] 23: basic redirect rx-sg (ntf: dev-change-ntf) [53] 23: basic redirect rx-sg (ntf: dev-del-ntf) [54] 23: basic redirect rx-sg (ntf: dev-del-ntf) Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06tools: ynl: support fou and netdev in CJakub Kicinski
Generate the code for netdev and fou families. They are simple and already supported by the code gen. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06tools: ynl: user space helpersJakub Kicinski
Add "fixed" part of the user space Netlink Spec-based library. This will get linked with the protocol implementations to form a full API. Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06tools: ynl-gen: clean up stray new lines at the end of reply-less requestsJakub Kicinski
Do not print empty lines before closing brackets. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06selftests/bpf: Fix sockopt_sk selftestYonghong Song
Commit f4e4534850a9 ("net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report") fixed NETLINK_LIST_MEMBERSHIPS length report which caused selftest sockopt_sk failure. The failure log looks like test_sockopt_sk:PASS:join_cgroup /sockopt_sk 0 nsec run_test:PASS:skel_load 0 nsec run_test:PASS:setsockopt_link 0 nsec run_test:PASS:getsockopt_link 0 nsec getsetsockopt:FAIL:Unexpected NETLINK_LIST_MEMBERSHIPS value unexpected Unexpected NETLINK_LIST_MEMBERSHIPS value: actual 8 != expected 4 run_test:PASS:getsetsockopt 0 nsec #201 sockopt_sk:FAIL In net/netlink/af_netlink.c, function netlink_getsockopt(), for NETLINK_LIST_MEMBERSHIPS, nlk->ngroups equals to 36. Before Commit f4e4534850a9, the optlen is calculated as ALIGN(nlk->ngroups / 8, sizeof(u32)) = 4 After that commit, the optlen is ALIGN(BITS_TO_BYTES(nlk->ngroups), sizeof(u32)) = 8 Fix the test by setting the expected optlen to be 8. Fixes: f4e4534850a9 ("net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230606172202.1606249-1-yhs@fb.com
2023-06-06Merge tag 'spi-fix-v6.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A small collection of driver specific fixes, none of them particularly remarkable or severe" * tag 'spi-fix-v6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: qup: Request DMA before enabling clocks spi: mt65xx: make sure operations completed before unloading spi: lpspi: disable lpspi module irq in DMA mode
2023-06-06wifi: cfg80211: fix locking in regulatory disconnectJohannes Berg
This should use wiphy_lock() now instead of requiring the RTNL, since __cfg80211_leave() via cfg80211_leave() is now requiring that lock to be held. Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-06wifi: cfg80211: fix locking in sched scan stop workJohannes Berg
This should use wiphy_lock() now instead of acquiring the RTNL, since cfg80211_stop_sched_scan_req() now needs that. Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-06Merge tag 'gfs2-v6.4-rc4-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: - Don't get stuck writing page onto itself under direct I/O * tag 'gfs2-v6.4-rc4-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Don't get stuck writing page onto itself under direct I/O
2023-06-06Merge tag 'platform-drivers-x86-v6.4-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - various Microsoft Surface support fixes - one fix for the INT3472 driver * tag 'platform-drivers-x86-v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: int3472: Avoid crash in unregistering regulator gpio platform/surface: aggregator_tabletsw: Add support for book mode in POS subsystem platform/surface: aggregator_tabletsw: Add support for book mode in KIP subsystem platform/surface: aggregator: Allow completion work-items to be executed in parallel platform/surface: aggregator: Make to_ssam_device_driver() respect constness
2023-06-06qed/qede: Fix scheduling while atomicManish Chopra
Statistics read through bond interface via sysfs causes below bug and traces as it triggers the bonding module to collect the slave device statistics while holding the spinlock, beneath that qede->qed driver statistics flow gets scheduled out due to usleep_range() used in PTT acquire logic [ 3673.988874] Hardware name: HPE ProLiant DL365 Gen10 Plus/ProLiant DL365 Gen10 Plus, BIOS A42 10/29/2021 [ 3673.988878] Call Trace: [ 3673.988891] dump_stack_lvl+0x34/0x44 [ 3673.988908] __schedule_bug.cold+0x47/0x53 [ 3673.988918] __schedule+0x3fb/0x560 [ 3673.988929] schedule+0x43/0xb0 [ 3673.988932] schedule_hrtimeout_range_clock+0xbf/0x1b0 [ 3673.988937] ? __hrtimer_init+0xc0/0xc0 [ 3673.988950] usleep_range+0x5e/0x80 [ 3673.988955] qed_ptt_acquire+0x2b/0xd0 [qed] [ 3673.988981] _qed_get_vport_stats+0x141/0x240 [qed] [ 3673.989001] qed_get_vport_stats+0x18/0x80 [qed] [ 3673.989016] qede_fill_by_demand_stats+0x37/0x400 [qede] [ 3673.989028] qede_get_stats64+0x19/0xe0 [qede] [ 3673.989034] dev_get_stats+0x5c/0xc0 [ 3673.989045] netstat_show.constprop.0+0x52/0xb0 [ 3673.989055] dev_attr_show+0x19/0x40 [ 3673.989065] sysfs_kf_seq_show+0x9b/0xf0 [ 3673.989076] seq_read_iter+0x120/0x4b0 [ 3673.989087] new_sync_read+0x118/0x1a0 [ 3673.989095] vfs_read+0xf3/0x180 [ 3673.989099] ksys_read+0x5f/0xe0 [ 3673.989102] do_syscall_64+0x3b/0x90 [ 3673.989109] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3673.989115] RIP: 0033:0x7f8467d0b082 [ 3673.989119] Code: c0 e9 b2 fe ff ff 50 48 8d 3d ca 05 08 00 e8 35 e7 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [ 3673.989121] RSP: 002b:00007ffffb21fd08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 3673.989127] RAX: ffffffffffffffda RBX: 000000000100eca0 RCX: 00007f8467d0b082 [ 3673.989128] RDX: 00000000000003ff RSI: 00007ffffb21fdc0 RDI: 0000000000000003 [ 3673.989130] RBP: 00007f8467b96028 R08: 0000000000000010 R09: 00007ffffb21ec00 [ 3673.989132] R10: 00007ffffb27b170 R11: 0000000000000246 R12: 00000000000000f0 [ 3673.989134] R13: 0000000000000003 R14: 00007f8467b92000 R15: 0000000000045a05 [ 3673.989139] CPU: 30 PID: 285188 Comm: read_all Kdump: loaded Tainted: G W OE Fix this by collecting the statistics asynchronously from a periodic delayed work scheduled at default stats coalescing interval and return the recent copy of statisitcs from .ndo_get_stats64(), also add ability to configure/retrieve stats coalescing interval using below commands - ethtool -C ethx stats-block-usecs <val> ethtool -c ethx Fixes: 133fac0eedc3 ("qede: Add basic ethtool support") Cc: Sudarsana Kalluru <skalluru@marvell.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Manish Chopra <manishc@marvell.com> Link: https://lore.kernel.org/r/20230605112600.48238-1-manishc@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-06Merge tag 'for-linus-2023060501' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fix from Jiri Kosina: - Final, confirmed fix for regression causing some devices connected via Logitech HID++ Unifying receiver take too long to initialize (Benjamin Tissoires) * tag 'for-linus-2023060501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: hidpp: terminate retry loop on success
2023-06-06net/pppoe: fix a typo for the PPPOE_HASH_BITS_1 definitionLukas Bulwahn
Instead of its intention to define PPPOE_HASH_BITS_1, commit 96ba44c637b0 ("net/pppoe: make number of hash bits configurable") actually defined config PPPOE_HASH_BITS_2 twice in the ppp's Kconfig file due to a quick typo with the numbers. Fix the typo and define PPPOE_HASH_BITS_1. Fixes: 96ba44c637b0 ("net/pppoe: make number of hash bits configurable") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Jaco Kroon <jaco@uls.co.za> Link: https://lore.kernel.org/r/20230605072743.11247-1-lukas.bulwahn@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-06mac_pton: Clean up the header inclusionsAndy Shevchenko
Since hex_to_bin() is provided by hex.h there is no need to require kernel.h. Replace the latter by the former and add missing export.h. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230604132858.6650-1-andriy.shevchenko@linux.intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-06gro: decrease size of CBRichard Gobert
The GRO control block (NAPI_GRO_CB) is currently at its maximum size. This commit reduces its size by putting two groups of fields that are used only at different times into a union. Specifically, the fields frag0 and frag0_len are the fields that make up the frag0 optimisation mechanism, which is used during the initial parsing of the SKB. The fields last and age are used after the initial parsing, while the SKB is stored in the GRO list, waiting for other packets to arrive. There was one location in dev_gro_receive that modified the frag0 fields after setting last and age. I changed this accordingly without altering the code behaviour. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230601161407.GA9253@debian Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-06wifi: iwlwifi: mvm: Fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()Gustavo A. R. Silva
kmemdup() at line 2735 is not duplicating enough memory for notif->tid_tear_down and notif->station_id. As it only duplicates 612 bytes: up to offsetofend(struct iwl_wowlan_info_notif, received_beacons), this is the range of [0, 612) bytes. 2735 notif = kmemdup(notif_v1, 2736 offsetofend(struct iwl_wowlan_info_notif, 2737 received_beacons), 2738 GFP_ATOMIC); which evidently does not cover bytes 612 and 613 for members tid_tear_down and station_id in struct iwl_wowlan_info_notif. See below: $ pahole -C iwl_wowlan_info_notif drivers/net/wireless/intel/iwlwifi/mvm/d3.o struct iwl_wowlan_info_notif { struct iwl_wowlan_gtk_status_v3 gtk[2]; /* 0 488 */ /* --- cacheline 7 boundary (448 bytes) was 40 bytes ago --- */ struct iwl_wowlan_igtk_status igtk[2]; /* 488 80 */ /* --- cacheline 8 boundary (512 bytes) was 56 bytes ago --- */ __le64 replay_ctr; /* 568 8 */ /* --- cacheline 9 boundary (576 bytes) --- */ __le16 pattern_number; /* 576 2 */ __le16 reserved1; /* 578 2 */ __le16 qos_seq_ctr[8]; /* 580 16 */ __le32 wakeup_reasons; /* 596 4 */ __le32 num_of_gtk_rekeys; /* 600 4 */ __le32 transmitted_ndps; /* 604 4 */ __le32 received_beacons; /* 608 4 */ u8 tid_tear_down; /* 612 1 */ u8 station_id; /* 613 1 */ u8 reserved2[2]; /* 614 2 */ /* size: 616, cachelines: 10, members: 13 */ /* last cacheline: 40 bytes */ }; Therefore, when the following assignments take place, actually no memory has been allocated for those objects: 2743 notif->tid_tear_down = notif_v1->tid_tear_down; 2744 notif->station_id = notif_v1->station_id; Fix this by allocating space for the whole notif object and zero out the remaining space in memory after member station_id. This also fixes the following -Warray-bounds issues: CC drivers/net/wireless/intel/iwlwifi/mvm/d3.o drivers/net/wireless/intel/iwlwifi/mvm/d3.c: In function ‘iwl_mvm_wait_d3_notif’: drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2743:30: warning: array subscript ‘struct iwl_wowlan_info_notif[0]’ is partly outside array bounds of ‘unsigned char[612]’ [-Warray-bounds=] 2743 | notif->tid_tear_down = notif_v1->tid_tear_down; | from drivers/net/wireless/intel/iwlwifi/mvm/d3.c:7: In function ‘kmemdup’, inlined from ‘iwl_mvm_wait_d3_notif’ at drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2735:12: include/linux/fortify-string.h:765:16: note: object of size 612 allocated by ‘__real_kmemdup’ 765 | return __real_kmemdup(p, size, gfp); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/wireless/intel/iwlwifi/mvm/d3.c: In function ‘iwl_mvm_wait_d3_notif’: drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2744:30: warning: array subscript ‘struct iwl_wowlan_info_notif[0]’ is partly outside array bounds of ‘unsigned char[612]’ [-Warray-bounds=] 2744 | notif->station_id = notif_v1->station_id; | ^~ In function ‘kmemdup’, inlined from ‘iwl_mvm_wait_d3_notif’ at drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2735:12: include/linux/fortify-string.h:765:16: note: object of size 612 allocated by ‘__real_kmemdup’ 765 | return __real_kmemdup(p, size, gfp); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Link: https://github.com/KSPP/linux/issues/306 Fixes: 905d50ddbc83 ("wifi: iwlwifi: mvm: support wowlan info notification version 2") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/ZHpGN555FwAKGduH@work Signed-off-by: Johannes Berg <johannes.berg@intel.com>