path: root/drivers/net/ethernet
Age  Commit message  Author
2025-04-14  net/mlx5: HWS, Cleanup matcher action STE table  Vlad Dogaru
Remove the matcher action STE implementation now that the code uses per-queue action STE pools. This also allows simplifying matcher code because it is now only handling a single type of RTC/STE. The matcher resize data is also going away. Matchers were saving old action STE data because the rules still used it, but now that data lives in the action STE pool and is no longer coupled to a matcher. Furthermore, matchers no longer need to rehash due to action template addition. If a new action template needs more action STEs, we simply update the matcher's num_of_action_stes and future rules will allocate the correct number. Existing rules are unaffected by such an operation and can continue to use their existing action STEs. The range action was using the matcher action STE implementation, but there was no reason to do this other than the container fitting the purpose. Extract that information to a separate structure. Finally, stop dumping per-matcher information about action RTCs, because they no longer exist. A later patch in this series will add support for dumping action STE pools. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-11-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Use the new action STE pool  Vlad Dogaru
Use the central action STE pool when creating / updating rules. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-10-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Implement action STE pool  Vlad Dogaru
Implement a per-queue pool of action STEs that match STEs can link to, regardless of matcher. The code relies on hints to optimize whether a given rule is added to rx-only, tx-only or both. Correspondingly, action STEs need to be added to different RTC for ingress or egress paths. For rx-and-tx rules, the current rule implementation dictates that the offsets for a given rule must be the same in both RTCs. To avoid wasting STEs, each action STE pool element holds 3 pools: rx-only, tx-only, and rx-and-tx, corresponding to the possible values of the pool optimization enum. The implementation then chooses at rule creation / update which of these elements to allocate from. Each element holds multiple action STE tables, which wrap an RTC, an STE range, the logic to buddy-allocate offsets from the range, and an STC that allows match STEs to point to this table. When allocating offsets from an element, we iterate through available action STE tables and, if needed, create a new table. Similar to the previous implementation, this iteration does not free any resources. This is implemented in a subsequent patch. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-9-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
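The three-pool element described in this commit message can be pictured with a minimal C sketch. All names here (`ste_opt`, `ste_pool`, `ste_pool_element`, `element_alloc`) are hypothetical, and a trivial bump allocator stands in for the real action STE tables and their buddy allocation; this is an illustration of the idea, not the driver's code.

```c
#include <assert.h>

/* Hypothetical stand-in for the pool optimization enum. */
enum ste_opt {
	STE_OPT_RX_AND_TX,
	STE_OPT_RX_ONLY,
	STE_OPT_TX_ONLY,
	STE_OPT_MAX,
};

struct ste_pool {
	unsigned int next_offset; /* bump allocator, for illustration only */
	unsigned int rx_used;     /* STEs consumed in the ingress RTC */
	unsigned int tx_used;     /* STEs consumed in the egress RTC */
};

/* One element holds one pool per optimization value. */
struct ste_pool_element {
	struct ste_pool pools[STE_OPT_MAX];
};

/* Allocate num_stes from the pool matching the rule's direction hint. */
static unsigned int element_alloc(struct ste_pool_element *e,
				  enum ste_opt opt, unsigned int num_stes)
{
	struct ste_pool *p = &e->pools[opt];
	unsigned int offset = p->next_offset;

	p->next_offset += num_stes;
	/* An rx-and-tx rule uses the same offset in both directions. */
	if (opt != STE_OPT_TX_ONLY)
		p->rx_used += num_stes;
	if (opt != STE_OPT_RX_ONLY)
		p->tx_used += num_stes;
	return offset;
}
```

In this sketch an rx-only allocation never consumes egress STEs, which is the waste the separate pools are meant to avoid.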
2025-04-14  net/mlx5: HWS, Fix pool size optimization  Vlad Dogaru
The optimization to create a size-one STE range for the unused direction was broken. The hardware prevents us from creating RTCs over unallocated STE space, so the only reason this has worked so far is because the optimization was never used. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-8-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Add fullness tracking to pool  Vlad Dogaru
Future users will need to query whether a pool is empty. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-7-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Cleanup after pool refactoring  Vlad Dogaru
Remove members which are now no longer used. In fact, many of the `struct mlx5hws_pool_chunk` were not even written to beyond being initialized, but they were used in various internals. Also cleanup some local variables which made more sense when the API was thicker. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-6-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Refactor pool implementation  Vlad Dogaru
Refactor the pool implementation to remove unused flags and clarify its usage. A pool represents a single range of STEs or STCs which are allocated at pool creation time. Pools are used under three patterns: 1. STCs are allocated one at a time from a global pool using a bitmap based implementation. 2. Action STEs are allocated in power-of-two blocks using a buddy algorithm. 3. Match STEs do not use allocation, since insertion into these tables is based on hashes or direct addressing. In such cases we use a pool only to create the STE range. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
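Pattern 2 above (power-of-two blocks handed out by a buddy algorithm) can be sketched in a few lines of C. This is a generic textbook buddy allocator over a 16-entry range, written under the assumption that offsets are plain integers; the names `buddy`, `buddy_alloc` and `buddy_free` and the `MAX_ORDER` value are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <string.h>

#define MAX_ORDER 4                  /* hypothetical range of 1 << 4 = 16 STEs */
#define POOL_SIZE (1u << MAX_ORDER)

struct buddy {
	/* free_blk[o][i] != 0 means block i of size (1 << o) is free. */
	unsigned char free_blk[MAX_ORDER + 1][POOL_SIZE];
};

static void buddy_init(struct buddy *b)
{
	memset(b, 0, sizeof(*b));
	b->free_blk[MAX_ORDER][0] = 1; /* one free block covers the whole range */
}

/* Return the start offset of a free block of size (1 << order), or -1. */
static int buddy_alloc(struct buddy *b, unsigned int order)
{
	unsigned int o, i;

	if (order > MAX_ORDER)
		return -1;

	/* Find the smallest free block that is large enough. */
	for (o = order; o <= MAX_ORDER; o++)
		for (i = 0; i < (POOL_SIZE >> o); i++)
			if (b->free_blk[o][i])
				goto found;
	return -1;

found:
	b->free_blk[o][i] = 0;
	/* Split down to the requested order; the upper half becomes free. */
	while (o > order) {
		o--;
		i <<= 1;
		b->free_blk[o][i + 1] = 1;
	}
	return (int)(i << order);
}

static void buddy_free(struct buddy *b, unsigned int offset, unsigned int order)
{
	unsigned int i = offset >> order;

	/* Merge with the buddy block for as long as it is also free. */
	while (order < MAX_ORDER && b->free_blk[order][i ^ 1]) {
		b->free_blk[order][i ^ 1] = 0;
		i >>= 1;
		order++;
	}
	b->free_blk[order][i] = 1;
}
```

A caller would request, for example, `buddy_alloc(&b, 1)` for a two-entry block and later return it with `buddy_free(&b, offset, 1)`; freed neighbours coalesce back into larger blocks.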
2025-04-14  net/mlx5: HWS, Make pool single resource  Vlad Dogaru
The pool implementation claimed to support multiple resources, but this does not really make sense in context. Callers always allocate a single STC or STE chunk of exactly the size provided. The code that handled multiple resources was unused (and likely buggy) due to the combination of flags passed by callers. Simplify the pool by having it handle a single resource. As a result of this simplification, chunks no longer contain a resource offset (there is now only one resource per pool), and the get_base_id functions no longer take a chunk parameter. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Remove unused element array  Vlad Dogaru
Remove the array of elements wrapped in a struct because in reality only the first element was ever used. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net/mlx5: HWS, Fix matcher action template attach  Vlad Dogaru
The procedure of attaching an action template to an existing matcher had a few issues: 1. Attaching accidentally overran the `at` array in bwc_matcher, which would result in memory corruption. This bug wasn't triggered, but it is possible to trigger it by attaching action templates beyond the initial buffer size of 8. Fix this by converting to a dynamically sized buffer and reallocating if needed. 2. Similarly, the `at` array inside the native matcher was never reallocated. Fix this the same as above. 3. The bwc layer treated any error in action template attach as a signal that the matcher should be rehashed to account for a larger number of action STEs. In reality, there are other unrelated errors that can arise and they should be propagated upstack. Fix this by adding a `need_rehash` output parameter that's orthogonal to error codes. Fixes: 2111bb970c78 ("net/mlx5: HWS, added backward-compatible API handling") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/1744312662-356571-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
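The two ideas in this fix, growing the `at` array on demand and reporting the need for a rehash separately from the error code, might look roughly like the following C sketch. The struct and function names are hypothetical, `realloc()` stands in for the kernel allocator, and a template is reduced to the number of action STEs it needs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define AT_INITIAL_SIZE 8 /* initial buffer size mentioned in the message */

struct fake_matcher {
	unsigned int *at;            /* action STEs needed by each template */
	size_t num_of_at;
	size_t size_of_at_array;
	unsigned int num_of_action_stes;
};

/*
 * Attach a template. The return value carries real errors only;
 * *need_rehash is an independent output.
 */
static int fake_matcher_attach_at(struct fake_matcher *m,
				  unsigned int at_action_stes,
				  bool *need_rehash)
{
	*need_rehash = false;

	/* Grow the array instead of writing past its end. */
	if (m->num_of_at == m->size_of_at_array) {
		size_t new_size = m->size_of_at_array ?
				  2 * m->size_of_at_array : AT_INITIAL_SIZE;
		unsigned int *new_at = realloc(m->at, new_size * sizeof(*new_at));

		if (!new_at)
			return -ENOMEM;
		m->at = new_at;
		m->size_of_at_array = new_size;
	}
	m->at[m->num_of_at++] = at_action_stes;

	/* More action STEs than the matcher was sized for: ask for a rehash. */
	if (at_action_stes > m->num_of_action_stes)
		*need_rehash = true;
	return 0;
}
```

With this shape, a caller can propagate `-ENOMEM` upward unchanged and only trigger a rehash when `need_rehash` is set.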
2025-04-14  net: stmmac: remove GMAC_1US_TIC_COUNTER definition  Russell King (Oracle)
GMAC_1US_TIC_COUNTER is now no longer used, so remove the definition. This was duplicated by GMAC4_MAC_ONEUS_TIC_COUNTER further down in the same file. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u3Vv0-000E87-DQ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net: stmmac: intel-plat: remove eee_usecs_rate and hardware write  Russell King (Oracle)
Remove the write to GMAC_1US_TIC_COUNTER for two reasons: 1. during initialisation or reinitialisation of the DWMAC core, the core is reset, which sets this register back to its default value. Writing it prior to stmmac_dvr_probe() has no effect. 2. Since commit 8efbdbfa9938 ("net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register"), GMAC4/5 core code will set this register based on the rate of plat->stmmac_clk. This clock is fetched by devm_stmmac_probe_config_dt(), and plat->clk_ptp_rate will be set to its rate provided a "ptp_ref" clock is not provided. In any case, Marek's commit will set the effectual value of this register. Therefore, dwmac-intel-plat.c writing GMAC_1US_TIC_COUNTER serves no useful purpose and can be removed. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u3Vuq-000E7s-5Y@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net: stmmac: intel: remove eee_usecs_rate and hardware write  Russell King (Oracle)
Remove the write to GMAC_1US_TIC_COUNTER for two reasons: 1. during initialisation or reinitialisation of the DWMAC core, the core is reset, which sets this register back to its default value. Writing it prior to stmmac_dvr_probe() has no effect. 2. Since commit 8efbdbfa9938 ("net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register"), GMAC4/5 core code will set this register based on the rate of plat->stmmac_clk. This clock is created by the same code which initialises plat->eee_usecs_rate, which is also created to run at this same rate. Since Marek's commit, this will set this register appropriately using the rate of this clock. Therefore, dwmac-intel.c writing GMAC_1US_TIC_COUNTER serves no useful purpose and can be removed. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u3Vul-000E7m-1j@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  net: stmmac: dwc-qos: remove tegra_eqos_init()  Russell King (Oracle)
tegra_eqos_init() initialises the 1US TIC counter for the EEE timers. However, the DWGMAC core is reset after this write, which clears this register to its default. However, dwmac4_core_init() configures this register using the same clock, which happens after reset - thus this is the write which ensures that the register is correctly configured. Therefore, tegra_eqos_init() is not required and is removed. This also means eqos->clk_slave can also be removed. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u3Vuf-000E7g-U4@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  page_pool: Move pp_magic check into helper functions  Toke Høiland-Jørgensen
Since we are about to stash some more information into the pp_magic field, let's move the magic signature checks into a pair of helper functions so it can be changed in one place. Reviewed-by: Mina Almasry <almasrymina@google.com> Tested-by: Yonglong Liu <liuyonglong@huawei.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409-page-pool-track-dma-v9-1-6a9ef2e0cba8@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue  Jakub Kicinski
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-04-11 (ice, i40e, ixgbe, igc, e1000e)
For ice: Mateusz and Larysa add support for LLDP packets to be received on a VF and transmitted by a VF in switchdev mode. Additional information: https://lore.kernel.org/intel-wired-lan/20250214085215.2846063-1-larysa.zaremba@intel.com/
Karol adds timesync support for E825C devices using 2xNAC (Network Acceleration Complex) configuration. 2xNAC mode is the mode in which IO die is housing two complexes and each of them has its own PHY connected to it.
Martyna adds messaging to clarify filter errors when recipe space is exhausted.
Colin Ian King adds static modifier to a const array to avoid stack usage.
For i40e: Kyungwook Boo changes variable declaration types to prevent possible underflow.
For ixgbe: Rand Deeb adjusts retry values so that retries are attempted.
For igc: Rui Salvaterra sets VLAN offloads to be enabled as default.
For e1000e: Piotr Wejman converts driver to use newer hardware timestamping API.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  net: e1000e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()
  igc: enable HW vlan tag insertion/stripping by default
  ixgbe: Fix unreachable retry logic in combined and byte I2C write functions
  i40e: fix MMIO write access to an invalid page in i40e_clear_hw
  ice: make const read-only array dflt_rules static
  ice: improve error message for insufficient filter space
  ice: enable timesync operation on 2xNAC E825 devices
  ice: refactor ice_sbq_msg_dev enum
  ice: remove SW side band access workaround for E825
  ice: enable LLDP TX for VFs through tc
  ice: support egress drop rules on PF
  ice: remove headers argument from ice_tc_count_lkups
  ice: receive LLDP on trusted VFs
  ice: do not add LLDP-specific filter if not necessary
  ice: fix check for existing switch rule
====================
Link: https://patch.msgid.link/20250411204401.3271306-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue  Jakub Kicinski
Tony Nguyen says:
====================
igc: Fix PTM timeout
Christopher S M Hall says:
There have been sporadic reports of PTM timeouts using i225/i226 devices. These timeouts have been root caused to:
1) Manipulating the PTM status register while PTM is enabled and triggered
2) The hardware retrying too quickly when an inappropriate response is received from the upstream device
The issue can be reproduced with the following:
$ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to quickly reproduce the issue.
PHC2SYS exits with: "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction fails.
The first patch in this series also resolves an issue reported by Corinna Vinschen relating to kdump:
This patch also fixes a hang in igc_probe() when loading the igc driver in the kdump kernel on systems supporting PTM. The igc driver running in the base kernel enables PTM trigger in igc_probe(). Therefore the driver is always in PTM trigger mode, except in brief periods when manually triggering a PTM cycle. When a crash occurs, the NIC is reset while PTM trigger is enabled. Due to a hardware problem, the NIC is subsequently in a bad busmaster state and doesn't handle register reads/writes. When running igc_probe() in the kdump kernel, the first register access to a NIC register hangs driver probing and ultimately breaks kdump. With this patch, igc has PTM trigger disabled most of the time, and the trigger is only enabled for very brief (10 - 100 us) periods when manually triggering a PTM cycle. Chances that a crash occurs during a PTM trigger are not zero, but extremely reduced.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: add lock preventing multiple simultaneous PTM transactions
  igc: cleanup PTP module if probe fails
  igc: handle the IGC_PTP_ENABLED flag correctly
  igc: move ktime snapshot into PTM retry loop
  igc: increase wait time before retrying PTM
  igc: fix PTM cycle trigger logic
====================
Link: https://patch.msgid.link/20250411162857.2754883-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14  bna: bnad_dim_timeout: Rename del_timer_sync in comment  WangYuli
Commit 8fa7292fee5c ("treewide: Switch/rename to timer_delete[_sync]()") switched del_timer_sync to timer_delete_sync, but did not modify the comment for bnad_dim_timeout(). Now fix it. Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/61DDCE7AB5B6CE82+20250411101736.160981-1-wangyuli@uniontech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: hibmcge: fix multiple phy_stop() issue  Jijie Shao
After detecting the np_link_fail exception, the driver attempts to fix the exception by using phy_stop() and phy_start() in the scheduled task. However, hbg_fix_np_link_fail() and .ndo_stop() may be concurrently executed. As a result, phy_stop() is executed twice, and the following call trace occurs:
hibmcge 0000:84:00.2 enp132s0f2: Link is Down
hibmcge 0000:84:00.2: failed to link between MAC and PHY, try to fix...
------------[ cut here ]------------
called from state HALTED
WARNING: CPU: 71 PID: 23391 at drivers/net/phy/phy.c:1503 phy_stop...
...
pc : phy_stop+0x138/0x180
lr : phy_stop+0x138/0x180
sp : ffff8000c76bbd40
x29: ffff8000c76bbd40 x28: 0000000000000000 x27: 0000000000000000
x26: ffff2020047358c0 x25: ffff202004735940 x24: ffff20200000e405
x23: ffff2020060e5178 x22: ffff2020060e4000 x21: ffff2020060e49c0
x20: ffff2020060e5170 x19: ffff20202538e000 x18: 0000000000000020
x17: 0000000000000000 x16: ffffcede02e28f40 x15: ffffffffffffffff
x14: 0000000000000000 x13: 205d313933333254 x12: 5b5d393430303233
x11: ffffcede04555958 x10: ffffcede04495918 x9 : ffffcede0274fee0
x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
x5 : 00000000002bffa8 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff20202e429480
Call trace:
 phy_stop+0x138/0x180
 hbg_fix_np_link_fail+0x4c/0x90 [hibmcge]
 hbg_service_task+0xfc/0x148 [hibmcge]
 process_one_work+0x180/0x398
 worker_thread+0x210/0x328
 kthread+0xe0/0xf0
 ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
This patch adds the rtnl_lock to hbg_fix_np_link_fail() to ensure that other operations are not performed concurrently. In addition, the np_link_fail exception can be fixed only when the PHY is linked.
Fixes: e0306637e85d ("net: hibmcge: Add support for mac link exception handling feature") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-8-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: hibmcge: fix not restore rx pause mac addr after reset issue  Jijie Shao
The MAC hardware supports receiving two types of pause frames from the link partner. One is a pause frame with a destination address of 01:80:C2:00:00:01. The other is a pause frame whose destination address is the address of the hibmcge driver. 01:80:C2:00:00:01 is supported by default. In .ndo_set_mac_address(), the hibmcge driver calls .hbg_hw_set_rx_pause_mac_addr() to set its mac address as the destination address of the rx pause frame. Therefore, pause frames with two types of MAC addresses can be received. Currently, the rx pause addr is not restored after reset. As a result, pause frames whose destination address is the hibmcge driver address cannot be correctly received. This patch restores the configuration by calling .hbg_hw_set_rx_pause_mac_addr() after reset is complete. Fixes: 3f5a61f6d504 ("net: hibmcge: Add reset supported in this module") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: hibmcge: fix the incorrect np_link fail state issue.  Jijie Shao
In the debugfs file, the driver displays the np_link fail state based on the HBG_NIC_STATE_NP_LINK_FAIL. However, HBG_NIC_STATE_NP_LINK_FAIL is cleared in hbg_service_task(), so this value of np_link fail is always false. This patch directly reads the related register to display the real state. Fixes: e0306637e85d ("net: hibmcge: Add support for mac link exception handling feature") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: hibmcge: fix wrong mtu log issue  Jijie Shao
A dbg log is generated when the driver modifies the MTU, which is expected to trace the change of the MTU. However, the log is recorded after WRITE_ONCE(). At this time, netdev->mtu has been changed to the new value. As a result, netdev->mtu is the same as new_mtu. This patch modifies the log location and records logs before WRITE_ONCE(). Fixes: ff4edac6e9bd ("net: hibmcge: Implement some .ndo functions") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
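The ordering problem is easy to reproduce outside the kernel. In the C sketch below, `fake_netdev` and `change_mtu` are hypothetical names, a plain store stands in for WRITE_ONCE(), and the message is written into a caller-supplied buffer instead of the kernel log; the log text is invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct fake_netdev {
	unsigned int mtu;
};

static void change_mtu(struct fake_netdev *dev, unsigned int new_mtu,
		       char *log, size_t len)
{
	/* Log first, while dev->mtu still holds the old value. */
	snprintf(log, len, "change mtu from %u to %u", dev->mtu, new_mtu);

	/* The driver uses WRITE_ONCE() here; a plain store stands in for it. */
	dev->mtu = new_mtu;
}
```

Swapping the two statements would make both numbers in the message equal to `new_mtu`, which is the behaviour the commit describes as wrong.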
2025-04-11  net: hibmcge: fix the share of irq statistics among different network ports issue  Jijie Shao
hbg_irqs is a global array which contains irq statistics. However, the irq statistics of different network ports point to the same global array. As a result, the statistics are incorrect. This patch allocates a statistics array for each network port to prevent the statistics of different network ports from affecting each other. irq statistics are removed from hbg_irq_info. Therefore, all data in hbg_irq_info remains unchanged. Therefore, the input parameter of some functions is changed to const. Fixes: 4d089035fa19 ("net: hibmcge: Add interrupt supported in this module") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
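A minimal C sketch of the split described here: the descriptive table stays global and `const`, while the counters move into a per-port allocation. The names (`irq_info`, `port`, `port_init`, `port_handle_irq`) are hypothetical and `calloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_IRQS 3

/* Read-only description, safely shared by every port. */
struct irq_info {
	const char *name;
};

static const struct irq_info irq_info_table[NUM_IRQS] = {
	{ "tx" }, { "rx" }, { "err" },
};

struct port {
	const struct irq_info *info;    /* shared, never written */
	unsigned long long *irq_count;  /* private to this port */
};

static int port_init(struct port *p)
{
	p->info = irq_info_table;
	p->irq_count = calloc(NUM_IRQS, sizeof(*p->irq_count));
	return p->irq_count ? 0 : -1;
}

static void port_handle_irq(struct port *p, unsigned int irq)
{
	if (irq < NUM_IRQS)
		p->irq_count[irq]++;
}

static void port_uninit(struct port *p)
{
	free(p->irq_count);
	p->irq_count = NULL;
}
```

Because each port owns its counter array, interrupts handled on one port no longer show up in another port's statistics.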
2025-04-11  net: hibmcge: fix incorrect multicast filtering issue  Jijie Shao
The driver does not support multicast filtering, the mask must be set to 0xFFFFFFFF. Otherwise, incorrect filtering occurs. This patch fixes this problem. Fixes: 37b367d60d0f ("net: hibmcge: Add unicast frame filter supported in this module") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: hibmcge: fix incorrect pause frame statistics issue  Jijie Shao
The driver supports pause frames, but does not pass pause frames based on the rx pause enable configuration, resulting in incorrect pause frame statistics. For example:
mz eno3 '01 80 c2 00 00 01 00 18 2d 04 00 9c 88 08 00 01 ff ff' \
   -p 64 -c 100
ethtool -S enp132s0f2 | grep -v ": 0"
NIC statistics:
     rx_octets_total_filt_cnt: 6800
     rx_filt_pkt_cnt: 100
The rx pause frames are filtered by the MAC hardware. This patch configures the hardware to pass pause frames according to the rx pause enable status, so that rx pause frames are not filtered.
mz eno3 '01 80 c2 00 00 01 00 18 2d 04 00 9c 88 08 00 01 ff ff' \
   -p 64 -c 100
ethtool --include-statistics -a enp132s0f2
Pause parameters for enp132s0f2:
Autonegotiate: on
RX: on
TX: on
RX negotiated: on
TX negotiated: on
Statistics:
  tx_pause_frames: 0
  rx_pause_frames: 100
Fixes: 3a03763f3876 ("net: hibmcge: Add pauseparam supported in this module") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250410021327.590362-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: airoha: Add L2 hw acceleration support  Lorenzo Bianconi
Similar to mtk driver, introduce the capability to offload L2 traffic defining flower rules in the PSE/PPE engine available on EN7581 SoC. Since the hw always reports L2/L3/L4 flower rules, link all L2 rules sharing the same L2 info (with different L3/L4 info) in the L2 subflows list of a given L2 PPE entry. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/20250409-airoha-flowtable-l2b-v2-2-4a1e3935ea92@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: airoha: Add l2_flows rhashtable  Lorenzo Bianconi
Introduce l2_flows rhashtable in airoha_ppe struct in order to store L2 flows committed by upper layers of the kernel. This is a preliminary patch in order to offload L2 traffic rules. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/20250409-airoha-flowtable-l2b-v2-1-4a1e3935ea92@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  pds_core: fix memory leak in pdsc_debugfs_add_qcq()  Abdun Nihaal
The memory allocated for intr_ctrl_regset, which is passed to debugfs_create_regset32() may not be cleaned up when the driver is removed. Fix that by using device managed allocation for it. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://patch.msgid.link/20250409054450.48606-1-abdun.nihaal@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11  net: e1000e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()  Piotr Wejman
Update the driver to use the new hardware timestamping API added in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping"). Use Netlink extack for error reporting in e1000e_config_hwtstamp. Align the indentation of net_device_ops. Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Signed-off-by: Piotr Wejman <wejmanpm@gmail.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11  igc: enable HW vlan tag insertion/stripping by default  Rui Salvaterra
This is enabled by default in other Intel drivers I've checked (e1000, e1000e, iavf, igb and ice). Fixes an out-of-the-box performance issue when running OpenWrt on typical mini-PCs with igc-supported Ethernet controllers and 802.1Q VLAN configurations, as ethtool isn't part of the default packages and sane defaults are expected. In my specific case, with an Intel N100-based machine with four I226-V Ethernet controllers, my upload performance increased from under 30 Mb/s to the expected ~1 Gb/s. Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11  ixgbe: Fix unreachable retry logic in combined and byte I2C write functions  Rand Deeb
The current implementation of `ixgbe_write_i2c_combined_generic_int` and `ixgbe_write_i2c_byte_generic_int` sets `max_retry` to `1`, which makes the condition `retry < max_retry` always evaluate to `false`. This renders the retry mechanism ineffective, as the debug message and retry logic are never executed. This patch increases `max_retry` to `3` in both functions, aligning them with the retry logic in `ixgbe_read_i2c_combined_generic_int`. This ensures that the retry mechanism functions as intended, improving robustness in case of I2C write failures. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Rand Deeb <rand.sec96@gmail.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
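Why `max_retry = 1` disables retrying can be seen in a small stand-alone C sketch. The loop shape below is a simplification assumed for illustration, and `fake_bus`, `fake_i2c_write` and `write_with_retry` are hypothetical names for a simulated bus rather than the ixgbe functions themselves.

```c
#include <assert.h>

/* Simulated bus: the first failures_left writes fail, later ones succeed. */
struct fake_bus {
	int failures_left;
	int attempts;
};

static int fake_i2c_write(struct fake_bus *bus)
{
	bus->attempts++;
	if (bus->failures_left > 0) {
		bus->failures_left--;
		return -1;
	}
	return 0;
}

static int write_with_retry(struct fake_bus *bus, int max_retry)
{
	int retry = 0;

	do {
		if (fake_i2c_write(bus) == 0)
			return 0;
		retry++;
		/*
		 * With max_retry == 1, retry is already 1 after the first
		 * failure, so "retry < max_retry" is false and the loop
		 * exits without a second attempt.
		 */
	} while (retry < max_retry);

	return -1;
}
```

With one transient failure, `write_with_retry(&bus, 1)` gives up after a single attempt, while `write_with_retry(&bus, 3)` succeeds on the second.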
2025-04-11  i40e: fix MMIO write access to an invalid page in i40e_clear_hw  Kyungwook Boo
When the device sends a specific input, an integer underflow can occur, leading to MMIO write access to an invalid page. Prevent the integer underflow by changing the type of related variables. Signed-off-by: Kyungwook Boo <bookyungwook@gmail.com> Link: https://lore.kernel.org/lkml/ffc91764-1142-4ba2-91b6-8c773f6f7095@gmail.com/T/ Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
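The class of bug fixed here can be shown with a generic C sketch. It assumes a loop bound of the form `count - 2` computed from a device-supplied count; that expression and the function names are assumptions for illustration and are not taken from i40e_clear_hw().

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned arithmetic: a count below 2 wraps around to a huge bound. */
static uint32_t loop_bound_unsigned(uint32_t num_int)
{
	return num_int - 2;
}

/* Signed arithmetic: the bound goes negative and the loop body never runs. */
static int32_t loop_count_signed(int32_t num_int)
{
	int32_t i, count = 0;

	for (i = 0; i < num_int - 2; i++)
		count++;
	return count;
}
```

If the wrapped unsigned bound is then used to index registers, the writes run far past the intended range; with signed variables the same input simply results in zero iterations.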
2025-04-11  ice: make const read-only array dflt_rules static  Colin Ian King
Don't populate the const read-only array dflt_rules on the stack at run time, instead make it static. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11  ice: improve error message for insufficient filter space  Martyna Szapar-Mudlaw
When adding a rule to switch through tc, if the operation fails due to not enough free recipes (-ENOSPC), provide a clearer error message: "Unable to add filter: insufficient space available." This improves user feedback by distinguishing space limitations from other generic failures. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11  ice: enable timesync operation on 2xNAC E825 devices  Karol Kolacinski
According to the E825C specification, SBQ address for ports on a single complex is device 2 for PHY 0 and device 13 for PHY1. For accessing ports on a dual complex E825C (so called 2xNAC mode), the driver should use destination device 2 (referred as phy_0) for the current complex PHY and device 13 (referred as phy_0_peer) for peer complex PHY. Differentiate SBQ destination device by checking if current PF port number is on the same PHY as target port number. Adjust 'ice_get_lane_number' function to provide unique port number for ports from PHY1 in 'dual' mode config (by adding fixed offset for PHY1 ports). Cache this value in ice_hw struct. Introduce ice_get_primary_hw wrapper to get access to timesync register not available from second NAC. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: refactor ice_sbq_msg_dev enum | Karol Kolacinski
Rename ice_sbq_msg_dev to ice_sbq_dev_id to reflect the meaning of this type more precisely. This enum type describes RDA (Remote Device Access) client ids, accessible over the SB (Side Band) interface. Rename enum elements to make the driver namespace cleaner and consistent with other definitions within SB. Remove unused 'rmn_x' entries, specific to the unsupported E824 device. Adjust the names of clients '2' and '13' (phy_0 and phy_0_peer respectively) to be compliant with the EAS doc. According to the specification, regardless of the complex entity (single or dual), a complex always accesses its own ports as the 'phy_0' client, and refers to ports connected to the other complex as 'phy_0_peer'. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: remove SW side band access workaround for E825 | Karol Kolacinski
Due to a bug in the FW/NVM autoload mechanism (wrong default SB_REM_DEV_CTL register settings), access to the peer PHY and CGU clients was disabled by default. As a workaround, the driver overwrote the register value during probe or reset handling. Remove the workaround as it is not needed anymore. The fix in the autoload procedure has been provided with NVM version 3.80. NOTE: at the time the fix was provided in NVM, the E825C product was not officially available on the market, so it's not expected this change will cause regression when running with older driver/kernel versions. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: enable LLDP TX for VFs through tc | Larysa Zaremba
Only a single VSI can be in charge of sending LLDP frames. Sometimes it is beneficial to assign this function to a VF, which is possible with tc capabilities in switchdev mode. It requires first blocking the PF from sending LLDP frames with the following command:

    tc filter add dev <ifname> egress protocol lldp flower skip_sw action drop

Then it becomes possible to configure a forward rule from a VF port representor to uplink instead:

    tc filter add dev <vf_ifname> ingress protocol lldp flower skip_sw action mirred egress redirect dev <ifname>

Previously, LLDP exclusivity was achieved by blocking LLDP traffic for a whole port with a single rule, which the PF bypassed. Now, at least in switchdev mode, every separate VSI has to have its own drop rule.

Another complication is that tc does not respect the driver's refusal to delete a rule, so returning an error results in a HW rule that is still present with no way to reference it through tc. This is addressed by allowing the PF rule to be deleted at any time, but making the VF forward rule "dormant" in that case: it is deleted from HW but stays in tc and in the driver's bookkeeping, to be restored when the drop rule is added back to the PF.

Implement tc configuration handling which enables the user to transmit LLDP packets from VF instead of PF.

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: support egress drop rules on PF | Larysa Zaremba
The tc clsact qdisc allows us to add offloaded egress rules with commands such as the following one:

    tc filter add dev <ifname> egress protocol lldp flower skip_sw action drop

Support the egress rule drop action when added to PF, with a few caveats:
* in switchdev mode, all PF traffic has to go uplink, with an exception for LLDP, which can be delegated to a single VSI at a time
* in legacy mode, we cannot delegate LLDP functionality to another VSI, so such packets from PF should not be blocked.

Also, simplify the rule direction logic. It was previously derived from actions, but can actually be inherited from the tc block (and flipped in the case of port representors).

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: remove headers argument from ice_tc_count_lkups | Larysa Zaremba
Remove the headers argument from the ice_tc_count_lkups() function, because it is not used anywhere. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: receive LLDP on trusted VFs | Mateusz Pacuszka
When a trusted VF tries to configure an LLDP multicast address, configure a rule that mirrors the traffic to this VF. Untrusted VFs are not allowed to receive LLDP at all, so the request to add an LLDP MAC address will always fail for them. Add a forwarding LLDP filter to a trusted VF when it tries to add an LLDP multicast MAC address. The MAC address has to be added after enabling trust (through restarting the LLDP service). Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Co-developed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | ice: do not add LLDP-specific filter if not necessary | Larysa Zaremba
Commit 34295a3696fb ("ice: implement new LLDP filter command") introduced the ability to use an LLDP-specific filter that directs all LLDP traffic to a single VSI. However, the current goal is for all trusted VFs to be able to see LLDP neighbors, which is impossible to do with the special filter. Make using the generic filter the default choice and fall back to the special one only if a generic filter cannot be added. That way setups with "NVMs where an already existent LLDP filter is blocking the creation of a filter to allow LLDP packets" will still be able to configure software Rx LLDP on PF only, while all other setups would be able to forward them to VFs too. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
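The "generic first, special as fallback" ordering described in this commit message can be sketched roughly as follows. This is a standalone illustration, not the driver's code: `struct fake_hw`, `add_generic_filter()`, `add_special_filter()`, `cfg_lldp_rx()` and the chosen error codes are all hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for which kind of LLDP Rx filter ended up installed. */
enum lldp_filter_kind {
	LLDP_FILTER_NONE,
	LLDP_FILTER_GENERIC,
	LLDP_FILTER_SPECIAL,
};

/* Hypothetical model of the hardware/NVM state relevant to this sketch. */
struct fake_hw {
	bool generic_blocked;	/* an NVM-provided LLDP filter blocks generic rules */
	bool special_supported;	/* the LLDP-specific filter command is available */
};

static int add_generic_filter(const struct fake_hw *hw)
{
	return hw->generic_blocked ? -EIO : 0;	/* error code is an assumption */
}

static int add_special_filter(const struct fake_hw *hw)
{
	return hw->special_supported ? 0 : -EOPNOTSUPP;
}

/* Try the generic filter first; use the special one only if that fails. */
int cfg_lldp_rx(const struct fake_hw *hw, enum lldp_filter_kind *used)
{
	int err;

	assert(hw && used);
	*used = LLDP_FILTER_NONE;

	err = add_generic_filter(hw);
	if (!err) {
		*used = LLDP_FILTER_GENERIC;	/* can later be forwarded to VFs too */
		return 0;
	}

	err = add_special_filter(hw);
	if (!err)
		*used = LLDP_FILTER_SPECIAL;	/* PF-only software Rx LLDP */
	return err;
}
```

With this ordering, only setups where the generic rule cannot be created end up on the single-VSI special filter.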
2025-04-11 | ice: fix check for existing switch rule | Mateusz Pacuszka
If the rule already exists and another VSI wants to subscribe to it, a new VSI list is created and both VSIs are moved to it. Currently, the check for an already existing VSI with the same rule is based on fwd_id.hw_vsi_id, which applies only to the LOOKUP_RX flag. Change it to vsi_handle. This is the software VSI ID, but it can be applied here, because vsi_map itself is also based on it. Additionally, change the return status to "Already exists" when the VSI is already present in the VSI map. Such a case should be handled by the caller. Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
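As a rough illustration of checking membership by software VSI handle and reporting "already exists" to the caller, consider the sketch below. The map layout, `MAX_VSI`, `rule_subscribe()` and the use of `-EEXIST` are assumptions made for this example; the driver's real structures are not shown in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_VSI 16	/* hypothetical limit for this sketch */

/* Hypothetical per-rule subscription map, indexed by software VSI handle. */
struct rule_vsi_map {
	bool subscribed[MAX_VSI];
};

/*
 * Subscribe a VSI to a rule. A VSI that is already in the map is reported
 * with -EEXIST so that the caller can decide whether that is an error.
 */
int rule_subscribe(struct rule_vsi_map *map, unsigned int vsi_handle)
{
	assert(map);

	if (vsi_handle >= MAX_VSI)
		return -EINVAL;
	if (map->subscribed[vsi_handle])
		return -EEXIST;

	map->subscribed[vsi_handle] = true;
	return 0;
}
```

Keying both the lookup and the map on the same identifier is the point: the check cannot miss an entry merely because the rule direction makes a different ID field meaningful.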
2025-04-11 | igc: add lock preventing multiple simultaneous PTM transactions | Christopher S M Hall
Add a mutex around the PTM transaction to prevent multiple simultaneous transactors.

If multiple processes try to initiate a PTM transaction, one or all may fail. This can be reproduced by running two instances of the following:

    $ sudo phc2sys -O 0 -i tsn0 -m

PHC2SYS exits with:

    "ioctl PTP_OFFSET_PRECISE: Connection timed out"

when the PTM transaction fails.

Note: Normally two instances of PHC2SYS will not run, but one process should not break another.

Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | igc: cleanup PTP module if probe fails | Christopher S M Hall
Make sure that the PTP module is cleaned up if igc_probe() fails by calling igc_ptp_stop() on exit. Fixes: d89f88419f99 ("igc: Add skeletal frame for Intel(R) 2.5G Ethernet Controller support") Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com> Reviewed-by: Corinna Vinschen <vinschen@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | igc: handle the IGC_PTP_ENABLED flag correctly | Christopher S M Hall
All functions in igc_ptp.c called from igc_main.c should check the IGC_PTP_ENABLED flag. Add a check for this flag to the stop and reset functions. Fixes: 5f2958052c58 ("igc: Add basic skeleton for PTP") Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com> Reviewed-by: Corinna Vinschen <vinschen@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | igc: move ktime snapshot into PTM retry loop | Christopher S M Hall
Move ktime_get_snapshot() into the loop. If a retry does occur, a more recent snapshot will result in a more accurate cross-timestamp. Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com> Reviewed-by: Corinna Vinschen <vinschen@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
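The effect of taking the snapshot inside the retry loop can be shown with a small userspace simulation. Everything here is a hypothetical stand-in (`struct sim`, `take_snapshot()`, `do_transaction()`, `MAX_TRIES`, the 10-unit cost per attempt); it only demonstrates that a retried attempt pairs with a snapshot taken just before it, not with one from before the first attempt.

```c
#include <assert.h>
#include <errno.h>

#define MAX_TRIES 3	/* hypothetical retry budget */

/* Hypothetical simulation state: a clock and a transaction that may fail. */
struct sim {
	long now;	/* simulated system time */
	int fail_first;	/* number of attempts that fail before one succeeds */
};

static long take_snapshot(const struct sim *s)
{
	return s->now;
}

static int do_transaction(struct sim *s)
{
	s->now += 10;	/* every attempt consumes simulated time */
	if (s->fail_first > 0) {
		s->fail_first--;
		return -ETIMEDOUT;
	}
	return 0;
}

/* Take a fresh snapshot before each attempt, so a retry never reuses a stale one. */
int cross_timestamp(struct sim *s, long *snapshot)
{
	int err = -ETIMEDOUT;

	assert(s && snapshot);
	for (int i = 0; i < MAX_TRIES; i++) {
		*snapshot = take_snapshot(s);
		err = do_transaction(s);
		if (!err)
			break;
	}
	return err;
}
```

Had the snapshot been taken once before the loop, a success on the third attempt would be paired with a snapshot that is two attempts old.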
2025-04-11 | igc: increase wait time before retrying PTM | Christopher S M Hall
The i225/i226 hardware retries if it receives an inappropriate response from the upstream device. If the device retries too quickly, the root port does not respond.

The wait between attempts was reduced from 10us to 1us in commit 6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to 1 us"), which said:

    With the 10us interval, we were seeing PTM transactions take around 12us. Hardware team suggested this interval could be lowered to 1us which was confirmed with PCIe sniffer. With the 1us interval, PTM dialogs took around 2us.

While a 1us short cycle time was thought to be theoretically sufficient, it turns out in practice it is not quite long enough. It is unclear if the problem is in the root port or an issue in i225/i226.

Increase the wait from 1us to 4us. Increasing to 2us appeared to work in practice on the setups we have available. A value of 4us was chosen due to the limited hardware available for testing, with a goal of ensuring we wait long enough without overly penalizing the response time when unnecessary.

The issue can be reproduced with the following:

    $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m

Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to quickly reproduce the issue.

PHC2SYS exits with:

    "ioctl PTP_OFFSET_PRECISE: Connection timed out"

when the PTM transaction fails.

Fixes: 6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to 1 us")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | igc: fix PTM cycle trigger logic | Christopher S M Hall
Writing to clear the PTM status 'valid' bit while the PTM cycle is triggered results in unreliable PTM operation. To fix this, clear the PTM 'trigger' and status after each PTM transaction.

The issue can be reproduced with the following:

    $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m

Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to quickly reproduce the issue.

PHC2SYS exits with:

    "ioctl PTP_OFFSET_PRECISE: Connection timed out"

when the PTM transaction fails.

This patch also fixes a hang in igc_probe() when loading the igc driver in the kdump kernel on systems supporting PTM.

The igc driver running in the base kernel enables PTM trigger in igc_probe(). Therefore the driver is always in PTM trigger mode, except in brief periods when manually triggering a PTM cycle.

When a crash occurs, the NIC is reset while PTM trigger is enabled. Due to a hardware problem, the NIC is subsequently in a bad busmaster state and doesn't handle register reads/writes. When running igc_probe() in the kdump kernel, the first register access to a NIC register hangs driver probing and ultimately breaks kdump.

With this patch, igc has PTM trigger disabled most of the time, and the trigger is only enabled for very brief (10 - 100 us) periods when manually triggering a PTM cycle. Chances that a crash occurs during a PTM trigger are not 0, but extremely reduced.

Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Christopher S M Hall <christopher.s.hall@intel.com>
Reviewed-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-11 | net: stmmac: stm32: simplify clock handling | Russell King (Oracle)
Some stm32 implementations need the receive clock running in suspend, as indicated by dwmac->ops->clk_rx_enable_in_suspend. The existing code achieved this in a rather complex way, by passing a flag around. However, the clk API prepare/enables are counted, which means that a clock won't be stopped as long as there are more prepares and enables than disables and unprepares, just like a reference count. Therefore, we can simplify this logic by calling clk_prepare_enable() an additional time in the probe function if this flag is set, and then balancing that at remove time. With this, we can avoid passing an "are we suspending" and "are we resuming" flag to various functions in the driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
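The counting argument can be illustrated with a toy userspace model. This is not the clk API: `struct fake_clk` and the `fake_*` functions are hypothetical, and the exact places where the real driver enables and disables its clocks are assumed for the sake of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counted clock: it runs while enable_count is above zero. */
struct fake_clk {
	int enable_count;
};

static void fake_clk_enable(struct fake_clk *c)
{
	c->enable_count++;
}

static void fake_clk_disable(struct fake_clk *c)
{
	assert(c->enable_count > 0);	/* enables and disables must balance */
	c->enable_count--;
}

bool fake_clk_running(const struct fake_clk *c)
{
	return c->enable_count > 0;
}

/* Probe: take one extra enable if the clock must survive suspend. */
void fake_probe(struct fake_clk *rx, bool keep_in_suspend)
{
	fake_clk_enable(rx);
	if (keep_in_suspend)
		fake_clk_enable(rx);
}

/* Suspend and resume need no flag: they always drop or take one count. */
void fake_suspend(struct fake_clk *rx)
{
	fake_clk_disable(rx);
}

void fake_resume(struct fake_clk *rx)
{
	fake_clk_enable(rx);
}

/* Remove: balance the extra enable taken at probe time, if any. */
void fake_remove(struct fake_clk *rx, bool keep_in_suspend)
{
	if (keep_in_suspend)
		fake_clk_disable(rx);
	fake_clk_disable(rx);
}
```

Because the extra count is held for the whole lifetime of the device, the suspend path stays identical for both kinds of hardware, and only probe and remove need to know about the flag.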