summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-15selftests: mptcp: simult_flows: unify errors msgsMatthieu Baerts (NGI0)
In order to unify what is printed in case of error, similar to what is done in mptcp_connect.sh and mptcp_join.sh, it is interesting to do the following modifications in simult_flows.sh: - Print the rc errors at the end of the line. - Print the MIB counters. - Use the same ss options: add -M (MPTCP sockets) and -e (detailed socket information). While at it, also print of the 'max' time only in case of success, because 'mptcp_connect.c' will already print this info in case of error, e.g.: transfer slower than expected! runtime 11948 ms, expected 11921 ms Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-1-2ffb16a6cf35@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15mptcp: fix for setting remote ipv4mapped addressGeliang Tang
Commit 1c670b39cec7 ("mptcp: change local addr type of subflow_destroy") introduced a bug in mptcp_pm_nl_subflow_destroy_doit(). ipv6_addr_set_v4mapped() should be called to set the remote ipv4 address 'addr_r.addr.s_addr' to the remote ipv6 address 'addr_r.addr6', not 'addr_l.addr.addr6', which is the local ipv6 address. Fixes: 1c670b39cec7 ("mptcp: change local addr type of subflow_destroy") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250114-net-next-mptcp-fix-remote-addr-v1-1-debcd84ea86f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15Merge branch 'net-bcm-asp2-fix-fallout-from-phylib-eee-changes'Jakub Kicinski
Russell King says: ==================== net: bcm: asp2: fix fallout from phylib EEE changes This series addresses the fallout from the phylib changes in the Broadcom ASP2 driver. The first patch uses phylib's copy of the LPI timer setting, which means the driver no longer has to track this. It will be set in hardware each time the adjust_link function is called when the link is up, and will be read at initialisation time to set the current value. The second patch removes the driver's storage of tx_lpi_enabled, which has become redundant since phylib managed EEE was merged. The driver does nothing with this flag other than storing it. The last patch converts the driver to use phylib's enable_tx_lpi flag rather than trying to maintain its own copy. ==================== Link: https://patch.msgid.link/Z4aV3RmSZJ1WS3oR@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15net: bcm: asp2: convert to phylib managed EEERussell King (Oracle)
Convert the Broadcom ASP2 driver to use phylib managed EEE support. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/E1tXk81-000r4x-TS@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15net: bcm: asp2: remove tx_lpi_enabledRussell King (Oracle)
Phylib maintains a copy of tx_lpi_enabled, which will be used to populate the member when phy_ethtool_get_eee(). Therefore, writing to this member before phy_ethtool_get_eee() will have no effect. Remove it. Also remove setting our copy of info->eee.tx_lpi_enabled which becomes write-only. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/E1tXk7w-000r4r-Pq@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15net: bcm: asp2: fix LPI timer handlingRussell King (Oracle)
Fix the LPI timer handling in Broadcom ASP2 driver after the phylib managed EEE patches were merged. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/E1tXk7r-000r4l-Li@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15selftests: net: Adapt ethtool mq tests to fix in qdisc graftVictor Nogueira
Because of patch[1] the graft behaviour changed So the command: tcq replace parent 100:1 handle 204: Is no longer valid and will not delete 100:4 added by command: tcq replace parent 100:4 handle 204: pfifo_fast So to maintain the original behaviour, this patch manually deletes 100:4 and grafts 100:1 Note: This change will also work fine without [1] [1] https://lore.kernel.org/netdev/20250111151455.75480-1-jhs@mojatatu.com/T/#u Signed-off-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-14net: fec: handle page_pool_dev_alloc_pages errorKevin Groeneveld
The fec_enet_update_cbd function calls page_pool_dev_alloc_pages but did not handle the case when it returned NULL. There was a WARN_ON(!new_page) but it would still proceed to use the NULL pointer and then crash. This case does seem somewhat rare but when the system is under memory pressure it can happen. One case where I can duplicate this with some frequency is when writing over a smbd share to a SATA HDD attached to an imx6q. Setting /proc/sys/vm/min_free_kbytes to higher values also seems to solve the problem for my test case. But it still seems wrong that the fec driver ignores the memory allocation error and can crash. This commit handles the allocation error by dropping the current packet. Fixes: 95698ff6177b5 ("net: fec: using page pool to manage RX buffers") Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20250113154846.1765414-1-kgroeneveld@lenbrook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14Merge branch 'net-stmmac-further-eee-cleanups-and-one-fix'Jakub Kicinski
Russell King says: ==================== net: stmmac: further EEE cleanups (and one fix!) This series continues the EEE cleanup of the stmmac driver, and includes one fix. As mentioned in the previous series, I wasn't entirely happy with the "stmmac_disable_sw_eee_mode" name, so the first patch renames this to "stmmac_stop_sw_lpi" instead, which I think better describes what this function is doing - stopping the transmit of the LPI state because we have a packet ot send. Patch 2 corrects the priv->eee_sw_timer_en flag when EEE has been disabled. Currently upon disable, priv->eee_enabled is set false, but through the weird logic that was present prior to the previous series, priv->eee_sw_timer_en was set true. This behaviour was kept as the previous series was cleanup, not fixes. This patch fixes this. Having fixed priv->eee_sw_timer_en to actually indicate whether software timed EEE mode is being used, it becomes no longer necessary to test priv->eee_enabled in addition. Patch 3 removes the redundant test. Patch 4 also uses priv->eee_sw_timer_en before manipulating the software EEE state in the suspend method rather than using priv->eee_enabled, which brings consistency. Patch 5 provides stmmac_try_to_start_sw_lpi() which complements stmmac_stop_sw_lpi(), and allows us to move duplicated code into one location. Patch 6 splits stmmac_enable_eee_mode() - one part of this function tests whether there are any queues that have unfinished work (in other words are busy). Separate out this code into a separate function. Patch 7 also splits out the mod_timer() for the software EEE timer intoi a seperate function (the reason will be in patch 9.) Patch 8 merges the remains of stmmac_enable_eee_mode() into stmmac_try_to_start_sw_lpi(). Patch 9 fixes the delay between transmit and entering LPI. Currently, when cleaning the transmit queues, if we discover that we have finished cleaning up all queues, we immediately instruct the hardware to enter LPI mode without waiting for the LPI timer. However, we should wait for the LPI timer to expire. Therefore, the transmit cleanup path needs to call stmmac_restart_sw_lpi_timer() instead of stmmac_try_to_start_sw_lpi(). ==================== Link: https://patch.msgid.link/Z4T84SbaC4D-fN5y@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: restart LPI timer after cleaning transmit descriptorsRussell King (Oracle)
Fix a bug in the LPI handling, where it is possible to immediately enter LPI mode after cleaning the transmit descriptors when all queues are empty rather than waiting for the LPI timeout to expire. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItg-000MBg-TW@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: combine stmmac_enable_eee_mode()Russell King (Oracle)
Combine stmmac_enable_eee_mode() with stmmac_try_to_start_sw_lpi() which makes the code easier to read and the flow more logical. We can now trivially see that if the transmit queues are busy, we (re-)start the eee_ctrl_timer. Otherwise, if the transmit path is not already in LPI mode, we ask the hardware to enter LPI mode. I believe that now we can see better what is going on here, this shows that there is a bug with the software LPI timer implementation. The LPI timer is supposed to define how long after the last transmittion completed before we start signalling LPI. However, this code structure shows that if all transmit queues are empty, and stmmac_try_to_start_sw_lpi() is called immediately after cleaning the transmit queue, we will instruct the hardware to start signalling LPI immediately. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItb-000MBa-OU@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: provide function for restarting sw LPI timerRussell King (Oracle)
Provide a function that encapsulates restarting the software LPI timer when we have determined that the transmit path is busy, or whether the EEE parameters have changed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItW-000MBU-KQ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: provide stmmac_eee_tx_busy()Russell King (Oracle)
Extract the code which checks whether there's still work to do on any of the stmmac transmit queues. This will allow us to combine stmmac_enable_eee_mode() with stmmac_try_to_start_sw_lpi() in the next patch. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItR-000MBO-GF@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: add stmmac_try_to_start_sw_lpi()Russell King (Oracle)
There are two places which call stmmac_enable_eee_mode() and follow it immediately by modifying the expiry of priv->eee_ctrl_timer. Both code paths are trying to enable LPI mode. Remove this duplication by providing a function for this. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItM-000MBI-CX@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: check priv->eee_sw_timer_en in suspend pathRussell King (Oracle)
The suspend path uses priv->eee_enabled when cleaning up the software timed LPI mode. Use priv->eee_sw_timer_en instead so we're consistently using a single control for software-based timer handling. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItH-000MBC-8i@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: simplify TX cleanup decision for ending sw LPI modeRussell King (Oracle)
As mentioned in "net: stmmac: correct priv->eee_sw_timer_en setting", we can simplify some fast-path tests. The transmit cleaning path checks whether EEE is enabled, the transmit path is not in LPI mode, and that we're using software timed mode. Since the above mentioned commit, checking whether EEE is enabled is no longer necessary as priv->eee_sw_timer_en will be false when EEE is disabled. Simplify this test. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXItC-000MB6-54@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: correct priv->eee_sw_timer_en settingRussell King (Oracle)
If we are disabling EEE/LPI, then we should not be enabling software mode. The only time when we should is if EEE is active, and we are wanting to use software-timed EEE mode. Therefore, in the disable path of stmmac_eee_init(), ensure that priv->eee_sw_timer_en is set false as we are going to be calling del_timer_sync() on the timer. This will allow us to simplify some fast-path tests in later patches. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXIt7-000MB0-0W@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: rename stmmac_disable_sw_eee_mode()Russell King (Oracle)
stmmac_disable_sw_eee_mode() was not a good choice for this functions purpose - which is to stop transmitting LPI because we want to send a packet. Rename it to stmmac_stop_sw_lpi(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tXIt1-000MAu-TE@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: ethernet: sunplus: Switch to ndo_eth_ioctl谢致邦 (XIE Zhibang)
The device ioctl handler no longer calls ndo_do_ioctl, but calls ndo_eth_ioctl to handle mii ioctls since commit a76053707dbf ("dev_ioctl: split out ndo_eth_ioctl"). However, sunplus still used ndo_do_ioctl when it was introduced. So switch to ndo_eth_ioctl. Bad commit fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021") was the initial driver commit, meaning that PHY IOCTLs where never available on this driver. Therefore don't consider this as a fix. Found by code inspection. Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com> Link: https://patch.msgid.link/tencent_8CF8A72C708E96B9C7DC1AF96FEE19AF3D05@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14Merge branch 'net-ethernet-simplify-few-things'Jakub Kicinski
Krzysztof Kozlowski says: ==================== net: ethernet: Simplify few things Few code simplifications without functional impact. Not tested on hardware. ==================== Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-0-3423889935f7@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: stm32: Use syscon_regmap_lookup_by_phandle_argsKrzysztof Kozlowski
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over syscon_regmap_lookup_by_phandle() combined with getting the syscon argument. Except simpler code this annotates within one line that given phandle has arguments, so grepping for code would be easier. There is also no real benefit in printing errors on missing syscon argument, because this is done just too late: runtime check on static/build-time data. Dtschema and Devicetree bindings offer the static/build-time check for this already. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-5-3423889935f7@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: sti: Use syscon_regmap_lookup_by_phandle_argsKrzysztof Kozlowski
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over syscon_regmap_lookup_by_phandle() combined with getting the syscon argument. Except simpler code this annotates within one line that given phandle has arguments, so grepping for code would be easier. There is also no real benefit in printing errors on missing syscon argument, because this is done just too late: runtime check on static/build-time data. Dtschema and Devicetree bindings offer the static/build-time check for this already. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-4-3423889935f7@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: stmmac: imx: Use syscon_regmap_lookup_by_phandle_argsKrzysztof Kozlowski
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over syscon_regmap_lookup_by_phandle() combined with getting the syscon argument. Except simpler code this annotates within one line that given phandle has arguments, so grepping for code would be easier. There is also no real benefit in printing errors on missing syscon argument, because this is done just too late: runtime check on static/build-time data. Dtschema and Devicetree bindings offer the static/build-time check for this already. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-3-3423889935f7@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: ti: am65-cpsw-nuss: Use syscon_regmap_lookup_by_phandle_argsKrzysztof Kozlowski
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over syscon_regmap_lookup_by_phandle() combined with getting the syscon argument. Except simpler code this annotates within one line that given phandle has arguments, so grepping for code would be easier. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-2-3423889935f7@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: ti: icssg-prueth: Do not print physical memory addressesKrzysztof Kozlowski
Debugging messages should not reveal anything about memory addresses. This also solves arm compile test warnings: drivers/net/ethernet/ti/icssg/icssg_prueth_sr1.c:1034:49: error: format specifies type 'unsigned long long' but the argument has type 'phys_addr_t' (aka 'unsigned int') [-Werror,-Wformat] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-1-3423889935f7@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: netpoll: ensure skb_pool list is always initializedJohn Sperbeck
When __netpoll_setup() is called directly, instead of through netpoll_setup(), the np->skb_pool list head isn't initialized. If skb_pool_flush() is later called, then we hit a NULL pointer in skb_queue_purge_reason(). This can be seen with this repro, when CONFIG_NETCONSOLE is enabled as a module: ip tuntap add mode tap tap0 ip link add name br0 type bridge ip link set dev tap0 master br0 modprobe netconsole netconsole=4444@10.0.0.1/br0,9353@10.0.0.2/ rmmod netconsole The backtrace is: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page ... ... ... Call Trace: <TASK> __netpoll_free+0xa5/0xf0 br_netpoll_cleanup+0x43/0x50 [bridge] do_netpoll_cleanup+0x43/0xc0 netconsole_netdev_event+0x1e3/0x300 [netconsole] unregister_netdevice_notifier+0xd9/0x150 cleanup_module+0x45/0x920 [netconsole] __se_sys_delete_module+0x205/0x290 do_syscall_64+0x70/0x150 entry_SYSCALL_64_after_hwframe+0x76/0x7e Move the skb_pool list setup and initial skb fill into __netpoll_setup(). Fixes: 221a9c1df790 ("net: netpoll: Individualize the skb pool") Signed-off-by: John Sperbeck <jsperbeck@google.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250114011354.2096812-1-jsperbeck@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: xilinx: axienet: Fix IRQ coalescing packet count overflowSean Anderson
If coalesce_count is greater than 255 it will not fit in the register and will overflow. This can be reproduced by running # ethtool -C ethX rx-frames 256 which will result in a timeout of 0us instead. Fix this by checking for invalid values and reporting an error. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://patch.msgid.link/20250113163001.2335235-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14socket: Remove unused kernel_sendmsg_lockedDr. David Alan Gilbert
The last use of kernel_sendmsg_locked() was removed in 2023 by commit dc97391e6610 ("sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250112131318.63753-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: phy: Constify struct mdio_device_idChristophe JAILLET
'struct mdio_device_id' is not modified in these drivers. Constifying these structures moves some data to a read-only section, so increase overall security. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 27014 12792 0 39806 9b7e drivers/net/phy/broadcom.o After: ===== text data bss dec hex filename 27206 12600 0 39806 9b7e drivers/net/phy/broadcom.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/403c381b7d9156b67ad68ffc44b8eee70c5e86a9.1736691226.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14nfp: bpf: prevent integer overflow in nfp_bpf_event_output()Dan Carpenter
The "sizeof(struct cmsg_bpf_event) + pkt_size + data_size" math could potentially have an integer wrapping bug on 32bit systems. Check for this and return an error. Fixes: 9816dd35ecec ("nfp: bpf: perf event output helpers support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/6074805b-e78d-4b8a-bf05-e929b5377c28@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14Merge branch 'net-phy-realtek-add-hwmon-support'Jakub Kicinski
Heiner Kallweit says: ==================== net: phy: realtek: add hwmon support This adds hwmon support for the temperature sensor on RTL822x. It's available on the standalone versions of the PHY's, and on the internal PHY's of RTL8125B(P)/RTL8125D/RTL8126. ==================== Link: https://patch.msgid.link/7319d8f9-2d6f-4522-92e8-a8a4990042fb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: phy: realtek: add hwmon support for temp sensor on RTL822xHeiner Kallweit
This adds hwmon support for the temperature sensor on RTL822x. It's available on the standalone versions of the PHY's, and on the integrated PHY's in RTL8125B/RTL8125D/RTL8126. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/ad6bfe9f-6375-4a00-84b4-bfb38a21bd71@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: phy: move realtek PHY driver to its own subdirectoryHeiner Kallweit
In preparation of adding a source file with hwmon support, move the Realtek PHY driver to its own subdirectory and rename realtek.c to realtek_main.c. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/c566551b-c915-4e34-9b33-129a6ddd6e4c@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: phy: realtek: add support for reading MDIO_MMD_VEND2 regs on ↵Heiner Kallweit
RTL8125/RTL8126 RTL8125/RTL8126 don't support MMD access to the internal PHY, but provide a mechanism to access at least all MDIO_MMD_VEND2 registers. By exposing this mechanism standard MMD access functions can be used to access the MDIO_MMD_VEND2 registers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/e821b302-5fe6-49ab-aabd-05da500581c0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: airoha: Enforce ETS Qdisc priomapLorenzo Bianconi
EN7581 SoC supports fixed QoS band priority where WRR queues have lowest priorities with respect to SP ones. E.g: WRR0, WRR1, .., WRRm, SP0, SP1, .., SPn Enforce ETS Qdisc priomap according to the hw capabilities. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20250112-airoha_ets_priomap-v1-1-fb616de159ba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: ethernet: ti: am65-cpsw: VLAN-aware CPSW only if !DSAAlexander Sverdlin
Only configure VLAN-aware CPSW mode if no port is used as DSA CPU port. VLAN-aware mode interferes with some DSA tagging schemes and makes stacking DSA switches downstream of CPSW impossible. Previous attempts to address the issue linked below. Link: https://lore.kernel.org/netdev/20240227082815.2073826-1-s-vadapalli@ti.com/ Link: https://lore.kernel.org/linux-arm-kernel/4699400.vD3TdgH1nR@localhost/ Co-developed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://patch.msgid.link/20250110125737.546184-1-alexander.sverdlin@siemens.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14tsnep: Link queues to NAPIsGerhard Engleder
Use netif_queue_set_napi() to link queues to NAPI instances so that they can be queried with netlink. $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump queue-get --json='{"ifindex": 11}' [{'id': 0, 'ifindex': 11, 'napi-id': 9, 'type': 'rx'}, {'id': 1, 'ifindex': 11, 'napi-id': 10, 'type': 'rx'}, {'id': 0, 'ifindex': 11, 'napi-id': 9, 'type': 'tx'}, {'id': 1, 'ifindex': 11, 'napi-id': 10, 'type': 'tx'}] Additionally use netif_napi_set_irq() to also provide NAPI interrupt number to userspace. $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --do napi-get --json='{"id": 9}' {'defer-hard-irqs': 0, 'gro-flush-timeout': 0, 'id': 9, 'ifindex': 11, 'irq': 42, 'irq-suspend-timeout': 0} Providing information about queues to userspace makes sense as APIs like XSK provide queue specific access. Also XSK busy polling relies on queues linked to NAPIs. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250110223939.37490-1-gerhard@engleder-embedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14ice: Add in/out PTP pin delaysKarol Kolacinski
HW can have different input/output delays for each of the pins. Currently, only E82X adapters have delay compensation based on TSPLL config and E810 adapters have constant 1 ms compensation, both cases only for output delays and the same one for all pins. E825 adapters have different delays for SDP and other pins. Those delays are also based on direction and input delays are different than output ones. This is the main reason for moving delays to pin description structure. Add a field in ice_ptp_pin_desc structure to reflect that. Delay values are based on approximate calculations of HW delays based on HW spec. Implement external timestamp (input) delay compensation. Remove existing definitions and wrappers for periodic output propagation delays. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: implement low latency PHY timer updatesJacob Keller
Programming the PHY registers in preparation for an increment value change or a timer adjustment on E810 requires issuing Admin Queue commands for each PHY register. It has been found that the firmware Admin Queue processing occasionally has delays of tens or rarely up to hundreds of milliseconds. This delay cascades to failures in the PTP applications which depend on these updates being low latency. Consider a standard PTP profile with a sync rate of 16 times per second. This means there is ~62 milliseconds between sync messages. A complete cycle of the PTP algorithm 1) Sync message (with Tx timestamp) from source 2) Follow-up message from source 3) Delay request (with Tx timestamp) from sink 4) Delay response (with Rx timestamp of request) from source 5) measure instantaneous clock offset 6) request time adjustment via CLOCK_ADJTIME systemcall The Tx timestamps have a default maximum timeout of 10 milliseconds. If we assume that the maximum possible time is used, this leaves us with ~42 milliseconds of processing time for a complete cycle. The CLOCK_ADJTIME system call is synchronous and will block until the driver completes its timer adjustment or frequency change. If the writes to prepare the PHY timers get hit by a latency spike of 50 milliseconds, then the PTP application will be delayed past the point where the next cycle should start. Packets from the next cycle may have already arrived and are waiting on the socket. In particular, LinuxPTP ptp4l may start complaining about missing an announce message from the source, triggering a fault. In addition, the clockcheck logic it uses may trigger. This clockcheck failure occurs because the timestamp captured by hardware is compared against a reading of CLOCK_MONOTONIC. It is assumed that the time when the Rx timestamp is captured and the read from CLOCK_MONOTONIC are relatively close together. This is not the case if there is a significant delay to processing the Rx packet. Newer firmware supports programming the PHY registers over a low latency interface which bypasses the Admin Queue. Instead, software writes to the REG_LL_PROXY_L and REG_LL_PROXY_H registers. Firmware reads these registers and then programs the PHY timers. Implement functions to use this interface when available to program the PHY timers instead of using the Admin Queue. This avoids the Admin Queue latency and ensures that adjustments happen within acceptable latency bounds. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: check low latency PHY timer update firmware capabilityJacob Keller
Newer versions of firmware support programming the PHY timer via the low latency interface exposed over REG_LL_PROXY_L and REG_LL_PROXY_H. Add support for checking the device capabilities for this feature. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: add lock to protect low latency interfaceJacob Keller
Newer firmware for the E810 devices support a 'low latency' interface to interact with the PHY without using the Admin Queue. This is interacted with via the REG_LL_PROXY_L and REG_LL_PROXY_H registers. Currently, this interface is only used for Tx timestamps. There are two different mechanisms, including one which uses an interrupt for firmware to signal completion. However, these two methods are mutually exclusive, so no synchronization between them was necessary. This low latency interface is being extended in future firmware to support also programming the PHY timers. Use of the interface for PHY timers will need synchronization to ensure there is no overlap with a Tx timestamp. The interrupt-based response complicates the locking somewhat. We can't use a simple spinlock. This would require being acquired in ice_ptp_req_tx_single_tstamp, and released in ice_ptp_complete_tx_single_tstamp. The ice_ptp_req_tx_single_tstamp function is called from the threaded IRQ, and the ice_ptp_complete_tx_single_stamp is called from the low latency IRQ, so we would need to acquire the lock with IRQs disabled. To handle this, we'll use a wait queue along with wait_event_interruptible_locked_irq in the update flows which don't use the interrupt. The interrupt flow will acquire the wait queue lock, set the ATQBAL_FLAGS_INTR_IN_PROGRESS, and then initiate the firmware low latency request, and unlock the wait queue lock. Upon receipt of the low latency interrupt, the lock will be acquired, the ATQBAL_FLAGS_INTR_IN_PROGRESS bit will be cleared, and the firmware response will be captured, and wake_up_locked() will be called on the wait queue. The other flows will use wait_event_interruptible_locked_irq() to wait until the ATQBAL_FLAGS_INTR_IN_PROGRESS is clear. This function checks the condition under lock, but does not hold the lock while waiting. On return, the lock is held, and a return of zero indicates we hold the lock and the in-progress flag is not set. This will ensure that threads which need to use the low latency interface will sleep until they can acquire the lock without any pending low latency interrupt flow interfering. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: rename TS_LL_READ* macros to REG_LL_PROXY_H_*Jacob Keller
The TS_LL_READ macros are used as part of the low latency Tx timestamp interface. A future firmware extension will add support for performing PHY timer updates over this interface. Using TS_LL_READ as the prefix for these macros will be confusing once the interface is used for other purposes. Rename the macros, using the prefix REG_LL_PROXY_H, to better clarify that this is for the low latency interface. Additionally add macros for PF_SB_ATQBAH and PF_SB_ATQBAL registers to better clarify content of this registers as PF_SB_ATQBAH contain low part of Tx timestamp Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: use read_poll_timeout_atomic in ice_read_phy_tstamp_ll_e810Jacob Keller
The ice_read_phy_tstamp_ll_e810 function repeatedly reads the PF_SB_ATQBAL register until the TS_LL_READ_TS bit is cleared. This is a perfect candidate for using rd32_poll_timeout. However, the default implementation uses a sleep-based wait. Use read_poll_timeout_atomic macro which is based on the non-sleeping implementation and use it to replace the loop reading in the ice_read_phy_tstamp_ll_e810 function. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: use string choice helpersR Sundar
Use string choice helpers for better readability. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202410121553.SRNFzc2M-lkp@intel.com/ Signed-off-by: R Sundar <prosunofficial@gmail.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: add fw and port health reportersKonrad Knitter
Firmware generates events for global events or port specific events. Driver shall subscribe for health status events from firmware on supported FW versions >= 1.7.6. Driver shall expose those under specific health reporter, two new reporters are introduced: - FW health reporter shall represent global events (problems with the image, recovery mode); - Port health reporter shall represent port-specific events (module failure). Firmware only reports problems when those are detected, it does not store active fault list. Driver will hold only last global and last port-specific event. Driver will report all events via devlink health report, so in case of multiple events of the same source they can be reviewed using devlink autodump feature. $ devlink health pci/0000:b1:00.3: reporter fw state healthy error 0 recover 0 auto_dump true reporter port state error error 1 recover 0 last_dump_date 2024-03-17 last_dump_time 09:29:29 auto_dump true $ devlink health diagnose pci/0000:b1:00.3 reporter port Syndrome: 262 Description: Module is not present. Possible Solution: Check that the module is inserted correctly. Port Number: 0 Tested on Intel Corporation Ethernet Controller E810-C for SFP Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Co-developed-by: Sharon Haroni <sharon.haroni@intel.com> Signed-off-by: Sharon Haroni <sharon.haroni@intel.com> Co-developed-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Konrad Knitter <konrad.knitter@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: add recipe priority check in searchMichal Swiatkowski
The new recipe should be added even if exactly the same recipe already exists with different priority. Example use case is when the rule is being added from TC tool context. It should has the highest priority, but if the recipe already exists the rule will inherit it priority. It can lead to the situation when the rule added from TC tool has lower priority than expected. The solution is to check the recipe priority when trying to find existing one. Previous recipe is still useful. Example: RID 8 -> priority 4 RID 10 -> priority 7 The difference is only in priority rest is let's say eth + mac + direction. Adding ARP + MAC_A + RX on RID 8, forward to VF0_VSI After that IP + MAC_B + RX on RID 10 (from TC tool), forward to PF0 Both will work. In case of adding ARP + MAC_A + RX on RID 8, forward to VF0_VSI ARP + MAC_A + RX on RID 10, forward to PF0. Only second one will match, but this is expected. Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: ice_probe: init ice_adapter after HW initPrzemek Kitszel
Move ice_adapter initialization to be after HW init, so it could use HW capabilities, like number of PFs. This is needed for devlink-resource based RSS LUT size management for PF/VF (not in this series). Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: minor: rename goto labels from err to unrollPrzemek Kitszel
Clean up goto labels after previous commit, to conform to single naming scheme in ice_probe() and ice_init_dev(). Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: split ice_init_hw() out from ice_init_dev()Przemek Kitszel
Split ice_init_hw() call out from ice_init_dev(). Such move enables pulling the former to be even earlier on call path, what would enable moving ice_adapter init to be between the two (in subsequent commit). Such move enables ice_adapter to know about number of PFs. Do the same for ice_deinit_hw(), so the init and deinit calls could be easily mirrored. Next commit will rename unrelated goto labels to unroll prefix. Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: c827: move wait for FW to ice_init_hw()Przemek Kitszel
Move call to ice_wait_for_fw() from ice_init_dev() into ice_init_hw(), where it fits better. This requires also to move ice_wait_for_fw() to ice_common.c. ice_is_pf_c827() is now used only in ice_common.c, so it could be static. CC: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>