linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-05-08	net: enetc: move generic VLAN hash filter functions to enetc_pf_common.c	Wei Fang
	The VLAN hash filters of ENETC v1 and v4 are basically the same, the only difference is that the offset of the VLAN hash filter registers has been changed in ENETC v4. So some functions like enetc_vlan_rx_add_vid() and enetc_vlan_rx_del_vid() only need to be slightly modified to be reused by ENETC v4. Currently, we just move these functions from enetc_pf.c to enetc_pf_common.c. Appropriate modifications will be made for ENETC4 in a subsequent patch. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-13-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: extract enetc_refresh_vlan_ht_filter()	Wei Fang
	Extract the common function enetc_refresh_vlan_ht_filter() from enetc_sync_vlan_ht_filter() so that it can be reused by the ENETC v4 PF and VF drivers in the future. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-12-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: enable RSS feature by default	Wei Fang
	Receive side scaling (RSS) is a network driver technology that enables the efficient distribution of network receive processing across multiple CPUs in multiprocessor systems. Therefore, it is better to enable RSS by default so that the CPU load can be balanced and network performance can be improved when then network is enabled. Signed-off-by: Wei Fang <wei.fang@nxp.com> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-11-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: change enetc_set_rss() to void type	Wei Fang
	Actually enetc_set_rss() does not need a return value, so change its type to void. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-10-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: add RSS support for i.MX95 ENETC PF	Wei Fang
	Compared with LS1028A, there are two main differences: first, i.MX95 ENETC uses NTMP 2.0 to manage the RSS table, and second, the offset of the RSS Key registers is different. Some modifications have been made in the previous patches based on these differences to ensure that the relevant interfaces are compatible with i.MX95. So it's time to add RSS support to i.MX95 ENETC PF. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-9-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: make enetc_set_rss_key() reusable	Wei Fang
	Since the offset of the RSS key registers of i.MX95 ENETC is different from that of LS1028A, so add enetc_get_rss_key_base() to get the base offset for the different chips, so that enetc_set_rss_key() can be reused for this trivial thing. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-8-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: add set/get_rss_table() hooks to enetc_si_ops	Wei Fang
	Since i.MX95 ENETC (v4) uses NTMP 2.0 to manage the RSS table, which is different from LS1028A ENETC (v1). In order to reuse some functions related to the RSS table, so add .get_rss_table() and .set_rss_table() hooks to enetc_si_ops. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-7-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: add debugfs interface to dump MAC filter	Wei Fang
	ENETC's MAC filter consists of hash MAC filter and exact MAC filter. Hash MAC filter is a 64-bit entry hash table consisting of two 32-bit registers. Exact MAC filter is implemented by configuring MAC address filter table through command BD ring. The table is stored in ENETC's internal memory and needs to be read through command BD ring. In order to facilitate debugging, added a debugfs interface to get the relevant information about MAC filter. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-6-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: add MAC filtering for i.MX95 ENETC PF	Wei Fang
	The i.MX95 ENETC supports both MAC hash filter and MAC exact filter. MAC hash filter is implenented through a 64-bit hash table to match against the hashed addresses, PF and VFs each have two MAC hash tables, one is for unicast and the other one is for multicast. But MAC exact filter is shared between SIs (PF and VFs), each table entry contains a MAC address that may be unicast or multicast and the entry also contains an SI bitmap field that indicates for which SIs the entry is valid. For i.MX95 ENETC, MAC exact filter only has 4 entries. According to the observation of the system default network configuration, the MAC filter will be configured with multiple multicast addresses, so MAC exact filter does not have enough entries to implement multicast filtering. Therefore, the current MAC exact filter is only used for unicast filtering. If the number of unicast addresses exceeds 4, then MAC hash filter is used. Note that both MAC hash filter and MAC exact filter can only be accessed by PF, VFs can notify PF to set its corresponding MAC filter through the mailbox mechanism of ENETC. But currently MAC filter is only added for i.MX95 ENETC PF. The MAC filter support of ENETC VFs will be supported in subsequent patches. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-5-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: move generic MAC filtering interfaces to enetc-core	Wei Fang
	Although only ENETC PF can access the MAC address filter table, the table entries can specify MAC address filtering for one or more SIs based on SI_BITMAP, which means that the table also supports MAC address filtering for VFs. Currently, only the ENETC v1 PF driver supports MAC address filtering. In order to add the MAC address filtering support for the ENETC v4 PF driver and VF driver in the future, the relevant generic interfaces are moved to the enetc-core driver. This lays the basis for i.MX95 ENETC PF and VFs to support MAC address filtering. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-4-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: add command BD ring support for i.MX95 ENETC	Wei Fang
	The command BD ring is used to configure functionality where the underlying resources may be shared between different entities or being too large to configure using direct registers (such as lookup tables). Because the command BD and table formats of i.MX95 and LS1028A are very different, the software processing logic is also different. So add enetc4_setup_cbdr() and enetc4_teardown_cbdr() for ENETC v4 drivers. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	net: enetc: add initial netc-lib driver to support NTMP	Wei Fang
	Some NETC functionality is controlled using control messages sent to the hardware using BD ring interface with 32B descriptor similar to transmit BD ring used on ENETC. This BD ring interface is referred to as command BD ring. It is used to configure functionality where the underlying resources may be shared between different entities or being too large to configure using direct registers. Therefore, a messaging protocol called NETC Table Management Protocol (NTMP) is provided for exchanging configuration and management information between the software and the hardware using the command BD ring interface. For the management protocol of LS1028A has been retroactively named NTMP 1.0, and its implementation is in enetc_cbdr.c and enetc_qos.c. However, NTMP of i.MX95 has been upgraded to version 2.0, which is incompatible with LS1028A, because the message formats have been changed. Therefore, add the netc-lib driver to support NTMP 2.0 to operate various tables. Note that, only MAC address filter table and RSS table are supported at the moment. More tables will be supported in subsequent patches. It is worth mentioning that the purpose of the netc-lib driver is to provide some NTMP-based generic interfaces for ENETC and NETC Switch drivers. Currently, it only supports the configurations of some tables. Interfaces such as tc flower and debugfs will be added in the future. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250506080735.3444381-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	selftests: net-drv: remove the nic_performance and nic_link_layer tests	Jakub Kicinski
	Revert fbbf93556f0c ("selftests: nic_performance: Add selftest for performance of NIC driver") Revert c087dc54394b ("selftests: nic_link_layer: Add selftest case for speed and duplex states") Revert 6116075e18f7 ("selftests: nic_link_layer: Add link layer selftest for NIC driver") These tests don't clean up after themselves, don't use the disruptive annotations, don't get included in make install etc. etc. The tests were added before we have any "HW" runner, so the issues were missed. Our CI doesn't have any way of excluding broken tests, remove these for now to stop the random pollution of results due to broken env. We can always add them back once / if fixed. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250507140109.929801-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	selftests: netfilter: fix conntrack stress test failures on debug kernels	Florian Westphal
	Jakub reports test failures on debug kernel: FAIL: proc inconsistency after uniq filter for ... This is because entries are expiring while validation is happening. Increase the timeout of ctnetlink injected entries and the icmp (ping) timeout to 1h to avoid this. To reduce run-time, add less entries via ctnetlink when KSFT_MACHINE_SLOW is set. also log of a failed run had: PASS: dump in netns had same entry count (-C 0, -L 0, -p 0, /proc 0) ... i.e. all entries already expired: add a check and set failure if this happens. While at it, include a diff when there were duplicate entries and add netns name to error messages (it tells if icmp or ctnetlink failed). Fixes: d33f889fd80c ("selftests: netfilter: add conntrack stress test") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20250506061125.1a244d12@kernel.org/ Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20250507075000.5819-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.15-rc6). No conflicts. Adjacent changes: net/core/dev.c: 08e9f2d584c4 ("net: Lock netdevices during dev_shutdown") a82dc19db136 ("net: avoid potential race between netdev_get_by_index_lock() and netns switch") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08	Merge tag 'net-6.15-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN, WiFi and netfilter. We have still a comple of regressions open due to the recent drivers locking refactor. The patches are in-flight, but not ready yet. Current release - regressions: - core: lock netdevices during dev_shutdown - sch_htb: make htb_deactivate() idempotent - eth: virtio-net: don't re-enable refill work too early Current release - new code bugs: - eth: icssg-prueth: fix kernel panic during concurrent Tx queue access Previous releases - regressions: - gre: fix again IPv6 link-local address generation. - eth: b53: fix learning on VLAN unaware bridges Previous releases - always broken: - wifi: fix out-of-bounds access during multi-link element defragmentation - can: - initialize spin lock on device probe - fix order of unregistration calls - openvswitch: fix unsafe attribute parsing in output_userspace() - eth: - virtio-net: fix total qstat values - mtk_eth_soc: reset all TX queues on DMA free - fbnic: firmware IPC mailbox fixes" * tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits) virtio-net: fix total qstat values net: export a helper for adding up queue stats fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready fbnic: Cleanup handling of completions fbnic: Actually flush_tx instead of stalling out fbnic: Add additional handling of IRQs fbnic: Gate AXI read/write enabling on FW mailbox fbnic: Fix initialization of mailbox descriptor rings net: dsa: b53: do not set learning and unicast/multicast on up net: dsa: b53: fix learning on VLAN unaware bridges net: dsa: b53: fix toggling vlan_filtering net: dsa: b53: do not program vlans when vlan filtering is off net: dsa: b53: do not allow to configure VLAN 0 net: dsa: b53: always rejoin default untagged VLAN on bridge leave net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave net: dsa: b53: fix flushing old pvid VLAN on pvid change net: dsa: b53: fix clearing PVID of a port net: dsa: b53: keep CPU port always tagged again ...
2025-05-08	Merge tag 's390-6.15-4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Fix potential use-after-free bug and missing error handling in PCI code - Fix dcssblk build error - Fix last breaking event handling in case of stack corruption to allow for better error reporting - Update defconfigs * tag 's390-6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs s390/pci: Fix missing check for zpci_create_device() error return s390: Update defconfigs s390/dcssblk: Fix build error with CONFIG_DAX=m and CONFIG_DCSSBLK=y s390/entry: Fix last breaking event handling in case of stack corruption s390/configs: Enable options required for TC flow offload s390/configs: Enable VDPA on Nvidia ConnectX-6 network card
2025-05-08	Merge tag 'v6.15-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd	Linus Torvalds
	Pull smb server fixes from Steve French: - Fix UAF closing file table (e.g. in tree disconnect) - Fix potential out of bounds write - Fix potential memory leak parsing lease state in open - Fix oops in rename with empty target * tag 'v6.15-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: Fix UAF in __close_file_table_ids ksmbd: prevent out-of-bounds stream writes by validating *pos ksmbd: fix memory leak in parse_lease_state() ksmbd: prevent rename with empty string
2025-05-08	Merge branch 'virtio-net-fix-total-qstat-values'	Paolo Abeni
	Jakub Kicinski says: ==================== virtio-net: fix total qstat values Another small fix discovered after we enabled virtio multi-queue in netdev CI. The queue stat test fails: # Exception\| Exception: Qstats are lower, fetched later not ok 3 stats.pkt_byte_sum The queue stats from disabled queues are supposed to be reported in the "base" stats. ==================== Link: https://patch.msgid.link/20250507003221.823267-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	virtio-net: fix total qstat values	Jakub Kicinski
	NIPA tests report that the interface statistics reported via qstat are lower than those reported via ip link. Looks like this is because some tests flip the queue count up and down, and we end up with some of the traffic accounted on disabled queues. Add up counters from disabled queues. Fixes: d888f04c09bb ("virtio-net: support queue stat") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250507003221.823267-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	net: export a helper for adding up queue stats	Jakub Kicinski
	Older drivers and drivers with lower queue counts often have a static array of queues, rather than allocating structs for each queue on demand. Add a helper for adding up qstats from a queue range. Expectation is that driver will pass a queue range [netdev->real_num_x_queues, MAX). It was tempting to always use num_x_queues as the end, but virtio seems to clamp its queue count after allocating the netdev. And this way we can trivaly reuse the helper for [0, real_..). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250507003221.823267-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	Merge branch 'fbnic-fw-ipc-mailbox-fixes'	Paolo Abeni
	Alexander Duyck says: ==================== fbnic: FW IPC Mailbox fixes This series is meant to address a number of issues that have been found in the FW IPC mailbox over the past several months. The main issues addressed are: 1. Resolve a potential race between host and FW during initialization that can cause the FW to only have the lower 32b of an address. 2. Block the FW from issuing DMA requests after we have closed the mailbox and before we have started issuing requests on it. 3. Fix races in the IRQ handlers that can cause the IRQ to unmask itself if it is being processed while we are trying to disable it. 4. Cleanup the Tx flush logic so that we actually lock down the Tx path before we start flushing it instead of letting it free run while we are shutting it down. 5. Fix several memory leaks that could occur if we failed initialization. 6. Cleanup the mailbox completion if we are flushing Tx since we are no longer processing Rx. 7. Move several allocations out of a potential IRQ/atomic context. There have been a few optimizations we also picked up since then. Rather than split them out I just folded them into these diffs. They mostly address minor issues such as how long it takes to initialize and/or fail so I thought they could probably go in with the rest of the patches. They consist of: 1. Do not sleep more than 20ms waiting on FW to respond as the 200ms value likely originated from simulation/emulation testing. 2. Use jiffies to determine timeout instead of sleep * attempts for better accuracy. Reviewed-by: Jakub Kicinski <kuba@kernel.org> ==================== Link: https://patch.msgid.link/174654659243.499179.11194817277075480209.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready	Alexander Duyck
	We had originally thought to have the mailbox go to ready in the background while we were doing other things. One issue with this though is that we can't disable it by clearing the ready state without also blocking interrupts or calls to mbx_poll as it will just pop back to life during an interrupt. In order to prevent that from happening we can pull the code for toggling to ready out of the interrupt path and instead place it in the fbnic_mbx_poll_tx_ready path so that it becomes the only spot where the Rx/Tx can toggle to the ready state. By doing this we can prevent races where we disable the DMA and/or free buffers only to have an interrupt fire and undo what we have done. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654722518.499179.11612865740376848478.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context	Alexander Duyck
	This change pulls the call to fbnic_fw_xmit_cap_msg out of fbnic_mbx_init_desc_ring and instead places it in the polling function for getting the Tx ready. Doing that we can avoid the potential issue with an interrupt coming in later from the firmware that causes it to get fired in interrupt context. Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654721876.499179.9839651602256668493.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready	Alexander Duyck
	There were a couple different issues found in fbnic_mbx_poll_tx_ready. Among them were the fact that we were sleeping much longer than we actually needed to as the actual FW could respond in under 20ms. The other issue was that we would just keep polling the mailbox even if the device itself had gone away. To address the responsiveness issues we can decrease the sleeps to 20ms and use a jiffies based timeout value rather than just counting the number of times we slept and then polled. To address the hardware going away we can move the check for the firmware BAR being present from where it was and place it inside the loop after the mailbox descriptor ring is initialized and before we sleep so that we just abort and return an error if the device went away during initialization. With these two changes we see a significant improvement in boot times for the driver. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654721224.499179.2698616208976624755.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Cleanup handling of completions	Alexander Duyck
	There was an issue in that if we were to shutdown we could be left with a completion in flight as the mailbox went away. To address that I have added an fbnic_mbx_evict_all_cmpl function that is meant to essentially create a "broken pipe" type response so that all callers will receive an error indicating that the connection has been broken as a result of us shutting down the mailbox. Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654720578.499179.380252598204530873.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Actually flush_tx instead of stalling out	Alexander Duyck
	The fbnic_mbx_flush_tx function had a number of issues. First, we were waiting 200ms for the firmware to process the packets. We can drop this to 20ms and in almost all cases this should be more than enough time. So by changing this we can significantly reduce shutdown time. Second, we were not making sure that the Tx path was actually shut off. As such we could still have packets added while we were flushing the mailbox. To prevent that we can now clear the ready flag for the Tx side and it should stay down since the interrupt is disabled. Third, we kept re-reading the tail due to the second issue. The tail should not move after we have started the flush so we can just read it once while we are holding the mailbox Tx lock. By doing that we are guaranteed that the value should be consistent. Fourth, we were keeping a count of descriptors cleaned due to the second and third issues called out. That count is not a valid reason to be exiting the cleanup, and with the tail only being read once we shouldn't see any cases where the tail moves after the disable so the tracking of count can be dropped. Fifth, we were using attempts * sleep time to determine how long we would wait in our polling loop to flush out the Tx. This can be very imprecise. In order to tighten up the timing we are shifting over to using a jiffies value of jiffies + 10 * HZ + 1 to determine the jiffies value we should stop polling at as this should be accurate within once sleep cycle for the total amount of time spent polling. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654719929.499179.16406653096197423749.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Add additional handling of IRQs	Alexander Duyck
	We have two issues that need to be addressed in our IRQ handling. One is the fact that we can end up double-freeing IRQs in the event of an exception handling error such as a PCIe reset/recovery that fails. To prevent that from becoming an issue we can use the msix_vector values to indicate that we have successfully requested/freed the IRQ by only setting or clearing them when we have completed the given action. The other issue is that we have several potential races in our IRQ path due to us manipulating the mask before the vector has been truly disabled. In order to handle that in the case of the FW mailbox we need to not auto-enable the IRQ and instead will be enabling/disabling it separately. In the case of the PCS vector we can mitigate this by unmapping it and synchronizing the IRQ before we clear the mask. The general order of operations after this change is now to request the interrupt, poll the FW mailbox to ready, and then enable the interrupt. For the shutdown we do the reverse where we disable the interrupt, flush any pending Tx, and then free the IRQ. I am renaming the enable/disable to request/free to be equivilent with the IRQ calls being used. We may see additions in the future to enable/disable the IRQs versus request/free them for certain use cases. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Fixes: 69684376eed5 ("eth: fbnic: Add link detection") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654719271.499179.3634535105127848325.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Gate AXI read/write enabling on FW mailbox	Alexander Duyck
	In order to prevent the device from throwing spurious writes and/or reads at us we need to gate the AXI fabric interface to the PCIe until such time as we know the FW is in a known good state. To accomplish this we use the mailbox as a mechanism for us to recognize that the FW has acknowledged our presence and is no longer sending any stale message data to us. We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function, disabling the DMA in both directions, and then invalidating all the descriptors in each ring. We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt is set by the FW we pick it up and mark the mailboxes as ready, while also enabling the DMA. Once we have completed all the transactions and need to shut down we call into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for each ring and shut down the DMA and once again invalidate the descriptors. Fixes: 3646153161f1 ("eth: fbnic: Add register init to set PCIe/Ethernet device config") Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654718623.499179.7445197308109347982.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Fix initialization of mailbox descriptor rings	Alexander Duyck
	Address to issues with the FW mailbox descriptor initialization. We need to reverse the order of accesses when we invalidate an entry versus writing an entry. When writing an entry we write upper and then lower as the lower 32b contain the valid bit that makes the entire address valid. However for invalidation we should write it in the reverse order so that the upper is marked invalid before we update it. Without this change we may see FW attempt to access pages with the upper 32b of the address set to 0 which will likely result in DMAR faults due to write access failures on mailbox shutdown. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654717972.499179.8083789731819297034.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-07	Merge branch 'net-dsa-b53-accumulated-fixes'	Jakub Kicinski
	Jonas Gorski says: ==================== net: dsa: b53: accumulated fixes This patchset aims at fixing most issues observed while running the vlan_unaware_bridge, vlan_aware_bridge and local_termination selftests. Most tests succeed with these patches on BCM53115, connected to a BCM6368. It took me a while to figure out that a lot of tests will fail if all ports have the same MAC address, as the switches drop any frames with DA == SA. Luckily BCM63XX boards often have enough MACs allocated for all ports, so I just needed to assign them. The still failing tests are: FDB learning, both vlan aware aware and unaware: This is expected, as b53 currently does not implement changing the ageing time, and both the bridge code and DSA ignore that, so the learned entries don't age out as expected. ping and ping6 in vlan unaware: These fail because of the now fixed learning, the switch trying to forward packet ingressing on one of the standalone ports to the learned port of the mac address when the packets ingressed on the bridged port. The port VLAN masks only prevent forwarding to other ports, but the ARL lookup will still happen, and the packet gets dropped because the port isn't allowed to forward there. I have a fix/workaround for that, but as it is a bit more controversial and makes use of an unrelated feature, I decided to hold off from that and post it later. This wasn't noticed so far, because learning was never working in VLAN unaware mode, so the traffic was always broadcast (which sidesteps the issue). Finally some of the multicast tests from local_termination fail, where the reception worked except it shouldn't. This doesn't seem to me as a super serious issue, so I didn't attempt to debug/fix these yet. I'm not super confident I didn't break sf2 along the way, but I did compile test and tried to find ways it cause issues (I failed to find any). I hope Florian will tell me. ==================== Link: https://patch.msgid.link/20250429201710.330937-1-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: do not set learning and unicast/multicast on up	Jonas Gorski
	When a port gets set up, b53 disables learning and enables the port for flooding. This can undo any bridge configuration on the port. E.g. the following flow would disable learning on a port: $ ip link add br0 type bridge $ ip link set sw1p1 master br0 <- enables learning for sw1p1 $ ip link set br0 up $ ip link set sw1p1 up <- disables learning again Fix this by populating dsa_switch_ops::port_setup(), and set up initial config there. Fixes: f9b3827ee66c ("net: dsa: b53: Support setting learning on port") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-12-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: fix learning on VLAN unaware bridges	Jonas Gorski
	When VLAN filtering is off, we configure the switch to forward, but not learn on VLAN table misses. This effectively disables learning while not filtering. Fix this by switching to forward and learn. Setting the learning disable register will still control whether learning actually happens. Fixes: dad8d7c6452b ("net: dsa: b53: Properly account for VLAN filtering") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-11-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: fix toggling vlan_filtering	Jonas Gorski
	To allow runtime switching between vlan aware and vlan non-aware mode, we need to properly keep track of any bridge VLAN configuration. Likewise, we need to know when we actually switch between both modes, to not have to rewrite the full VLAN table every time we update the VLANs. So keep track of the current vlan_filtering mode, and on changes, apply the appropriate VLAN configuration. Fixes: 0ee2af4ebbe3 ("net: dsa: set configure_vlan_while_not_filtering to true by default") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-10-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: do not program vlans when vlan filtering is off	Jonas Gorski
	Documentation/networking/switchdev.rst says: - with VLAN filtering turned off: the bridge is strictly VLAN unaware and its data path will process all Ethernet frames as if they are VLAN-untagged. The bridge VLAN database can still be modified, but the modifications should have no effect while VLAN filtering is turned off. This breaks if we immediately apply the VLAN configuration, so skip writing it when vlan_filtering is off. Fixes: 0ee2af4ebbe3 ("net: dsa: set configure_vlan_while_not_filtering to true by default") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-9-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: do not allow to configure VLAN 0	Jonas Gorski
	Since we cannot set forwarding destinations per VLAN, we should not have a VLAN 0 configured, as it would allow untagged traffic to work across ports on VLAN aware bridges regardless if a PVID untagged VLAN exists. So remove the VLAN 0 on join, an re-add it on leave. But only do so if we have a VLAN aware bridge, as without it, untagged traffic would become tagged with VID 0 on a VLAN unaware bridge. Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-8-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: always rejoin default untagged VLAN on bridge leave	Jonas Gorski
	While JOIN_ALL_VLAN allows to join all VLANs, we still need to keep the default VLAN enabled so that untagged traffic stays untagged. So rejoin the default VLAN even for switches with JOIN_ALL_VLAN support. Fixes: 48aea33a77ab ("net: dsa: b53: Add JOIN_ALL_VLAN support") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-7-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave	Jonas Gorski
	The untagged default VLAN is added to the default vlan, which may be one, but we modify the VLAN 0 entry on bridge leave. Fix this to use the correct VLAN entry for the default pvid. Fixes: fea83353177a ("net: dsa: b53: Fix default VLAN ID") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-6-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: fix flushing old pvid VLAN on pvid change	Jonas Gorski
	Presumably the intention here was to flush the VLAN of the old pvid, not the added VLAN again, which we already flushed before. Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-5-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: fix clearing PVID of a port	Jonas Gorski
	Currently the PVID of ports are only set when adding/updating VLANs with PVID set or removing VLANs, but not when clearing the PVID flag of a VLAN. E.g. the following flow $ ip link add br0 type bridge vlan_filtering 1 $ ip link set sw1p1 master bridge $ bridge vlan add dev sw1p1 vid 10 pvid untagged $ bridge vlan add dev sw1p1 vid 10 untagged Would keep the PVID set as 10, despite the flag being cleared. Fix this by checking if we need to unset the PVID on vlan updates. Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-4-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: keep CPU port always tagged again	Jonas Gorski
	The Broadcom management header does not carry the original VLAN tag state information, just the ingress port, so for untagged frames we do not know from which VLAN they originated. Therefore keep the CPU port always tagged except for VLAN 0. Fixes the following setup: $ ip link add br0 type bridge vlan_filtering 1 $ ip link set sw1p1 master br0 $ bridge vlan add dev br0 pvid untagged self $ ip link add sw1p2.10 link sw1p2 type vlan id 10 Where VID 10 would stay untagged on the CPU port. Fixes: 2c32a3d3c233 ("net: dsa: b53: Do not force CPU to be always tagged") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-3-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: dsa: b53: allow leaky reserved multicast	Jonas Gorski
	Allow reserved multicast to ignore VLAN membership so STP and other management protocols work without a PVID VLAN configured when using a vlan aware bridge. Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250429201710.330937-2-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	net: ibmveth: Refactored veth_pool_store for better maintainability	Dave Marquardt
	Make veth_pool_store detect requested pool changes, close device if necessary, update pool, and reopen device. Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250506160004.328347-1-davemarq@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	Merge branch 'netlink-specs-remove-phantom-structs'	Jakub Kicinski
	Jakub Kicinski says: ==================== netlink: specs: remove phantom structs rt-netlink and nl80211 have a few structs which may be helpful for Python decoding of binary attrs, but which don't actually exist in the C uAPI. This prevents us from using struct pointers for binary types in C. We could support this situation better in the codegen, or add these structs to uAPI. That said Johannes suggested we remove the WiFi structs for now, and the rt-link ones are semi-broken. Drop the struct definitions, for now, if someone has a need to use such structs in Python (as opposed to them being defined for completeness) we can revist. v1: https://lore.kernel.org/20250505170215.253672-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250506194101.696272-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	netlink: specs: rt-link: remove implicit structs from devconf	Jakub Kicinski
	devconf is even odder than SNMP. On input it reports an array of u32s which seem to be indexed by the enum values - 1. On output kernel expects a nest where each attr has the enum type as the nla type. sub-type: u32 is probably best we can do right now. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250506194101.696272-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	netlink: specs: remove implicit structs for SNMP counters	Jakub Kicinski
	uAPI doesn't define structs for the SNMP counters, just enums to index them as arrays. Switch to the same representation in the spec. C codegen will soon need all the struct types to actually exist. Note that the existing definition was broken, anyway, as the first member should be the number of counters reported. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250506194101.696272-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	netlink: specs: ovs: correct struct names	Jakub Kicinski
	C codegen will soon support using struct types for binary attrs. Correct the struct names in OvS specs. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250506194101.696272-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	netlink: specs: nl80211: drop structs which are not uAPI	Jakub Kicinski
	C codegen will soon use structs for binary types. A handful of structs in WiFi carry information elements from the wire, defined by the standard. The structs are not part of uAPI, so we can't use them in C directly. We could add them to the uAPI or add some annotation to tell the codegen to output a local version to the user header. The former seems arbitrary since we don't expose structs for most of the standard. The latter seems like a lot of work for a rare occurrence. Drop the struct info for now. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/004030652d592b379e730be2f0344bebc4a03475.camel@sipsolutions.net Link: https://patch.msgid.link/20250506194101.696272-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	Merge branch 'tools-ynl-gen-split-presence-metadata'	Jakub Kicinski
	Jakub Kicinski says: ==================== tools: ynl-gen: split presence metadata The presence metadata indicates whether given attribute was/should be added to the Netlink message. We have 3 types of such metadata: - bit presence for simple values like integers, - len presence for variable size attrs, like binary and strings, - count for arrays. Previously this information was spread around with first two types living in a dedicated sub-struct called _present. The counts resided directly in the main struct with an n_ prefix. Reshuffle these an uniformly store them in dedicated sub-structs. The immediate motivation is that current scheme causes name collisions for TC. ==================== Link: https://patch.msgid.link/20250505165208.248049-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07	tools: ynl-gen: move the count into a presence struct too	Jakub Kicinski
	While we reshuffle the presence members, move the counts as well. Previously array count members would have been place directly in the struct, so: struct family_op_req { struct { u32 a:1; u32 b:1; } _present; struct { u32 bin; } _len; u32 a; u64 b; const unsigned char bin; u32 n_multi; << count u32 multi; << objects }; Since len has been moved to its own presence struct move the count as well: struct family_op_req { struct { u32 a:1; u32 b:1; } _present; struct { u32 bin; } _len; struct { u32 multi; << count } _count; u32 a; u64 b; const unsigned char bin; u32 multi; << objects }; This improves the consistency and allows us to remove some hacks in the codegen. Unlike for len there is no known name collision with the existing scheme. Link: https://patch.msgid.link/20250505165208.248049-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>