summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/broadcom
AgeCommit message (Collapse)Author
2024-08-21bnxt_en: Fix double DMA unmapping for XDP_REDIRECTSomnath Kotur
Remove the dma_unmap_page_attrs() call in the driver's XDP_REDIRECT code path. This should have been removed when we let the page pool handle the DMA mapping. This bug causes the warning: WARNING: CPU: 7 PID: 59 at drivers/iommu/dma-iommu.c:1198 iommu_dma_unmap_page+0xd5/0x100 CPU: 7 PID: 59 Comm: ksoftirqd/7 Tainted: G W 6.8.0-1010-gcp #11-Ubuntu Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 2.15.2 04/02/2024 RIP: 0010:iommu_dma_unmap_page+0xd5/0x100 Code: 89 ee 48 89 df e8 cb f2 69 ff 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 e9 ab 17 71 00 <0f> 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 RSP: 0018:ffffab1fc0597a48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff99ff838280c8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffab1fc0597a78 R08: 0000000000000002 R09: ffffab1fc0597c1c R10: ffffab1fc0597cd3 R11: ffff99ffe375acd8 R12: 00000000e65b9000 R13: 0000000000000050 R14: 0000000000001000 R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffff9a06efb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000565c34c37210 CR3: 00000005c7e3e000 CR4: 0000000000350ef0 ? show_regs+0x6d/0x80 ? __warn+0x89/0x150 ? iommu_dma_unmap_page+0xd5/0x100 ? report_bug+0x16a/0x190 ? handle_bug+0x51/0xa0 ? exc_invalid_op+0x18/0x80 ? iommu_dma_unmap_page+0xd5/0x100 ? iommu_dma_unmap_page+0x35/0x100 dma_unmap_page_attrs+0x55/0x220 ? bpf_prog_4d7e87c0d30db711_xdp_dispatcher+0x64/0x9f bnxt_rx_xdp+0x237/0x520 [bnxt_en] bnxt_rx_pkt+0x640/0xdd0 [bnxt_en] __bnxt_poll_work+0x1a1/0x3d0 [bnxt_en] bnxt_poll+0xaa/0x1e0 [bnxt_en] __napi_poll+0x33/0x1e0 net_rx_action+0x18a/0x2f0 Fixes: 578fcfd26e2a ("bnxt_en: Let the page pool manage the DMA mapping") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240820203415.168178-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-16bnx2x: Set ivi->vlan field as an integerSimon Horman
In bnx2x_get_vf_config(): * The vlan field of ivi is a 32-bit integer, it is used to store a vlan ID. * The vlan field of bulletin is a 16-bit integer, it is also used to store a vlan ID. In the current code, ivi->vlan is set using memset. But in the case of setting it to the value of bulletin->vlan, this involves reading 32 bits from a 16bit source. This is likely safe, as the following 6 bytes are padding in the same structure, but none the less, it seems undesirable. However, it is entirely unclear to me how this scheme works on big-endian systems. Resolve this by simply assigning integer values to ivi->vlan. Flagged by W=1 builds. f.e. gcc-14 reports: In function 'fortify_memcpy_chk', inlined from 'bnx2x_get_vf_config' at .../bnx2x_sriov.c:2655:4: .../fortify-string.h:580:25: warning: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 580 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://patch.msgid.link/20240815-bnx2x-int-vlan-v1-1-5940b76e37ad@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-15bnxt_en: Don't clear ntuple filters and rss contexts during ethtool opsPavan Chebbi
The driver currently blindly deletes its cache of RSS cotexts and ntuple filters when the ethtool channel count is changing. It also deletes the ntuple filters cache when the default indirection table is changing. The core will not allow ethtool channels to drop below any that have been configured as ntuple destinations since this commit from 2022: 47f3ecf4763d ("ethtool: Fail number of channels change when it conflicts with rxnfc") So there is absolutely no need to delete the ntuple filters and RSS contexts when changing ethtool channels. It is also unnecessary to delete ntuple filters when the default RSS indirection table is changing. Remove bnxt_clear_usr_fltrs() and bnxt_clear_rss_ctxis() from the ethtool ops and change them to static functions. This bug will cause confusion to the end user and causes failure when running the rss_ctx.py selftest. Fixes: 1018319f949c ("bnxt_en: Invalidate user filters when needed") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20240725111912.7bc17cf6@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240814225429.199280-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14bnxt_en: avoid truncation of per rx run debugfs filenameSimon Horman
Although it seems unlikely in practice - there would need to be rx ring indexes greater than 10^10 - it is theoretically possible for the filename of per rx ring debugfs files to be truncated. This is because although a 16 byte buffer is provided, the length of the filename is restricted to 10 bytes. Remove this restriction and allow the entire buffer to be used. Also reduce the buffer to 12 bytes, which is sufficient. Given that the range of rx ring indexes likely much smaller than the maximum range of a 32-bit signed integer, a smaller buffer could be used, with some further changes. But this change seems simple, robust, and has minimal stack overhead. Flagged by gcc-14: .../bnxt_debugfs.c: In function 'bnxt_debug_dev_init': drivers/net/ethernet/broadcom/bnxt/bnxt_debugfs.c:69:30: warning: '%d' directive output may be truncated writing between 1 and 11 bytes into a region of size 10 [-Wformat-truncation=] 69 | snprintf(qname, 10, "%d", ring_idx); | ^~ In function 'debugfs_dim_ring_init', inlined from 'bnxt_debug_dev_init' at .../bnxt_debugfs.c:87:4: .../bnxt_debugfs.c:69:29: note: directive argument in the range [-2147483643, 2147483646] 69 | snprintf(qname, 10, "%d", ring_idx); | ^~~~ .../bnxt_debugfs.c:69:9: note: 'snprintf' output between 2 and 12 bytes into a destination of size 10 69 | snprintf(qname, 10, "%d", ring_idx); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Compile tested only Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240813-bnxt-str-v2-2-872050a157e7@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14bnxt_en: Extend maximum length of version string by 1 byteSimon Horman
This corrects an out-by-one error in the maximum length of the package version string. The size argument of snprintf includes space for the trailing '\0' byte, so there is no need to allow extra space for it by reducing the value of the size argument by 1. Found by inspection. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240813-bnxt-str-v2-1-872050a157e7@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-12ethtool: rss: don't report key if device doesn't support itJakub Kicinski
marvell/otx2 and mvpp2 do not support setting different keys for different RSS contexts. Contexts have separate indirection tables but key is shared with all other contexts. This is likely fine, indirection table is the most important piece. Don't report the key-related parameters from such drivers. This prevents driver-errors, e.g. otx2 always writes the main key, even when user asks to change per-context key. The second reason is that without this change tracking the keys by the core gets complicated. Even if the driver correctly reject setting key with rss_context != 0, change of the main key would have to be reflected in the XArray for all additional contexts. Since the additional contexts don't have their own keys not including the attributes (in Netlink speak) seems intuitive. ethtool CLI seems to deal with it just fine. Having to set the flag in majority of the drivers is a bit tedious but not reporting the key is a safer default. Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-12eth: remove .cap_rss_ctx_supported from updated driversJakub Kicinski
Remove .cap_rss_ctx_supported from drivers which moved to the new API. This makes it easy to grep for drivers which still need to be converted. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11bnxt_en: only set dev->queue_mgmt_ops if supported by FWDavid Wei
The queue API calls bnxt_hwrm_vnic_update() to stop/start the flow of packets, which can only properly flush the pipeline if FW indicates support. Add a macro BNXT_SUPPORTS_QUEUE_API that checks for the required flags and only set queue_mgmt_ops if true. Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11bnxt_en: stop packet flow during bnxt_queue_stop/startDavid Wei
The current implementation when resetting a queue while packets are flowing puts the queue into an inconsistent state. There needs to be some synchronisation with the FW. Add calls to bnxt_hwrm_vnic_update() to set the MRU for both the default and ntuple vnic during queue start/stop. When the MRU is set to 0, flow is stopped. Each Rx queue belongs to either the default or the ntuple vnic. With calling bnxt_hwrm_vnic_update() the calls to napi_enable() and napi_disable() must be removed for reset to work on a queue that has active traffic flowing e.g. iperf3. Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11bnxt_en: set vnic->mru in bnxt_hwrm_vnic_cfg()David Wei
Set the newly added vnic->mru field in bnxt_hwrm_vnic_cfg(). Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11bnxt_en: Check the FW's VNIC flush capabilityMichael Chan
Check the HWRM_VNIC_QCAPS FW response for the receive engine flush capability. This capability indicates that we can reliably support RX ring restart when calling HWRM_VNIC_UPDATE with MRU set to 0. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11bnxt_en: Add support to call FW to update a VNICMichael Chan
Add the function bnxt_hwrm_vnic_update() to call FW to update a VNIC. This call can be used when disabling and enabling a receive ring within a VNIC. The mru which is the maximum receive size of packets received by the VNIC can be updated. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-11bnxt_en: Update firmware interface to 1.10.3.68Michael Chan
The main changes are: 1. HWRM_VNIC_UPDATE used to safely disable and enable an RX ring within the VNIC. 2. New flag in HWRM_VNIC_QCAPS to indicate FW will do the proper flush during HWRM_VNIC_UPDATE. 3. New flag in HWRM_FUNC_QCAPS to indicate that reservations for some resources such as VNIC can be reduced. 4. New backing store memory types not used by the driver yet. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-08bnx2x: Provide declaration of dmae_reg_go_c in headerSimon Horman
Provide declaration of dmae_reg_go_c in header. This symbol is defined in bnx2x_main.c. And used in that file and bnx2x_stats.c. However, Sparse complains that there is no declaration of the symbol in dmae_reg_go_c nor is the symbol static. .../bnx2x_main.c:291:11: warning: symbol 'dmae_reg_go_c' was not declared. Should it be static? Address this by moving the declaration from bnx2x_stats.c to bnx2x_reg.h. No functional change intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240806-bnx2x-dec-v1-1-ae844ec785e4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Link: https://patch.msgid.link/20240808170148.3629934-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-08net: ethtool: fix off-by-one error in max RSS context IDsEdward Cree
Both ethtool_ops.rxfh_max_context_id and the default value used when it's not specified are supposed to be exclusive maxima (the former is documented as such; the latter, U32_MAX, cannot be used as an ID since it equals ETH_RXFH_CONTEXT_ALLOC), but xa_alloc() expects an inclusive maximum. Subtract one from 'limit' to produce an inclusive maximum, and pass that to xa_alloc(). Increase bnxt's max by one to prevent a (very minor) regression, as BNXT_MAX_ETH_RSS_CTX is an inclusive max. This is safe since bnxt is not actually hard-limited; BNXT_MAX_ETH_RSS_CTX is just a leftover from old driver code that managed context IDs itself. Rename rxfh_max_context_id to rxfh_max_num_contexts to make its semantics (hopefully) more obvious. Fixes: 847a8ab18676 ("net: ethtool: let the core choose RSS context IDs") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/5a2d11a599aa5b0cc6141072c01accfb7758650c.1723045898.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-08net: bcmgenet: Properly overlay PHY and MAC Wake-on-LAN capabilitiesFlorian Fainelli
Some Wake-on-LAN modes such as WAKE_FILTER may only be supported by the MAC, while others might be only supported by the PHY. Make sure that the .get_wol() returns the union of both rather than only that of the PHY if the PHY supports Wake-on-LAN. Fixes: 7e400ff35cbe ("net: bcmgenet: Add support for PHY-based Wake-on-LAN") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240806175659.3232204-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-07bnxt_en : Fix memory out-of-bounds in bnxt_fill_hw_rss_tbl()Michael Chan
A recent commit has modified the code in __bnxt_reserve_rings() to set the default RSS indirection table to default only when the number of RX rings is changing. While this works for newer firmware that requires RX ring reservations, it causes the regression on older firmware not requiring RX ring resrvations (BNXT_NEW_RM() returns false). With older firmware, RX ring reservations are not required and so hw_resc->resv_rx_rings is not always set to the proper value. The comparison: if (old_rx_rings != bp->hw_resc.resv_rx_rings) in __bnxt_reserve_rings() may be false even when the RX rings are changing. This will cause __bnxt_reserve_rings() to skip setting the default RSS indirection table to default to match the current number of RX rings. This may later cause bnxt_fill_hw_rss_tbl() to use an out-of-range index. We already have bnxt_check_rss_tbl_no_rmgr() to handle exactly this scenario. We just need to move it up in bnxt_need_reserve_rings() to be called unconditionally when using older firmware. Without the fix, if the TX rings are changing, we'll skip the bnxt_check_rss_tbl_no_rmgr() call and __bnxt_reserve_rings() may also skip the bnxt_set_dflt_rss_indir_tbl() call for the reason explained in the last paragraph. Without setting the default RSS indirection table to default, it causes the regression: BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 Read of size 2 at addr ffff8881c5809618 by task ethtool/31525 Call Trace: __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460 __bnxt_setup_vnic_p5+0x12e/0x270 __bnxt_open_nic+0x2262/0x2f30 bnxt_open_nic+0x5d/0xf0 ethnl_set_channels+0x5d4/0xb30 ethnl_default_set_doit+0x2f1/0x620 Reported-by: Breno Leitao <leitao@debian.org> Closes: https://lore.kernel.org/netdev/ZrC6jpghA3PWVWSB@gmail.com/ Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20240806053742.140304-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Link: https://patch.msgid.link/20240801131917.34494-1-pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-31net: cnic: Convert tasklet API to new bottom half workqueue mechanismAllen Pais
Migrate tasklet APIs to the new bottom half workqueue mechanism. It replaces all occurrences of tasklet usage with the appropriate workqueue APIs throughout the cnic driver. This transition ensures compatibility with the latest design and enhances performance. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Link: https://patch.msgid.link/20240730183403.4176544-4-allen.lkml@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-29eth: bnxt: populate defaults in the RSS context structJakub Kicinski
As described in the kdoc for .create_rxfh_context we are responsible for populating the defaults. The core will not call .get_rxfh for non-0 context. The problem can be easily observed since Netlink doesn't currently use the cache. Using netlink ethtool: $ ethtool -x eth0 context 1 [...] RSS hash key: 13:60:cd:60:14:d3:55:36:86:df:90:f2:96:14:e2:21:05:57:a8:8f:a5:12:5e:54:62:7f:fd:3c:15:7e:76:05:71:42:a2:9a:73:80:09:9c RSS hash function: toeplitz: on xor: off crc32: off But using IOCTL ethtool shows: $ ./ethtool-old -x eth0 context 1 [...] RSS hash key: 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 RSS hash function: Operation not supported Fixes: 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-29eth: bnxt: reject unsupported hash functionsJakub Kicinski
In commit under Fixes I split the bnxt_set_rxfh_context() function, and attached the appropriate chunks to new ops. I missed that bnxt_set_rxfh_context() gets called after some initial checks in bnxt_set_rxfh(), namely that the hash function is Toeplitz. Fixes: 5c466b4d4e75 ("eth: bnxt: move from .set_rxfh to .create_rxfh_context and friends") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-25bnxt_en: Fix RSS logic in __bnxt_reserve_rings()Pavan Chebbi
In __bnxt_reserve_rings(), the existing code unconditionally sets the default RSS indirection table to default if netif_is_rxfh_configured() returns false. This used to be correct before we added RSS contexts support. For example, if the user is changing the number of ethtool channels, we will enter this path to reserve the new number of rings. We will then set the RSS indirection table to default to cover the new number of rings if netif_is_rxfh_configured() is false. Now, with RSS contexts support, if the user has added or deleted RSS contexts, we may now enter this path to reserve the new number of VNICs. However, netif_is_rxfh_configured() will not return the correct state if we are still in the middle of set_rxfh(). So the existing code may set the indirection table of the default RSS context to default by mistake. Fix it to check if the reservation of the RX rings is changing. Only check netif_is_rxfh_configured() if it is changing. RX rings will not change in the middle of set_rxfh() and this will fix the issue. Fixes: b3d0083caf9a ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()") Reported-and-tested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/20240625010210.2002310-1-kuba@kernel.org Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240724222106.147744-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25Merge tag 'net-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ...
2024-07-25Merge tag 'driver-core-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) ARM: sa1100: make match function take a const pointer sysfs/cpu: Make crash_hotplug attribute world-readable dio: Have dio_bus_match() callback take a const * zorro: make match function take a const pointer driver core: module: make module_[add|remove]_driver take a const * driver core: make driver_find_device() take a const * driver core: make driver_[create|remove]_file take a const * firmware_loader: fix soundness issue in `request_internal` firmware_loader: annotate doctests as `no_run` devres: Correct code style for functions that return a pointer type devres: Initialize an uninitialized struct member devres: Fix memory leakage caused by driver API devm_free_percpu() devres: Fix devm_krealloc() wasting memory driver core: platform: Switch to use kmemdup_array() driver core: have match() callback in struct bus_type take a const * MAINTAINERS: add Rust device abstractions to DRIVER CORE device: rust: improve safety comments MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER firmware: rust: improve safety comments ...
2024-07-25bnxt_en: update xdp_rxq_info in queue restart logicTaehee Yoo
When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver updates(creates and deletes) a page_pool. But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still connected to an old page_pool. So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool. An old page_pool is no longer used so it is supposed to be deleted by page_pool_destroy() but it isn't. Because the xdp_rxq_info is holding the reference count for it and the xdp_rxq_info is not updated, an old page_pool will not be deleted in the queue restart logic. Before restarting 1 queue: ./tools/net/ynl/samples/page-pool enp10s0f1np1[6] page pools: 4 (zombies: 0) refs: 8192 bytes: 33554432 (refs: 0 bytes: 0) recycling: 0.0% (alloc: 128:8048 recycle: 0:0) After restarting 1 queue: ./tools/net/ynl/samples/page-pool enp10s0f1np1[6] page pools: 5 (zombies: 0) refs: 10240 bytes: 41943040 (refs: 0 bytes: 0) recycling: 20.0% (alloc: 160:10080 recycle: 1920:128) Before restarting queues, an interface has 4 page_pools. After restarting one queue, an interface has 5 page_pools, but it should be 4, not 5. The reason is that queue restarting logic creates a new page_pool and an old page_pool is not deleted due to the absence of an update of xdp_rxq_info logic. Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: David Wei <dw@davidwei.uk> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20240721053554.1233549-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15Merge branch 'net-make-timestamping-selectable'Jakub Kicinski
First part of "net: Make timestamping selectable" from Kory Maincent. Change the driver-facing type already to lower rebasing pain. Link: https://lore.kernel.org/20240709-feature_ptp_netnext-v17-0-b5317f50df2a@bootlin.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-15net: Add struct kernel_ethtool_ts_infoKory Maincent
In prevision to add new UAPI for hwtstamp we will be limited to the struct ethtool_ts_info that is currently passed in fixed binary format through the ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code already started operating on an extensible kernel variant of that structure, similar in concept to struct kernel_hwtstamp_config vs struct hwtstamp_config. Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here we introduce the kernel-only structure in include/linux/ethtool.h. The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO. Acked-by: Shannon Nelson <shannon.nelson@amd.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: use the indir table from ethtool contextJakub Kicinski
Instead of allocating a separate indir table in the vnic use the one already present in the RSS context allocated by the core. This saves some LoC and also we won't have to worry about syncing the local version back to the core, once core learns how to dump contexts. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-12-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: bump the entry size in indir tables to u32Jakub Kicinski
Ethtool core stores indirection table with u32 entries, "just to be safe". Switch the type in the driver, so that it's easier to swap local tables for the core ones. Memory allocations already use sizeof(*entry), switch the memset()s as well. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-11-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: pad out the correct indirection tableJakub Kicinski
bnxt allocates tables of max size, and changes the used size based on number of active rings. The unused entries get padded out with zeros. bnxt_modify_rss() seems to always pad out the table of the main / default RSS context, instead of the table of the modified context. I haven't observed any behavior change due to this patch, so I don't think it's a fix. Not entirely sure what role the padding plays, 0 is a valid queue ID. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-10-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: use the RSS context XArray instead of the local listJakub Kicinski
Core already maintains all RSS contexts in an XArray, no need to keep a second list in the driver. Remove bnxt_get_max_rss_ctx_ring() completely since core performs the same check already. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: use context priv for struct bnxt_rss_ctxJakub Kicinski
Core can allocate space for per-context driver-private data, use it for struct bnxt_rss_ctx. Inline bnxt_alloc_rss_ctx() at this point, most of the init (as in the actions bnxt_del_one_rss_ctx() will undo) is open coded in bnxt_create_rxfh_context(), anyway. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: depend on core cleaning up RSS contextsJakub Kicinski
New RSS context API removes old contexts on netdev unregister. No need to wipe them manually. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: remove rss_ctx_bmapJakub Kicinski
Core will allocate IDs for the driver, from the range [1, BNXT_MAX_ETH_RSS_CTX], no need to track the allocations. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: move from .set_rxfh to .create_rxfh_context and friendsJakub Kicinski
Use the new ethtool ops for RSS context management. The conversion is pretty straightforward cut / paste of the right chunks of the combined handler. Main change is that we let the core pick the IDs (bitmap will be removed separately for ease of review), so we need to tell the core when we lose a context. Since the new API passes rxfh as const, change bnxt_modify_rss() to also take const. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12eth: bnxt: allow deleting RSS contexts when the device is downJakub Kicinski
Contexts get deleted from FW when the device is down, but they are kept in SW and re-added back on open. bnxt_set_rxfh_context() apparently does not want to deal with complexity of dealing with both the device down and device up cases. This is perhaps acceptable for creating new contexts, but not being able to delete contexts makes core-driven cleanups messy. Specifically with the new RSS API core will try to delete contexts automatically after bringing the device down. Support the delete-while-down case. Skip the FW logic and delete just the driver state. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: net/sched/act_ct.c 26488172b029 ("net/sched: Fix UAF when resolving a clash") 3abbd7ed8b76 ("act_ct: prepare for stolen verdict coming from conntrack and nat engine") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09bnxt: fix crashes when reducing ring count with active RSS contextsJakub Kicinski
bnxt doesn't check if a ring is used by RSS contexts when reducing ring count. Core performs a similar check for the drivers for the main context, but core doesn't know about additional contexts, so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring id to index bp->rx_ring[], which without the check may end up being out of bounds. BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 Read of size 2 at addr ffff8881c5809618 by task ethtool/31525 Call Trace: __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460 __bnxt_setup_vnic_p5+0x12e/0x270 __bnxt_open_nic+0x2262/0x2f30 bnxt_open_nic+0x5d/0xf0 ethnl_set_channels+0x5d4/0xb30 ethnl_default_set_doit+0x2f1/0x620 Core does track the additional contexts in net-next, so we can move this validation out of the driver as a follow up there. Fixes: b3d0083caf9a ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240705020005.681746-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-05net: bcmasp: Fix error code in probe()Dan Carpenter
Return an error code if bcmasp_interface_create() fails. Don't return success. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Justin Chen <justin.chen@broadcom.com> Link: https://patch.msgid.link/ZoWKBkHH9D1fqV4r@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/phy/aquantia/aquantia.h 219343755eae ("net: phy: aquantia: add missing include guards") 61578f679378 ("net: phy: aquantia: add support for PHY LEDs") drivers/net/ethernet/wangxun/libwx/wx_hw.c bd07a9817846 ("net: txgbe: remove separate irq request for MSI and INTx") b501d261a5b3 ("net: txgbe: add FDIR ATR support") https://lore.kernel.org/all/20240703112936.483c1975@canb.auug.org.au/ include/linux/mlx5/mlx5_ifc.h 048a403648fc ("net/mlx5: IFC updates for changing max EQs") 99be56171fa9 ("net/mlx5e: SHAMPO, Re-enable HW-GRO") https://lore.kernel.org/all/20240701133951.6926b2e3@canb.auug.org.au/ Adjacent changes: drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c 4130c67cd123 ("wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference") 3f3126515fbe ("wifi: iwlwifi: mvm: add mvm-specific guard") include/net/mac80211.h 816c6bec09ed ("wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP") 5a009b42e041 ("wifi: mac80211: track changes in AP's TPE") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04bnxt_en: Fix the resource check condition for RSS contextsPavan Chebbi
While creating a new RSS context, bnxt_rfs_capable() currently makes a strict check to see if the required VNICs are already available. If the current VNICs are not what is required, either too many or not enough, it will call the firmware to reserve the exact number required. There is a bug in the firmware when the driver tries to relinquish some reserved VNICs and RSS contexts. It will cause the default VNIC to lose its RSS configuration and cause receive packets to be placed incorrectly. Workaround this problem by skipping the resource reduction. The driver will not reduce the VNIC and RSS context reservations when a context is deleted. The resources will be available for use when new contexts are created later. Potentially, this workaround can cause us to run out of VNIC and RSS contexts if there are a lot of VF functions creating and deleting RSS contexts. In the future, we will conditionally disable this workaround when the firmware fix is available. Fixes: 438ba39b25fe ("bnxt_en: Improve RSS context reservation infrastructure") Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20240625010210.2002310-1-kuba@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240703180112.78590-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-02bnxt_en: unlink page pool when stopping Rx queueDavid Wei
Have bnxt call page_pool_disable_direct_recycling() to unlink the old page pool when resetting a queue prior to destroying it, instead of touching a netdev core struct directly. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-01bnxt_en: Remove atomic operations on ptp->tx_availPavan Chebbi
Now that we require the spinlock to protect ptp->txts_prod, change ptp->tx_avail to non-atomic and protect it under the same spinlock. Add a new helper function bnxt_ptp_get_txts_prod() to decrement ptp->tx_avail under spinlock and return the producer. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-01bnxt_en: Increase the max total outstanding PTP TX packets to 4Pavan Chebbi
Start accepting up to 4 TX TS requests on BCM5750X (P5) chips. These PTP TX packets will be queued in the ptp->txts_req[] array waiting for the TX timestamp to complete. The entries in the array will be managed by a producer and consumer index. The producer index is updated under spinlock since multiple TX rings can try to send PTP packets at the same time. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-01bnxt_en: Let bnxt_stamp_tx_skb() return error codePavan Chebbi
Change the function bnxt_stamp_tx_skb() to return 0 for suceess or -EAGAIN if the timestamp is still pending in firmware. The calling PTP aux worker will reschedule based on the return code. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-01bnxt_en: Remove an impossible condition check for PTP TX pending SKBPavan Chebbi
In the current 5750X PTP code paths, there is always at most one TX SKB requested for timestamp and we won't accept another one until we have retrieved the timestamp or it has timed out. Remove the unnecessary check in bnxt_get_tx_ts_p5() for a pending SKB and change the function to void. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-01bnxt_en: Refactor all PTP TX timestamp fields into a structPavan Chebbi
On the older 5750X (P5) chips, we currently support only 1 TX PTP packet in-flight waiting for the timestamp. Refactor the datastructures to prepare to support up to 4 TX PTP packets. Combine all fields required for PTP TX timestamp query into one structure. An array of this structure will be added in follow-on patches to support multiple outstanding TX timestamps. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-01bnxt_en: Add BCM5760X specific PHC registers mappingPavan Chebbi
BCM5760X firmware will advertise direct 64-bit PHC registers access for the driver from BAR0. Make the necessary changes in handling HWRM_PORT_MAC_PTP_QCFG's response and PHC register mapping for 5760X chips. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-01bnxt_en: Add TX timestamp completion logicMichael Chan
The new BCM5760X chips will return the timestamp of TX packets in a new completion. Add logic in __bnxt_poll_work() to handle this completion type to retrieve the timestamp. This feature eliminates the limit on the number of in-flight PTP TX packets. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>