Age | Commit message (Collapse) | Author |
|
All SCTP patches are picked up by netdev maintainers. Two headers were
missing to be listed there.
Reported-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/b3c2dc3a102eb89bd155abca2503ebd015f50ee0.1739193671.git.marcelo.leitner@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-02-11 (idpf, ixgbe, igc)
For idpf:
Sridhar fixes a couple issues in handling of RSC packets.
Josh adds a call to set_real_num_queues() to keep queue count in sync.
For ixgbe:
Piotr removes missed IS_ERR() removal when ERR_PTR usage was removed.
For igc:
Zdenek Bouska fixes reporting of Rx timestamp with AF_XDP.
Siang sets buffer type on empty frame to ensure proper handling.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igc: Set buffer type for empty frames in igc_init_empty_frame
igc: Fix HW RX timestamp when passed by ZC XDP
ixgbe: Fix possible skb NULL pointer dereference
idpf: call set_real_num_queues in idpf_open
idpf: record rx queue in skb for RSC packets
idpf: fix handling rsc packet with a single segment
====================
Link: https://patch.msgid.link/20250211214343.4092496-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This adds support to dump the connection tracking table
("conntrack -L") and the conntrack statistics, ("conntrack -S").
Example conntrack dump:
tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/conntrack.yaml --dump get
[{'id': 59489769,
'mark': 0,
'nfgen-family': 2,
'protoinfo': {'protoinfo-tcp': {'tcp-flags-original': {'flags': {'maxack',
'sack-perm',
'window-scale'},
'mask': set()},
'tcp-flags-reply': {'flags': {'maxack',
'sack-perm',
'window-scale'},
'mask': set()},
'tcp-state': 'established',
'tcp-wscale-original': 7,
'tcp-wscale-reply': 8}},
'res-id': 0,
'secctx': {'secctx-name': 'system_u:object_r:unlabeled_t:s0'},
'status': {'assured',
'confirmed',
'dst-nat-done',
'seen-reply',
'src-nat-done'},
'timeout': 431949,
'tuple-orig': {'tuple-ip': {'ip-v4-dst': '34.107.243.93',
'ip-v4-src': '192.168.0.114'},
'tuple-proto': {'proto-dst-port': 443,
'proto-num': 6,
'proto-src-port': 37104}},
'tuple-reply': {'tuple-ip': {'ip-v4-dst': '192.168.0.114',
'ip-v4-src': '34.107.243.93'},
'tuple-proto': {'proto-dst-port': 37104,
'proto-num': 6,
'proto-src-port': 443}},
'use': 1,
'version': 0},
{'id': 3402229480,
Example stats dump:
tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/conntrack.yaml --dump get-stats
[{'chain-toolong': 0,
'clash-resolve': 3,
'drop': 0,
....
Changes since last iteration:
- Address comments from Donald Hunter, in particular, fixup "get" and
"get-stats" descriptions, the former operation supports both dump
and normal request (returns a single entry, if found), the latter
only supports dumps.
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250210152159.41077-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 5d4cc87414c5 ("net: reorganize "struct sock" fields"),
the sk_tsflags field shares the same cacheline with sk_forward_alloc.
The UDP protocol does not acquire the sock lock in the RX path;
forward allocations are protected via the receive queue spinlock;
additionally udp_recvmsg() calls sock_recv_cmsgs() unconditionally
touching sk_tsflags on each packet reception.
Due to the above, under high packet rate traffic, when the BH and the
user-space process run on different CPUs, UDP packet reception
experiences a cache miss while accessing sk_tsflags.
The receive path doesn't strictly need to access the problematic field;
change sock_set_timestamping() to maintain the relevant information
in a newly allocated sk_flags bit, so that sock_recv_cmsgs() can
take decisions accessing the latter field only.
With this patch applied, on an AMD epic server with i40e NICs, I
measured a 10% performance improvement for small packets UDP flood
performance tests - possibly a larger delta could be observed with more
recent H/W.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/dbd18c8a1171549f8249ac5a8b30b1b5ec88a425.1739294057.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Donald Hunter says:
====================
netlink: specs: add a spec for nl80211 wiphy
Add a rudimentary YNL spec for nl80211 that includes get-wiphy and
get-interface, along with some required enhancements to YNL and the
netlink schemas.
Patch 1 is a minor cleanup to prepare for patch 2
Patches 2-4 are new features for YNL
Patches 5-7 are updates to ynl_gen_c
Patches 8-9 are schema updates for feature parity
Patch 10 is the new nl80211 spec
====================
Link: https://patch.msgid.link/20250211120127.84858-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a rudimentary YNL spec for nl80211 that covers get-wiphy,
get-interface and get-protocol-features.
./tools/net/ynl/pyynl/cli.py --family nl80211 \
--do get-protocol-features
{'protocol-features': {'split-wiphy-dump'}}
./tools/net/ynl/pyynl/cli.py --family nl80211 \
--dump get-wiphy --json '{ "split-wiphy-dump": true }'
./tools/net/ynl/pyynl/cli.py --family nl80211 \
--dump get-interface
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Link: https://patch.msgid.link/20250211120127.84858-11-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add s8 and s16 types to the genetlink schemas to align scalar types
across all schemas.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-10-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Nested structs are already supported in netlink-raw. Add the same
capability to the genetlink legacy schema.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-9-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extend ynl-gen-c.py with support for indexed-array that has a scalar
sub-type.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-8-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Turn attribute names with leading digits into valid C names by
prepending an underscore, e.g. 5ghz -> _5ghz
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-7-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the missing s8 and s16 scalar types to the list of recognised
scalars in ynl-gen-c.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-6-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ynl tool uses display-hint to know when to format IP addresses in
printed output, but not to parse IP addresses from --json input. Add
support for parsing ipv4 and ipv6 strings.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-5-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The nl80211 family encodes the list of supported ciphers as a C array of
u32 values. Add support for translating arrays of scalars into strings
for enum names and display hints.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-4-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When decoding an indexed-array with a scalar subtype, it is currently
only possible to add a display-hint. Add support for decoding each value
as an enum.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
_decode_array_attr() uses variable subattrs in every branch when only
one branch decodes more than a single attribute.
Change the variable name to subattr in the branches that only decode a
single attribute so that the intent is more obvious.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Russell King says:
====================
net: dsa: add support for phylink managed EEE
This series adds support for phylink managed EEE to DSA, and converts
mt753x to make use of this feature.
Patch 1 implements a helper to indicate whether the MAC LPI operations
are populated (suggested by Vladimir)
Patch 2 makes the necessary changes to the core code - we retain calling
set_mac_eee(), but this method now becomes a way to merely validate the
arguments when using phylink managed EEE rather than performing any
configuration.
Patch 3 converts the mt7530 driver to use phylink managed EEE.
====================
Link: https://patch.msgid.link/Z6nWujbjxlkzK_3P@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert mt7530 to use phylink managed EEE. When enabling EEE, we set
both PMCR_FORCE_EEE1G and PMCR_FORCE_EEE100 irrespective of the speed,
and clear them both when disabling.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thR9q-003vXI-Cp@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to allow DSA drivers to use phylink managed EEE, we need to
change the behaviour of the DSA's .set_eee() ethtool method.
Implementation of the DSA .set_mac_eee() method becomes optional with
phylink managed EEE as it is only used to validate the EEE parameters
supplied from userspace. The rest of the EEE state management should
be left to phylink.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thR9l-003vXC-9F@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Provide a helper to determine whether the MAC operations structure
implements the LPI operations, which will be used by both phylink and
DSA.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thR9g-003vX6-4s@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
eth: fbnic: report software queue stats
Fill in typical software queue stats.
# ./pyynl/cli.py --spec netlink/specs/netdev.yaml --dump qstats-get
[{'ifindex': 2,
'rx-alloc-fail': 0,
'rx-bytes': 398064076,
'rx-csum-complete': 271,
'rx-csum-none': 0,
'rx-packets': 276044,
'tx-bytes': 7223770,
'tx-needs-csum': 28148,
'tx-packets': 28449,
'tx-stop': 0,
'tx-wake': 0}]
Note that we don't collect csum-unnecessary, just the uncommon
cases (and unnecessary is all the rest of the packets). There
is no programatic use for these stats AFAIK, just manual debug.
====================
Link: https://patch.msgid.link/20250211181356.580800-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Looks like recent commit broke the sort order, fix it.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Gather and report software Tx queue stats - checksum stats
and queue stop / start.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Gather and report software Rx queue stats - checksum stats
and allocation failures.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The queue stats struct is used for Rx and Tx queues. Wrap
the Tx stats in a struct and a union, so that we can reuse
the same space for Rx stats on Rx queues.
This also makes it easy to add an assert to the stat handling
code to catch new stats not being aggregated on shutdown.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 13c7c941e729 ("netdev: add qstat for csum complete") reserved
the entry for csum complete in the qstats uAPI. Start reporting this
value now that we have a driver which needs it.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bch2_nocow_write_convert_unwritten is already in transaction context:
00191 ========= TEST generic/648
00242 kernel BUG at fs/bcachefs/btree_iter.c:3332!
00242 Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
00242 Modules linked in:
00242 CPU: 4 UID: 0 PID: 2593 Comm: fsstress Not tainted 6.13.0-rc3-ktest-g345af8f855b7 #14403
00242 Hardware name: linux,dummy-virt (DT)
00242 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00242 pc : __bch2_trans_get+0x120/0x410
00242 lr : __bch2_trans_get+0xcc/0x410
00242 sp : ffffff80d89af600
00242 x29: ffffff80d89af600 x28: ffffff80ddb23000 x27: 00000000fffff705
00242 x26: ffffff80ddb23028 x25: ffffff80d8903fe0 x24: ffffff80ebb30168
00242 x23: ffffff80c8aeb500 x22: 000000000000005d x21: ffffff80d8904078
00242 x20: ffffff80d8900000 x19: ffffff80da9e8000 x18: 0000000000000000
00242 x17: 64747568735f6c61 x16: 6e72756f6a20726f x15: 0000000000000028
00242 x14: 0000000000000004 x13: 000000000000f787 x12: ffffffc081bbcdc8
00242 x11: 0000000000000000 x10: 0000000000000003 x9 : ffffffc08094efbc
00242 x8 : 000000001092c111 x7 : 000000000000000c x6 : ffffffc083c31fc4
00242 x5 : ffffffc083c31f28 x4 : ffffff80c8aeb500 x3 : ffffff80ebb30000
00242 x2 : 0000000000000001 x1 : 0000000000000a21 x0 : 000000000000028e
00242 Call trace:
00242 __bch2_trans_get+0x120/0x410 (P)
00242 bch2_inum_offset_err_msg+0x48/0xb0
00242 bch2_nocow_write_convert_unwritten+0x3d0/0x530
00242 bch2_nocow_write+0xeb0/0x1000
00242 __bch2_write+0x330/0x4e8
00242 bch2_write+0x1f0/0x530
00242 bch2_direct_write+0x530/0xc00
00242 bch2_write_iter+0x160/0xbe0
00242 vfs_write+0x1cc/0x360
00242 ksys_write+0x5c/0xf0
00242 __arm64_sys_write+0x20/0x30
00242 invoke_syscall.constprop.0+0x54/0xe8
00242 do_el0_svc+0x44/0xc0
00242 el0_svc+0x34/0xa0
00242 el0t_64_sync_handler+0x104/0x130
00242 el0t_64_sync+0x154/0x158
00242 Code: 6b01001f 54ffff01 79408460 3617fec0 (d4210000)
00242 ---[ end trace 0000000000000000 ]---
00242 Kernel panic - not syncing: Oops - BUG: Fatal exception
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
_orig_restart_count is unused now, according to the logic, trans_was_restarted
should be using _orig_restart_count.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Incorrectly handled transaction restarts can be a source of heisenbugs;
add a mode where we randomly inject them to shake them out.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fix from Lee Jones:
- Fix syscon users not specifying the "syscon" compatible
* tag 'mfd-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: syscon: Restore device_node_to_regmap() for non-syscon nodes
|
|
Oleksij Rempel says:
====================
Use PHYlib for reset randomization and adjustable polling
This patch set tackles a DP83TG720 reset lock issue and improves PHY
polling. Rather than adding a separate polling worker to randomize PHY
resets, I chose to extend the PHYlib framework - which already handles
most of the needed functionality - with adjustable polling. This
approach not only addresses the DP83TG720-specific problem (where
synchronized resets can lock the link) but also lays the groundwork for
optimizing PHY stats polling across all PHY drivers. With generic PHY
stats coming in, we can adjust the polling interval based on hardware
characteristics, such as using longer intervals for PHYs with stable HW
counters or shorter ones for high-speed links prone to counter
overflows.
Patch version changes are tracked in separate patches.
====================
Link: https://patch.msgid.link/20250210082358.200751-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Address the limitations of the DP83TG720 PHY, which cannot reliably
detect or report a stable link state. To handle this, the PHY must be
periodically reset when the link is down. However, synchronized reset
intervals between the PHY and its link partner can result in a deadlock,
preventing the link from re-establishing.
This change introduces a randomized polling interval when the link is
down to desynchronize resets between link partners.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250210082358.200751-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce the `phy_get_next_update_time` function to allow PHY drivers
to dynamically determine the time (in jiffies) until the next state
update event. This enables more flexible and adaptive polling intervals
based on the link state or other conditions.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250210082358.200751-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tariq Toukan says:
====================
mlx5 misc
Patches 1-3 by William reduce the memory consumption for representors to
achieve better scalability.
Patches 4-5 by Akiva expose ICM memory consumption per function.
Patches 6-8 expose helpful information on RSS resources in devlink RX
reporter diagnose.
Patches 9-10 are simple enhancements by Alex Lazar.
====================
Link: https://patch.msgid.link/20250209101716.112774-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In XDP scenarios, fragmented packets can occur if the MTU is larger
than the page size, even when the packet size fits within the linear
part.
If XDP multi-buffer support is disabled, the fragmented part won't be
handled in the TX flow, leading to packet drops.
Since XDP multi-buffer support is always available, this commit removes
the conditional check for enabling it.
This ensures that XDP multi-buffer support is always enabled,
regardless of the `is_xdp_mb` parameter, and guarantees the handling of
fragmented packets in such scenarios.
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-16-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Current loopback test validation ignores non-linear SKB case in
the SKB access, which can lead to failures in scenarios such as
when HW GRO is enabled.
Linearize the SKB so both cases will be handled.
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Underneath "rx resources" tag expose RSS diagnostic information. For
each RSS expose its rqtn, TIRs and inner TIRs.
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
.......
RSS:
Index: 0 rqtn: 0
TIRs Numbers:
tt: TT_IPV4_TCP tirn: 0
tt: TT_IPV6_TCP tirn: 1
tt: TT_IPV4_UDP tirn: 2
tt: TT_IPV6_UDP tirn: 3
tt: TT_IPV4_IPSEC_AH tirn: 4
tt: TT_IPV6_IPSEC_AH tirn: 5
tt: TT_IPV4_IPSEC_ESP tirn: 6
tt: TT_IPV6_IPSEC_ESP tirn: 7
tt: TT_IPV4 tirn: 8
tt: TT_IPV6 tirn: 9
Inner TIRs Numbers:
tt: TT_IPV4_TCP tirn: 10
tt: TT_IPV6_TCP tirn: 11
tt: TT_IPV4_UDP tirn: 12
tt: TT_IPV6_UDP tirn: 13
tt: TT_IPV4_IPSEC_AH tirn: 14
tt: TT_IPV6_IPSEC_AH tirn: 15
tt: TT_IPV4_IPSEC_ESP tirn: 16
tt: TT_IPV6_IPSEC_ESP tirn: 17
tt: TT_IPV4 tirn: 18
tt: TT_IPV6 tirn: 19
Index: 2 rqtn: 27
TIRs Numbers:
tt: TT_IPV6_TCP tirn: 46
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add "RX resources" tag to the output of rx reporter diagnose callback.
Underneath add tag for direct TIRs, for each TIR expose its tirn and
the corresponding rqtn.
$ sudo devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
....
rx resources:
Direct TIRs:
ix: 0 tirn: 20 rqtn: 1
ix: 1 tirn: 21 rqtn: 2
ix: 2 tirn: 22 rqtn: 3
ix: 3 tirn: 23 rqtn: 4
ix: 4 tirn: 24 rqtn: 5
ix: 5 tirn: 25 rqtn: 6
ix: 6 tirn: 26 rqtn: 7
ix: 7 tirn: 27 rqtn: 8
ix: 8 tirn: 28 rqtn: 9
ix: 9 tirn: 29 rqtn: 10
ix: 10 tirn: 30 rqtn: 11
ix: 11 tirn: 31 rqtn: 12
ix: 12 tirn: 32 rqtn: 13
ix: 13 tirn: 33 rqtn: 14
ix: 14 tirn: 34 rqtn: 15
ix: 15 tirn: 35 rqtn: 16
ix: 16 tirn: 36 rqtn: 17
ix: 17 tirn: 37 rqtn: 18
ix: 18 tirn: 38 rqtn: 19
ix: 19 tirn: 39 rqtn: 20
ix: 20 tirn: 40 rqtn: 21
ix: 21 tirn: 41 rqtn: 22
ix: 22 tirn: 42 rqtn: 23
ix: 23 tirn: 43 rqtn: 24
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move rx reporter RQs diagnose from mlx5e_rx_reporter_diagnose() to a
dedicated function. This change is a preparation for the following
series which extends diagnose output for the rx reporter. While at it,
also pass a mlx5e_priv pointer to
mlx5e_rx_reporter_diagnose_common_config() as this is the argument the
latter actually needs.
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ICM is a portion of the host's memory assigned to a function by the OS
through requests made by the NIC's firmware.
PF ICM consumption can be accessed directly, while VF/SF ICM consumption
can be accessed through their representors in switchdev mode.
The value is exposed to the user in granularity of 4KB through the vnic
health reporter as follows:
$ devlink health diagnose pci/0000:08:00.0 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0 icm_consumption: 1032
Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename mlx5_esw_query_vport_vhca_id to mlx5_vport_get_vhca_id and move
it to vport file. Also, add function declaration to mlx5_core header
file. This better represents the function's usage and allows for it to
be called from other parts of the mlx5_core driver.
Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
By default, the mq netdev creates a pfifo_fast qdisc. On a
system with 16 core, the pfifo_fast with 3 bands consumes
16 * 3 * 8 (size of pointer) * 1024 (default tx queue len)
= 393KB. The patch sets the tx qlen to representor default
value, 128 (1<<MLX5E_REP_PARAMS_DEF_LOG_SQ_SIZE), which
consumes 16 * 3 * 8 * 128 = 49KB, saving 344KB for each
representor at ECPF.
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
By experiments, a single queue representor netdev consumes kernel
memory around 2.8MB, and 1.8MB out of the 2.8MB is due to page
pool for the RXQ. Scaling to a thousand representors consumes 2.8GB,
which becomes a memory pressure issue for embedded devices such as
BlueField-2 16GB / BlueField-3 32GB memory.
Since representor netdevs mostly handles miss traffic, and ideally,
most of the traffic will be offloaded, reduce the default non-uplink
rep netdev's RXQ default depth from 1024 to 256 if mdev is ecpf eswitch
manager. This saves around 1MB of memory per regular RQ,
(1024 - 256) * 2KB, allocated from page pool.
With rxq depth of 256, the netlink page pool tool reports
$./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump page-pool-get
{'id': 277,
'ifindex': 9,
'inflight': 128,
'inflight-mem': 786432,
'napi-id': 775}]
This is due to mtu 1500 + headroom consumes half pages, so 256 rxq
entries consumes around 128 pages (thus create a page pool with
size 128), shown above at inflight.
Note that each netdev has multiple types of RQs, including
Regular RQ, XSK, PTP, Drop, Trap RQ. Since non-uplink representor
only supports regular rq, this patch only changes the regular RQ's
default depth.
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For the ECPF and representors, reduce the max MPWRQ size from 256KB (18)
to 128KB (17). This prepares the later patch for saving representor
memory.
With Striding RQ, there is a minimum of 4 MPWQEs. So with 128KB of max
MPWRQ size, the minimal memory is 4 * 128KB = 512KB. When creating page
pool, consider 1500 mtu, the minimal page pool size will be 512KB/4KB =
128 pages = 256 rx ring entries (2 entries per page).
Before this patch, setting RX ringsize (ethtool -G rx) to 256 causes
driver to allocate page pool size more than it needs due to max MPWRQ
is 256KB (18). Ex: 4 * 256KB = 1MB, 1MB/4KB = 256 pages, but actually
128 pages is good enough. Reducing the max MPWRQ to 128KB fixes the
limitation.
Signed-off-by: William Tu <witu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The recent platform profile changes prevent the tpacpi platform driver
from registering. This error is seen in the kernel logs, and the
various tpacpi entries are not created:
[ 7550.642171] platform thinkpad_acpi: Resources present before probing
This happens because devm_platform_profile_register() is called before
tpacpi_pdev probes (thanks to Kurt Borja for identifying the root
cause).
For now revert back to the old platform_profile_register to fix the
issue. This is quick fix and will be re-implemented later as more
testing is needed for full solution.
Tested on X1 Carbon G12.
Fixes: 31658c916fa6 ("platform/x86: thinkpad_acpi: Use devm_platform_profile_register()")
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Kurt Borja <kuurtb@gmail.com>
Link: https://lore.kernel.org/r/20250211173620.16522-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
This reverts commit b8baac3b9c5cc4b261454ff87d75ae8306016ffd.
IPv4 packets with no DF flag set on result in frequent flow entry
teardown cycles, this is visible in the network topology that is used in
the nft_flowtable.sh test.
nft_flowtable.sh test ocassionally fails reporting that the dscp_fwd
test sees no packets going through the flowtable path.
Fixes: b8baac3b9c5c ("netfilter: flowtable: teardown flow if cached mtu is stale")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-02-10 (ice, igc, e1000e)
For ice:
Karol, Jake, and Michal add PTP support for E830 devices. Karol
refactors and cleans up PTP code. Jake allows for a common
cross-timestamp implementation to be shared for all devices and
Michal adds E830 support.
Mateusz cleans up initial Flow Director rule creation to loop rather
than duplicate repeated similar calls.
For igc:
Siang adjust calls to remove need for close and open calls on loading
XDP program.
For e1000e:
Gerhard Engleder batches register writes for writing multicast table
on real-time kernels.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
e1000e: Fix real-time violations on link up
igc: Avoid unnecessary link down event in XDP_SETUP_PROG process
ice: refactor ice_fdir_create_dflt_rules() function
ice: Implement PTP support for E830 devices
ice: Refactor ice_ptp_init_tx_*
ice: Add unified ice_capture_crosststamp
ice: Process TSYN IRQ in a separate function
ice: Use FIELD_PREP for timestamp values
ice: Remove unnecessary ice_is_e8xx() functions
ice: Don't check device type when checking GNSS presence
====================
Link: https://patch.msgid.link/20250210192352.3799673-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the netdev lock has been obtained, unlock it before returning.
This bug has been detected by the Clang thread-safety analyzer.
Fixes: afc664987ab3 ("eth: iavf: extend the netdev_lock usage")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20250206175114.1974171-28-bvanassche@acm.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Edward Cree says:
====================
sfc: support devlink flash
Allow upgrading device firmware on Solarflare NICs through standard tools.
====================
Link: https://patch.msgid.link/cover.1739186252.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update the information in sfc's devlink documentation including
support for firmware update with devlink flash.
Also update the help text for CONFIG_SFC_MTD, as it is no longer
strictly required for firmware updates.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/3476b0ef04a0944f03e0b771ec8ed1a9c70db4dc.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use MC_CMD_NVRAM_* wrappers to write the firmware to the partition
identified from the image header.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/a746335876b621b3e54cf4e49948148e349a1745.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|