summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-11-29Merge branch 'create-a-binding-for-the-marvell-mv88e6xxx-dsa-switches'Jakub Kicinski
Linus Walleij says: ==================== Create a binding for the Marvell MV88E6xxx DSA switches The Marvell switches are lacking DT bindings. I need proper schema checking to add LED support to the Marvell switch. Just how it is, it can't go on like this. Some Device Tree fixes are included in the series, these remove the major and most annoying warnings fallout noise: some warnings remain, and these are of more serious nature, such as missing phy-mode. They can be applied individually, or to the networking tree with the rest of the patches. Thanks to Andrew Lunn, Vladimir Oltean and Russell King for excellent review and feedback! This latest version employs special compatibles in the odd ABI device trees. v8: https://lore.kernel.org/r/20231114-marvell-88e6152-wan-led-v8-0-50688741691b@linaro.org v7: https://lore.kernel.org/r/20231024-marvell-88e6152-wan-led-v7-0-2869347697d1@linaro.org v6: https://lore.kernel.org/r/20231024-marvell-88e6152-wan-led-v6-0-993ab0949344@linaro.org v5: https://lore.kernel.org/r/20231023-marvell-88e6152-wan-led-v5-0-0e82952015a7@linaro.org v4: https://lore.kernel.org/r/20231018-marvell-88e6152-wan-led-v4-0-3ee0c67383be@linaro.org v3: https://lore.kernel.org/r/20231016-marvell-88e6152-wan-led-v3-0-38cd449dfb15@linaro.org v2: https://lore.kernel.org/r/20231014-marvell-88e6152-wan-led-v2-0-7fca08b68849@linaro.org v1: https://lore.kernel.org/r/20231013-marvell-88e6152-wan-led-v1-0-0712ba99857c@linaro.org ==================== Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-0-272934e04681@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29dt-bindings: marvell: Add Marvell MV88E6060 DSA schemaLinus Walleij
The Marvell MV88E6060 is one of the oldest DSA switches from Marvell, and it has DT bindings used in the wild. Let's define them properly. It is different enough from the rest of the MV88E6xxx switches that it deserves its own binding. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-5-272934e04681@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29dt-bindings: marvell: Rewrite MV88E6xxx in schemaLinus Walleij
This is an attempt to rewrite the Marvell MV88E6xxx switch bindings in YAML schema. The current text binding says: WARNING: This binding is currently unstable. Do not program it into a FLASH never to be changed again. Once this binding is stable, this warning will be removed. Well that never happened before we switched to YAML markup, we can't have it like this, what about fixing the mess? Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-4-272934e04681@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29dt-bindings: net: ethernet-switch: Accept special variantsLinus Walleij
Accept special node naming variants for Marvell switches with special node names as ABI. This is maybe not the prettiest but it avoids special-casing the Marvell MV88E6xxx bindings by copying a lot of generic binding code down into that one binding just to special-case these unfixable nodes. Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-3-272934e04681@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29dt-bindings: net: mvusb: Fix up DSA exampleLinus Walleij
When adding a proper schema for the Marvell mx88e6xxx switch, the scripts start complaining about this embedded example: dtschema/dtc warnings/errors: net/marvell,mvusb.example.dtb: switch@0: ports: '#address-cells' is a required property from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml# net/marvell,mvusb.example.dtb: switch@0: ports: '#size-cells' is a required property from schema $id: http://devicetree.org/schemas/net/dsa/marvell,mv88e6xxx.yaml# Fix this up by extending the example with those properties in the ports node. While we are at it, rename "ports" to "ethernet-ports" and rename "switch" to "ethernet-switch" as this is recommended practice. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-2-272934e04681@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29dt-bindings: net: dsa: Require ports or ethernet-portsLinus Walleij
Bindings using dsa.yaml#/$defs/ethernet-ports specify that a DSA switch node need to have a ports or ethernet-ports subnode, and that is actually required, so add requirements using oneOf. Suggested-by: Rob Herring <robh@kernel.org> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20231127-marvell-88e6152-wan-led-v9-1-272934e04681@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29Merge branch ↵Jakub Kicinski
'tools-ynl-fixes-for-the-page-pool-sample-and-the-generation-process' Jakub Kicinski says: ==================== tools: ynl: fixes for the page-pool sample and the generation process Minor fixes to the new sample and the Makefiles. ==================== Link: https://lore.kernel.org/r/20231129193622.2912353-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: don't skip regeneration from make targetsJakub Kicinski
Commit 2b7ac0c87d98 ("tools: ynl-gen: don't touch the output file if content is the same") is working too well. It was added so that ynl-regen -f doesn't make us rebuild half of the kernel, if there are no actual changes in any generated code. When ynl-gen-c is called by make, however, we're better off trusting make's tracking and overwrite the file. Otherwise if output is identical we won't update file timestamps and make will retry code gen on every invocation. Link: https://lore.kernel.org/r/20231129193622.2912353-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: order building samples after generated codeJakub Kicinski
Parallel builds of ynl: make -C tools/net/ynl/ -j 4 don't work correctly right now. samples get handled before generated, so build of samples does not notice that protos.a has changed. Order samples to be last. Link: https://lore.kernel.org/r/20231129193622.2912353-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: make sure we use local headers for page-poolJakub Kicinski
Building samples generates the following warning: In file included from page-pool.c:11: generated/netdev-user.h:21:45: warning: ‘enum netdev_xdp_rx_metadata’ declared inside parameter list will not be visible outside of this definition or declaration 21 | const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value); | ^~~~~~~~~~~~~~~~~~~~~~ Our magic way of including uAPI headers assumes the sample name matches the family name. We need to copy the flags over. Fixes: 637567e4a3ef ("tools: ynl: add sample for getting page-pool information") Link: https://lore.kernel.org/r/20231129193622.2912353-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29tools: ynl: fix build of the page-pool sampleJakub Kicinski
The name of the "destroyed" field in the reply was not changed in the sample after we started calling it "detach_time". page-pool.c: In function ‘main’: page-pool.c:84:33: error: ‘struct <anonymous>’ has no member named ‘destroyed’ 84 | if (pp->_present.destroyed) | ^ Fixes: 637567e4a3ef ("tools: ynl: add sample for getting page-pool information") Link: https://lore.kernel.org/r/20231129193622.2912353-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29Merge branch ↵Jakub Kicinski
'fine-tune-flow-control-and-speed-configurations-in-microchip-ksz8xxx-dsa-driver' Oleksij Rempel says: ==================== Fine-Tune Flow Control and Speed Configurations in Microchip KSZ8xxx DSA Driver This patch set focuses on enhancing the configurability of flow control, speed, and duplex settings in the Microchip KSZ8xxx DSA driver. The first patch allows more granular control over the CPU port's flow control, speed, and duplex settings. The second patch introduces a method for port configurations for port with integrated PHYs, primarily concerning flow control based on duplex mode. ==================== Link: https://lore.kernel.org/r/20231127145101.3039399-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29net: dsa: microchip: make phylink_mac_link_up() not optionalOleksij Rempel
Last part of the driver do now support phylink_mac_link_up(). So, make it not optional. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20231127145101.3039399-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29net: dsa: microchip: ksz8: Add function to configure ports with integrated PHYsOleksij Rempel
This patch introduces the function 'ksz8_phy_port_link_up' to the Microchip KSZ8xxx driver. This function is responsible for setting up flow control and duplex settings for the ports that are integrated with PHYs. The KSZ8795 switch supports asymmetric pause control, which can't be fully utilized since a single bit controls both RX and TX pause. Despite this, the flow control can be adjusted based on the auto-negotiation process, taking into account the capabilities of both link partners. On the other hand, the KSZ8873's PORT_FORCE_FLOW_CTRL bit can be set by the hardware bootstrap, which ignores the auto-negotiation result. Therefore, even in auto-negotiation mode, we need to ensure that this bit is correctly set. When auto-negotiation isn't in use, we enforce symmetric pause control for the KSZ8795 switch. Please note, forcing flow control disable on a port while still advertising pause support isn't possible. While this scenario might not be practical or desired, it's important to be aware of this limitation when working with the KSZ8873 and similar devices. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20231127145101.3039399-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29net: dsa: microchip: ksz8: Make flow control, speed, and duplex on CPU port ↵Oleksij Rempel
configurable Allow flow control, speed, and duplex settings on the CPU port to be configurable. Previously, the speed and duplex relied on default switch values, which limited flexibility. Additionally, flow control was hardcoded and only functional in duplex mode. This update enhances the configurability of these parameters. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20231127145101.3039399-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29Merge branch 'gve-add-support-for-non-4k-page-sizes'Jakub Kicinski
John Fraker says: ==================== gve: Add support for non-4k page sizes. This patch series adds support for non-4k page sizes to the driver. Prior to this patch series, the driver assumes a 4k page size in many small ways, and will crash in a kernel compiled for a different page size. This changeset aims to be a minimal changeset that unblocks certain arm platforms with large page sizes. ==================== Link: https://lore.kernel.org/r/20231128002648.320892-1-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Remove dependency on 4k page size.John Fraker
Prior to this change, gve crashes when attempting to run in kernels with page sizes other than 4k. This change removes unnecessary references to PAGE_SIZE and replaces them with more meaningful constants. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-6-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Add page size register to the register_page_list command.John Fraker
This register is required on platforms with page sizes greater than 4k. This is because the tx side of the driver vmaps the entire queue page list of pages into a single flat address space, then uses the entire space. Without communicating the guest page size to the backend, the backend will only access the first 4k of each page in the queue page list. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-5-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Remove obsolete checks that rely on page size.John Fraker
These checks are safe to remove as they are no longer enforced by the backend. Retaining them would require updating them to work differently with page sizes larger than 4k. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-4-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Deprecate adminq_pfn for pci revision 0x1.John Fraker
adminq_pfn assumes a page size of 4k, causing this mechanism to break in kernels compiled with different page sizes. A new PCI device revision was needed for the device to be able to communicate with the driver how to set up the admin queue prior to having access to the admin queue. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-3-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Perform adminq allocations through a dma_pool.John Fraker
This allows the adminq to be smaller than a page, paving the way for non 4k page support. This is to support platforms where PAGE_SIZE is not 4k, such as some ARM platforms. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-2-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-11-27 (i40e, iavf) This series contains updates to i40e and iavf drivers. Ivan Vecera performs more cleanups on i40e and iavf drivers; removing unused fields, defines, and unneeded fields. Petr Oros utilizes iavf_schedule_aq_request() helper to replace open coded equivalents. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: use iavf_schedule_aq_request() helper iavf: Remove queue tracking fields from iavf_adminq_ring i40e: Remove queue tracking fields from i40e_adminq_ring i40e: Remove AQ register definitions for VF types i40e: Delete unused and useless i40e_pf fields ==================== Link: https://lore.kernel.org/r/20231127211037.1135403-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28r8169: remove multicast filter limitHeiner Kallweit
Once upon a time, when r8169 was new, the multicast filter limit code was copied from RTL8139 driver. There the filter limit is even user-configurable. The filtering is hash-based and we don't have perfect filtering. Actually the mc filtering on RTL8125 still seems to be the same as used on 8390/NE2000. So it's not clear to me which benefit it should bring when switching to all-multi mode once a certain number of filter bits is set. More the opposite: Filtering out at least some unwanted mc traffic is better than no filtering. Also the available chip documentation doesn't mention any restriction. Therefore remove the filter limit. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/57076c05-3730-40d1-ab9a-5334b263e41a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28ice: fix error code in ice_eswitch_attach()Dan Carpenter
Set the "err" variable on this error path. Fixes: fff292b47ac1 ("ice: add VF representors one by one") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/e0349ee5-76e6-4ff4-812f-4aa0d3f76ae7@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28nfp: ethtool: support TX/RX pause frame on/offYu Xiao
Add support for ethtool -A tx on/off and rx on/off. Signed-off-by: Yu Xiao <yu.xiao@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231127055116.6668-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28Merge branch 'net-page_pool-add-netlink-based-introspection'Paolo Abeni
Jakub Kicinski says: ==================== net: page_pool: add netlink-based introspection We recently started to deploy newer kernels / drivers at Meta, making significant use of page pools for the first time. We immediately run into page pool leaks both real and false positive warnings. As Eric pointed out/predicted there's no guarantee that applications will read / close their sockets so a page pool page may be stuck in a socket (but not leaked) forever. This happens a lot in our fleet. Most of these are obviously due to application bugs but we should not be printing kernel warnings due to minor application resource leaks. Conversely the page pool memory may get leaked at runtime, and we have no way to detect / track that, unless someone reconfigures the NIC and destroys the page pools which leaked the pages. The solution presented here is to expose the memory use of page pools via netlink. This allows for continuous monitoring of memory used by page pools, regardless if they were destroyed or not. Sample in patch 15 can print the memory use and recycling efficiency: $ ./page-pool eth0[2] page pools: 10 (zombies: 0) refs: 41984 bytes: 171966464 (refs: 0 bytes: 0) recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201) v4: - use dev_net(netdev)->loopback_dev - extend inflight doc v3: https://lore.kernel.org/all/20231122034420.1158898-1-kuba@kernel.org/ - ID is still here, can't decide if it matters - rename destroyed -> detach-time, good enough? - fix build for netsec v2: https://lore.kernel.org/r/20231121000048.789613-1-kuba@kernel.org - hopefully fix build with PAGE_POOL=n v1: https://lore.kernel.org/all/20231024160220.3973311-1-kuba@kernel.org/ - The main change compared to the RFC is that the API now exposes outstanding references and byte counts even for "live" page pools. The warning is no longer printed if page pool is accessible via netlink. RFC: https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/ ==================== Link: https://lore.kernel.org/r/20231126230740.2148636-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28tools: ynl: add sample for getting page-pool informationJakub Kicinski
Regenerate the tools/ code after netdev spec changes. Add sample to query page-pool info in a concise fashion: $ ./page-pool eth0[2] page pools: 10 (zombies: 0) refs: 41984 bytes: 171966464 (refs: 0 bytes: 0) recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201) Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: mute the periodic warning for visible page poolsJakub Kicinski
Mute the periodic "stalled pool shutdown" warning if the page pool is visible to user space. Rolling out a driver using page pools to just a few hundred hosts at Meta surfaces applications which fail to reap their broken sockets. Obviously it's best if the applications are fixed, but we don't generally print warnings for application resource leaks. Admins can now depend on the netlink interface for getting page pool info to detect buggy apps. While at it throw in the ID of the pool into the message, in rare cases (pools from destroyed netns) this will make finding the pool with a debugger easier. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: expose page pool stats via netlinkJakub Kicinski
Dump the stats into netlink. More clever approaches like dumping the stats per-CPU for each CPU individually to see where the packets get consumed can be implemented in the future. A trimmed example from a real (but recently booted system): $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \ --dump page-pool-stats-get [{'info': {'id': 19, 'ifindex': 2}, 'alloc-empty': 48, 'alloc-fast': 3024, 'alloc-refill': 0, 'alloc-slow': 48, 'alloc-slow-high-order': 0, 'alloc-waive': 0, 'recycle-cache-full': 0, 'recycle-cached': 0, 'recycle-released-refcnt': 0, 'recycle-ring': 0, 'recycle-ring-full': 0}, {'info': {'id': 18, 'ifindex': 2}, 'alloc-empty': 66, 'alloc-fast': 11811, 'alloc-refill': 35, 'alloc-slow': 66, 'alloc-slow-high-order': 0, 'alloc-waive': 0, 'recycle-cache-full': 1145, 'recycle-cached': 6541, 'recycle-released-refcnt': 0, 'recycle-ring': 1275, 'recycle-ring-full': 0}, {'info': {'id': 17, 'ifindex': 2}, 'alloc-empty': 73, 'alloc-fast': 62099, 'alloc-refill': 413, ... Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: report when page pool was destroyedJakub Kicinski
Report when page pool was destroyed. Together with the inflight / memory use reporting this can serve as a replacement for the warning about leaked page pools we currently print to dmesg. Example output for a fake leaked page pool using some hacks in netdevsim (one "live" pool, and one "leaked" on the same dev): $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \ --dump page-pool-get [{'id': 2, 'ifindex': 3}, {'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}] Tested-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: report amount of memory held by page poolsJakub Kicinski
Advanced deployments need the ability to check memory use of various system components. It makes it possible to make informed decisions about memory allocation and to find regressions and leaks. Report memory use of page pools. Report both number of references and bytes held. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: add netlink notifications for state changesJakub Kicinski
Generate netlink notifications about page pool state changes. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: implement GET in the netlink APIJakub Kicinski
Expose the very basic page pool information via netlink. Example using ynl-py for a system with 9 queues: $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \ --dump page-pool-get [{'id': 19, 'ifindex': 2, 'napi-id': 147}, {'id': 18, 'ifindex': 2, 'napi-id': 146}, {'id': 17, 'ifindex': 2, 'napi-id': 145}, {'id': 16, 'ifindex': 2, 'napi-id': 144}, {'id': 15, 'ifindex': 2, 'napi-id': 143}, {'id': 14, 'ifindex': 2, 'napi-id': 142}, {'id': 13, 'ifindex': 2, 'napi-id': 141}, {'id': 12, 'ifindex': 2, 'napi-id': 140}, {'id': 11, 'ifindex': 2, 'napi-id': 139}, {'id': 10, 'ifindex': 2, 'napi-id': 138}] Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: add nlspec for basic access to page poolsJakub Kicinski
Add a Netlink spec in YAML for getting very basic information about page pools. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28eth: link netdev to page_pools in driversJakub Kicinski
Link page pool instances to netdev for the drivers which already link to NAPI. Unless the driver is doing something very weird per-NAPI should imply per-netdev. Add netsec as well, Ilias indicates that it fits the mold. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: stash the NAPI ID for easier accessJakub Kicinski
To avoid any issues with race conditions on accessing napi and having to think about the lifetime of NAPI objects in netlink GET - stash the napi_id to which page pool was linked at creation time. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: record pools per netdevJakub Kicinski
Link the page pools with netdevs. This needs to be netns compatible so we have two options. Either we record the pools per netns and have to worry about moving them as the netdev gets moved. Or we record them directly on the netdev so they move with the netdev without any extra work. Implement the latter option. Since pools may outlast netdev we need a place to store orphans. In time honored tradition use loopback for this purpose. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: id the page poolsJakub Kicinski
To give ourselves the flexibility of creating netlink commands and ability to refer to page pool instances in uAPIs create IDs for page pools. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: page_pool: factor out uninitJakub Kicinski
We'll soon (next change in the series) need a fuller unwind path in page_pool_create() so create the inverse of page_pool_init(). Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-27Merge tag 'wireless-next-2023-11-27' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.8 The first features pull request for v6.8. Not so big in number of commits but we removed quite a few ancient drivers: libertas 16-bit PCMCIA support, atmel, hostap, zd1201, orinoco, ray_cs, wl3501 and rndis_wlan. Major changes: cfg80211/mac80211 - extend support for scanning while Multi-Link Operation (MLO) connected * tag 'wireless-next-2023-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (68 commits) wifi: nl80211: Documentation update for NL80211_CMD_PORT_AUTHORIZED event wifi: mac80211: Extend support for scanning while MLO connected wifi: cfg80211: Extend support for scanning while MLO connected wifi: ieee80211: fix PV1 frame control field name rfkill: return ENOTTY on invalid ioctl MAINTAINERS: update iwlwifi maintainers wifi: rtw89: 8922a: read efuse content from physical map wifi: rtw89: 8922a: read efuse content via efuse map struct from logic map wifi: rtw89: 8852c: read RX gain offset from efuse for 6GHz channels wifi: rtw89: mac: add to access efuse for WiFi 7 chips wifi: rtw89: mac: use mac_gen pointer to access about efuse wifi: rtw89: 8922a: add 8922A basic chip info wifi: rtlwifi: drop unused const_amdpci_aspm wifi: mwifiex: mwifiex_process_sleep_confirm_resp(): remove unused priv variable wifi: rtw89: regd: update regulatory map to R65-R44 wifi: rtw89: regd: handle policy of 6 GHz according to BIOS wifi: rtw89: acpi: process 6 GHz band policy from DSM wifi: rtlwifi: simplify rtl_action_proc() and rtl_tx_agg_start() wifi: rtw89: pci: update interrupt mitigation register for 8922AE wifi: rtw89: pci: correct interrupt mitigation register for 8852CE ... ==================== Link: https://lore.kernel.org/r/20231127180056.0B48DC433C8@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27Merge branch 'selftests-tc-testing-updates-and-cleanups-for-tdc'Jakub Kicinski
Pedro Tammela says: ==================== selftests: tc-testing: updates and cleanups for tdc Address the recommendations from the previous series and cleanup some leftovers. ==================== Link: https://lore.kernel.org/r/20231124154248.315470-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27selftests: tc-testing: remove unused importPedro Tammela
Remove this leftover from the times we pre-allocated everything Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231124154248.315470-6-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27selftests: tc-testing: cleanup on Ctrl-CPedro Tammela
Cleanup net namespaces and other resources if we get a SIGINT (Ctrl-C). As user visible resources are allocated on a per test basis, it's only required to catch this condition when (possibly) running tests. So far calling post_suite is enough to free up anything that might linger. A missing keyword replacement for nsPlugin is also included. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231124154248.315470-5-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27selftests: tc-testing: prefix iproute2 functions with "ipr2"Pedro Tammela
As suggested by Simon, prefix the functions that operate on iproute2 commands in contrast with the "nl" netlink prefix. Cc: Simon Horman <horms@kernel.org> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231124154248.315470-4-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27selftests: tc-testing: remove unnecessary time.sleepPedro Tammela
This operation is redundant and it's not stabilizing nor waiting for anything. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231124154248.315470-3-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27selftests: tc-testing: remove buildebpf pluginPedro Tammela
As tdc only tests loading/deleting and anything more complicated is better left to the ebpf test suite, provide a pre-compiled version of 'action.c' and don't bother compiling it in kselftests or on the fly at all. Cc: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231124154248.315470-2-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27Merge branch 'net-phylink-improve-phy-validation'Jakub Kicinski
Russell King says: ==================== net: phylink: improve PHY validation One of the issues which has concerned me about the rate matching implenentation that we have is that phy_get_rate_matching() returns whether rate matching will be used for a particular interface, and we enquire only for one interface. Aquantia PHYs can be programmed with the rate matching and interface mode settings on a per-media speed basis using the per-speed vendor 1 global configuration registers. Thus, it is possible for the PHY to be configured to use rate matching for 10G, 5G, 2.5G with 10GBASE-R, and then SGMII for the remaining speeds. Therefore, it clearly doesn't make sense to enquire about rate matching for just one interface mode. Also, PHYs that change their interfaces are handled sub-optimally, in that we validate all the interface modes that the host supports, rather than the interface modes that the PHY will use. This patch series changes the way we validate PHYs, but in order to do so, we need to know exactly which interface modes will be used by the PHY. So that phylib can convey this information, we add "possible_interfaces" to struct phy_device. possible_interfaces is to be filled in by a phylib driver once the PHY is configured (in other words in the PHYs .config_init method) with the interface modes that it will switch between. This then allows users of phylib to know which interface modes will be used by the PHY. This allows us to solve both these issues: where possible_interfaces is provided, we can validate which ethtool link modes can be supported by looking at which interface modes that both the PHY and host support, and request rate matching information for each mode. This should improve the accuracy of the validation. Sending this out again without RFC as Jie Luo will need it for the QCA8084 changes. No changes except to add the attributations already received. Thanks! ==================== Link: https://lore.kernel.org/r/ZWCWn+uNkVLPaQhn@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27net: phylink: use the PHY's possible_interfaces if populatedRussell King (Oracle)
Some PHYs such as Aquantia, Broadcom 84881, and Marvell 88X33x0 can switch between a set of interface types depending on the negotiated media speed, or can use rate adaption for some or all of these interface types. We currently assume that these are Clause 45 PHYs that are configured not to use a specific set of interface modes, which has worked so far, but is just a work-around. In this workaround, we validate using all interfaces that the MAC supports, which can lead to extra modes being advertised that can not be supported. To properly address this, switch to using the newly introduced PHY possible_interfaces bitmap which indicates which interface modes will be used by the PHY as configured. We calculate the union of the PHY's possible interfaces and MACs supported interfaces, checking that is non-empty. If the PHY is on a SFP, we further reduce the set by those which can be used on a SFP module, again checking that is non-empty. Finally, we validate the subset of interfaces, taking account of whether rate matching will be used for each individual interface mode. This becomes independent of whether the PHY is clause 22 or clause 45. It is encouraged that all PHYs that switch interface modes or use rate matching should populate phydev->possible_interfaces. Tested-by: Luo Jie <quic_luoj@quicinc.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1r6VIV-00DDMF-Pi@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27net: phylink: split out PHY validation from phylink_bringup_phy()Russell King (Oracle)
When bringing up a PHY, we need to work out which ethtool link modes it should support and advertise. Clause 22 PHYs operate in a single interface mode, which can be easily dealt with. However, clause 45 PHYs tend to switch interface mode depending on the media. We need more flexible validation at this point, so this patch splits out that code in preparation to changing it. Tested-by: Luo Jie <quic_luoj@quicinc.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1r6VIQ-00DDM9-LK@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27net: phylink: pass PHY into phylink_validate_mask()Russell King (Oracle)
Pass the phy (if any) into phylink_validate_mask() so that we can validate each interface with its rate matching setting. Tested-by: Luo Jie <quic_luoj@quicinc.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1r6VIL-00DDM3-HJ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>