summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/freescale
AgeCommit message (Collapse)Author
2023-01-23net: enetc: stop configuring pMAC in lockstep with eMACVladimir Oltean
The MWLM bit (MAC write lock-step mode) allows register writes to the pMAC to be auto-performed whenever the corresponding eMAC register is written by the driver. This allows their configuration to remain in sync. The driver has set this bit since the initial commit, but it doesn't do anything, since the hardware feature doesn't work (and the bit has been removed from more recent versions of the documentation). The driver does attempt, more or less, to keep those MAC registers in sync by writing the same value once to e.g. ENETC_PM0_CMD_CFG (eMAC) and once to ENETC_PM1_CMD_CFG (pMAC). Because the lockstep feature doesn't work, that's what it will stick to. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: add definition for offset between eMAC and pMAC regsVladimir Oltean
This is a preliminary patch which replaces the hardcoded 0x1000 present in other PM1 (port MAC 1, aka pMAC) register definitions, which is an offset to the PM0 (port MAC 0, aka eMAC) equivalent register. This definition will be used in more places by future code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: detect frame preemption hardware capabilityVladimir Oltean
Similar to other TSN features, query the Station Interface capability register to see whether preemption is supported on this port or not. On LS1028A, preemption is available on ports 0 and 2, but not on 1 and 3. This will allow us in the future to write the pMAC registers only on the ENETC ports where a pMAC actually exists. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: build common object files into a separate moduleVladimir Oltean
The build system is complaining about the following: enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_cbdr.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_ethtool.o is added to multiple modules: fsl-enetc fsl-enetc-vf Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: fec: Use page_pool_put_full_page when freeing rx buffersWei Fang
The page_pool_release_page was used when freeing rx buffers, and this function just unmaps the page (if mapped) and does not recycle the page. So after hundreds of down/up the eth0, the system will out of memory. For more details, please refer to the following reproduce steps and bug logs. To solve this issue and refer to the doc of page pool, the page_pool_put_full_page should be used to replace page_pool_release_page. Because this API will try to recycle the page if the page refcnt equal to 1. After testing 20000 times, the issue can not be reproduced anymore (about testing 391 times the issue will occur on i.MX8MN-EVK before). Reproduce steps: Create the test script and run the script. The script content is as follows: LOOPS=20000 i=1 while [ $i -le $LOOPS ] do echo "TINFO:ENET $curface up and down test $i times" org_macaddr=$(cat /sys/class/net/eth0/address) ifconfig eth0 down ifconfig eth0 hw ether $org_macaddr up i=$(expr $i + 1) done sleep 5 if cat /sys/class/net/eth0/operstate | grep 'up';then echo "TEST PASS" else echo "TEST FAIL" fi Bug detail logs: TINFO:ENET up and down test 391 times [ 850.471205] Qualcomm Atheros AR8031/AR8033 30be0000.ethernet-1:00: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:00, irq=POLL) [ 853.535318] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 853.541694] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 870.590531] page_pool_release_retry() stalled pool shutdown 199 inflight 60 sec [ 931.006557] page_pool_release_retry() stalled pool shutdown 199 inflight 120 sec TINFO:ENET up and down test 392 times [ 991.426544] page_pool_release_retry() stalled pool shutdown 192 inflight 181 sec [ 1051.838531] page_pool_release_retry() stalled pool shutdown 170 inflight 241 sec [ 1093.751217] Qualcomm Atheros AR8031/AR8033 30be0000.ethernet-1:00: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:00, irq=POLL) [ 1096.446520] page_pool_release_retry() stalled pool shutdown 308 inflight 60 sec [ 1096.831245] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 1096.839092] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 1112.254526] page_pool_release_retry() stalled pool shutdown 103 inflight 302 sec [ 1156.862533] page_pool_release_retry() stalled pool shutdown 308 inflight 120 sec [ 1172.674516] page_pool_release_retry() stalled pool shutdown 103 inflight 362 sec [ 1217.278532] page_pool_release_retry() stalled pool shutdown 308 inflight 181 sec TINFO:ENET up and down test 393 times [ 1233.086535] page_pool_release_retry() stalled pool shutdown 103 inflight 422 sec [ 1277.698513] page_pool_release_retry() stalled pool shutdown 308 inflight 241 sec [ 1293.502525] page_pool_release_retry() stalled pool shutdown 86 inflight 483 sec [ 1338.110518] page_pool_release_retry() stalled pool shutdown 308 inflight 302 sec [ 1353.918540] page_pool_release_retry() stalled pool shutdown 32 inflight 543 sec [ 1361.179205] Qualcomm Atheros AR8031/AR8033 30be0000.ethernet-1:00: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:00, irq=POLL) [ 1364.255298] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 1364.263189] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 1371.998532] page_pool_release_retry() stalled pool shutdown 310 inflight 60 sec [ 1398.530542] page_pool_release_retry() stalled pool shutdown 308 inflight 362 sec [ 1414.334539] page_pool_release_retry() stalled pool shutdown 16 inflight 604 sec [ 1432.414520] page_pool_release_retry() stalled pool shutdown 310 inflight 120 sec [ 1458.942523] page_pool_release_retry() stalled pool shutdown 308 inflight 422 sec [ 1474.750521] page_pool_release_retry() stalled pool shutdown 16 inflight 664 sec TINFO:ENET up and down test 394 times [ 1492.830522] page_pool_release_retry() stalled pool shutdown 310 inflight 181 sec [ 1519.358519] page_pool_release_retry() stalled pool shutdown 308 inflight 483 sec [ 1535.166545] page_pool_release_retry() stalled pool shutdown 2 inflight 724 sec [ 1537.090278] eth_test2.sh invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0 [ 1537.101192] CPU: 3 PID: 2379 Comm: eth_test2.sh Tainted: G C 6.1.1+g56321e101aca #1 [ 1537.110249] Hardware name: NXP i.MX8MNano EVK board (DT) [ 1537.115561] Call trace: [ 1537.118005] dump_backtrace.part.0+0xe0/0xf0 [ 1537.122289] show_stack+0x18/0x40 [ 1537.125608] dump_stack_lvl+0x64/0x80 [ 1537.129276] dump_stack+0x18/0x34 [ 1537.132592] dump_header+0x44/0x208 [ 1537.136083] oom_kill_process+0x2b4/0x2c0 [ 1537.140097] out_of_memory+0xe4/0x594 [ 1537.143766] __alloc_pages+0xb68/0xd00 [ 1537.147521] alloc_pages+0xac/0x160 [ 1537.151013] __get_free_pages+0x14/0x40 [ 1537.154851] pgd_alloc+0x1c/0x30 [ 1537.158082] mm_init+0xf8/0x1d0 [ 1537.161228] mm_alloc+0x48/0x60 [ 1537.164368] alloc_bprm+0x7c/0x240 [ 1537.167777] do_execveat_common.isra.0+0x70/0x240 [ 1537.172486] __arm64_sys_execve+0x40/0x54 [ 1537.176502] invoke_syscall+0x48/0x114 [ 1537.180255] el0_svc_common.constprop.0+0xcc/0xec [ 1537.184964] do_el0_svc+0x2c/0xd0 [ 1537.188280] el0_svc+0x2c/0x84 [ 1537.191340] el0t_64_sync_handler+0xf4/0x120 [ 1537.195613] el0t_64_sync+0x18c/0x190 [ 1537.199334] Mem-Info: [ 1537.201620] active_anon:342 inactive_anon:10343 isolated_anon:0 [ 1537.201620] active_file:54 inactive_file:112 isolated_file:0 [ 1537.201620] unevictable:0 dirty:0 writeback:0 [ 1537.201620] slab_reclaimable:2620 slab_unreclaimable:7076 [ 1537.201620] mapped:1489 shmem:2473 pagetables:466 [ 1537.201620] sec_pagetables:0 bounce:0 [ 1537.201620] kernel_misc_reclaimable:0 [ 1537.201620] free:136672 free_pcp:96 free_cma:129241 [ 1537.240419] Node 0 active_anon:1368kB inactive_anon:41372kB active_file:216kB inactive_file:5052kB unevictable:0kB isolated(anon):0kB isolated(file):0kB s [ 1537.271422] Node 0 DMA free:541636kB boost:0kB min:30000kB low:37500kB high:45000kB reserved_highatomic:0KB active_anon:1368kB inactive_anon:41372kB actiB [ 1537.300219] lowmem_reserve[]: 0 0 0 0 [ 1537.303929] Node 0 DMA: 1015*4kB (UMEC) 743*8kB (UMEC) 417*16kB (UMEC) 235*32kB (UMEC) 116*64kB (UMEC) 25*128kB (UMEC) 4*256kB (UC) 2*512kB (UC) 0*1024kBB [ 1537.323938] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [ 1537.332708] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB [ 1537.341292] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 1537.349776] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB [ 1537.358087] 2939 total pagecache pages [ 1537.361876] 0 pages in swap cache [ 1537.365229] Free swap = 0kB [ 1537.368147] Total swap = 0kB [ 1537.371065] 516096 pages RAM [ 1537.373959] 0 pages HighMem/MovableOnly [ 1537.377834] 17302 pages reserved [ 1537.381103] 163840 pages cma reserved [ 1537.384809] 0 pages hwpoisoned [ 1537.387902] Tasks state (memory values in pages): [ 1537.392652] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [ 1537.401356] [ 201] 993 201 1130 72 45056 0 0 rpcbind [ 1537.409772] [ 202] 0 202 4529 1640 77824 0 -250 systemd-journal [ 1537.418861] [ 222] 0 222 4691 801 69632 0 -1000 systemd-udevd [ 1537.427787] [ 248] 994 248 20914 130 65536 0 0 systemd-timesyn [ 1537.436884] [ 497] 0 497 620 31 49152 0 0 atd [ 1537.444938] [ 500] 0 500 854 77 53248 0 0 crond [ 1537.453165] [ 503] 997 503 1470 160 49152 0 -900 dbus-daemon [ 1537.461908] [ 505] 0 505 633 24 40960 0 0 firmwared [ 1537.470491] [ 513] 0 513 2507 180 61440 0 0 ofonod [ 1537.478800] [ 514] 990 514 69640 137 81920 0 0 parsec [ 1537.487120] [ 533] 0 533 599 39 40960 0 0 syslogd [ 1537.495518] [ 534] 0 534 4546 148 65536 0 0 systemd-logind [ 1537.504560] [ 535] 0 535 690 24 45056 0 0 tee-supplicant [ 1537.513564] [ 540] 996 540 2769 168 61440 0 0 systemd-network [ 1537.522680] [ 566] 0 566 3878 228 77824 0 0 connmand [ 1537.531168] [ 645] 998 645 1538 133 57344 0 0 avahi-daemon [ 1537.540004] [ 646] 998 646 1461 64 57344 0 0 avahi-daemon [ 1537.548846] [ 648] 992 648 781 41 45056 0 0 rpc.statd [ 1537.557415] [ 650] 64371 650 590 23 45056 0 0 ninfod [ 1537.565754] [ 653] 61563 653 555 24 45056 0 0 rdisc [ 1537.573971] [ 655] 0 655 374569 2999 290816 0 -999 containerd [ 1537.582621] [ 658] 0 658 1311 20 49152 0 0 agetty [ 1537.590922] [ 663] 0 663 1529 97 49152 0 0 login [ 1537.599138] [ 666] 0 666 3430 202 69632 0 0 wpa_supplicant [ 1537.608147] [ 667] 0 667 2344 96 61440 0 0 systemd-userdbd [ 1537.617240] [ 677] 0 677 2964 314 65536 0 100 systemd [ 1537.625651] [ 679] 0 679 3720 646 73728 0 100 (sd-pam) [ 1537.634138] [ 687] 0 687 1289 403 45056 0 0 sh [ 1537.642108] [ 789] 0 789 970 93 45056 0 0 eth_test2.sh [ 1537.650955] [ 2355] 0 2355 2346 94 61440 0 0 systemd-userwor [ 1537.660046] [ 2356] 0 2356 2346 94 61440 0 0 systemd-userwor [ 1537.669137] [ 2358] 0 2358 2346 95 57344 0 0 systemd-userwor [ 1537.678258] [ 2379] 0 2379 970 93 45056 0 0 eth_test2.sh [ 1537.687098] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/user@0.service,tas0 [ 1537.703009] Out of memory: Killed process 679 ((sd-pam)) total-vm:14880kB, anon-rss:2584kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:72kB oom_score_ad0 [ 1553.246526] page_pool_release_retry() stalled pool shutdown 310 inflight 241 sec Fixes: 95698ff6177b ("net: fec: using page pool to manage RX buffers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: shenwei wang <Shenwei.wang@nxp.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ipa/ipa_interrupt.c drivers/net/ipa/ipa_interrupt.h 9ec9b2a30853 ("net: ipa: disable ipa interrupt during suspend") 8e461e1f092b ("net: ipa: introduce ipa_interrupt_enable()") d50ed3558719 ("net: ipa: enable IPA interrupt handlers separate from registration") https://lore.kernel.org/all/20230119114125.5182c7ab@canb.auug.org.au/ https://lore.kernel.org/all/79e46152-8043-a512-79d9-c3b905462774@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-19net: phy: Remove probe_capabilitiesAndrew Lunn
Deciding if to probe of PHYs using C45 is now determine by if the bus provides the C45 read method. This makes probe_capabilities redundant so remove it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-18net: enetc: prioritize ability to go down over packet processingVladimir Oltean
napi_synchronize() from enetc_stop() waits until the softirq has finished execution and no longer wants to be rescheduled. However under high traffic load, this will never happen, and the interface can never be closed. The problem is the fact that the NAPI poll routine is written to update the consumer index which makes the device want to put more buffers in the RX ring, which restarts the madness again. Browsing around, it seems that some drivers like i40e keep a bit (__I40E_VSI_DOWN) which they use as communication between the control path and the data path. But that isn't my first choice, because complications ensue - since the enetc hardirq may trigger while we are in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks it, but enetc_poll() never unmasks it. To prevent a stall in that case, one would need to schedule all NAPI instances when ENETC_DOWN gets cleared, to process what's pending. I find it more desirable for the control path - enetc_stop() - to just quiesce the RX ring and let the softirq finish what remains there, without any explicit communication, just by making hardware not provide any more packets. This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]). I can't seem to find an exact definition of what this bit does, but when the RX ring is disabled, the port seems to no longer update the producer index, and not react to software updates of the consumer index. In fact, the RBaMR[EN] bit is already toggled by the driver, but too late for what we want: enetc_close() -> enetc_stop() -> napi_synchronize() -> enetc_clear_bdrs() -> enetc_clear_rxbdr() The enetc_clear_bdrs() function contains not only logic to disable the RX and TX rings, but also logic to wait for the TX ring stop being busy. We split enetc_clear_bdrs() into enetc_disable_bdrs() and enetc_wait_bdrs(). One needs to run before napi_synchronize() and the other after (NAPI also processes TX completions, so we maximize our chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is stuck for some reason, ofc). We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call this from the mirror position in enetc_start() compared to enetc_stop(), i.e. right before netif_tx_start_all_queues(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: set up XDP program under enetc_reconfigure()Vladimir Oltean
Offloading a BPF program to the RX path of the driver suffers from the same problems as the PTP reconfiguration - improper error checking can leave the driver in an invalid state, and the link on the PHY is lost. Reuse the enetc_reconfigure() procedure, but here, we need to run some code in the middle of the ring reconfiguration procedure - while the interface is still down. Introduce a callback which makes that possible. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: rename "xdp" and "dev" in enetc_setup_bpf()Vladimir Oltean
Follow the convention from this driver, which is to name "struct net_device *" as "ndev", and the convention from other drivers, to name "struct netdev_bpf *" as "bpf". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: implement ring reconfiguration procedure for PTP RX timestampingVladimir Oltean
The crude enetc_stop() -> enetc_open() mechanism suffers from 2 problems: 1. improper error checking 2. it involves phylink_stop() -> phylink_start() which loses the link Right now, the driver is prepared to offer a better alternative: a ring reconfiguration procedure which takes the RX BD size (normal or extended) as argument. It allocates new resources (failing if that fails), stops the traffic, and assigns the new resources to the rings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: move phylink_start/stop out of enetc_start/stopVladimir Oltean
We want to introduce a fast interface reconfiguration procedure, which involves temporarily stopping the rings. But we want enetc_start() and enetc_stop() to not restart PHY autoneg, because that can take a few seconds until it completes again. So we need part of enetc_start() and enetc_stop(), but not all of them. Move phylink_start() right next to phylink_of_phy_connect(), and phylink_stop() right next to phylink_disconnect_phy(), both still in ndo_open() and ndo_stop(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: split ring resource allocation from assignmentVladimir Oltean
We have a few instances in the enetc driver where the ring resources (BD ring iomem, software BD ring, software TSO headers, basically everything except RX buffers) need to be reallocated. For example, when RX timestamping is enabled, the RX BD format changes to an extended one (twice as large). Currently, this is done using a simplistic enetc_close() -> enetc_open() procedure. But this is quite crude, since it also invokes phylink_stop() -> phylink_start(), the link is lost, and a few seconds need to pass for autoneg to complete again. In fact it's bad also due to the improper (yolo) error checking. In case we fail to allocate new resources, we've already freed the old ones, so the interface is more or less stuck. To avoid that, we need a system where reconfiguration is possible in a way in which resources are allocated upfront. This means that there will be a higher memory usage temporarily, but the assignment of resources to rings can be done when both the old and new resources are still available. Introduce a struct enetc_bdr_resource which holds the resources for a ring, be it RX or TX. This structure duplicates a lot of fields from struct enetc_bdr (and access to the same fields in the ring structure was left duplicated, to not change cache characteristics in the fast path). When enetc_alloc_tx_resources() runs, it returns an array of resource elements (one per TX ring), in addition to the existing priv->tx_res. To populate priv->tx_res with that array, one must call enetc_assign_tx_resources(), and this also frees the old resources. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: bring "bool extended" to top-level in enetc_open()Vladimir Oltean
Extended RX buffer descriptors are necessary if they carry RX timestamps, which will be true when PTP timestamping is enabled. Right now, the rx_ring->ext_en is set from the function that allocates ring resources (enetc_alloc_rx_resources() -> enetc_alloc_rxbdr()), and also used later, in enetc_setup_rxbdr(). It is also used in the enetc_rxbd() and enetc_rxbd_next() fast path helpers. We want to decouple resource allocation from BD ring setup, but both procedures depend on BD size (extended or not). Move the "extended" boolean to enetc_open() and pass it both to the RX allocation procedure as well as to the RX ring setup procedure. The latter will set rx_ring->ext_en from now on. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: drop redundant enetc_free_tx_frame() call from enetc_free_txbdr()Vladimir Oltean
The call path in enetc_close() is: enetc_close() -> enetc_free_rxtx_rings() -> enetc_free_tx_ring() -> enetc_free_tx_frame() -> enetc_free_tx_resources() -> enetc_free_txbdr() -> enetc_free_tx_frame() The enetc_free_tx_frame() function is written such that the second call exits without doing anything, but nonetheless, it is completely redundant. Delete it. This makes the TX teardown path more similar to the RX one, where rx_swbd freeing is done in enetc_free_rx_ring(), not in enetc_free_rxbdr(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: rx_swbd and tx_swbd are never NULL in enetc_free_rxtx_rings()Vladimir Oltean
The call path in enetc_close() is: enetc_close() -> enetc_free_rxtx_rings() -> enetc_free_rx_ring() -> tests whether rx_ring->rx_swbd is NULL -> enetc_free_tx_ring() -> tests whether tx_ring->tx_swbd is NULL -> enetc_free_rx_resources() -> enetc_free_rxbdr() -> sets rxr->rx_swbd to NULL -> enetc_free_tx_resources() -> enetc_free_txbdr() -> setx txr->tx_swbd to NULL From the above, it is clear that due to the function ordering, the checks for NULL are redundant, since the software buffer descriptor arrays have not yet been set to NULL. Drop these checks. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: create enetc_dma_free_bdr()Vladimir Oltean
This is a refactoring change which introduces the opposite function of enetc_dma_alloc_bdr(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: set up RX ring indices from enetc_setup_rxbdr()Vladimir Oltean
There is only one place which needs to set up indices in the RX ring. Be consistent with what was done in the TX path and do this in enetc_setup_rxbdr(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: set next_to_clean/next_to_use just from enetc_setup_txbdr()Vladimir Oltean
enetc_alloc_txbdr() deals with allocating resources necessary for a TX ring to work (the array of software BDs and the array of TSO headers). The next_to_clean and next_to_use pointers are overwritten with proper values which are read from hardware here: enetc_open -> enetc_alloc_tx_resources -> enetc_alloc_txbdr -> set to zero -> enetc_setup_bdrs -> enetc_setup_txbdr -> read from hardware So their initialization with zeroes is pointless and confusing. Delete it. Consequently, since enetc_setup_txbdr() has no opposite cleanup function, also delete the resetting of these indices from enetc_free_tx_ring(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-13enetc: Separate C22 and C45 transactionsAndrew Lunn
The enetc MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. This driver is shared with the Felix DSA switch, so update that at the same time. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-13net: enetc: avoid deadlock in enetc_tx_onestep_tstamp()Vladimir Oltean
This lockdep splat says it better than I could: ================================ WARNING: inconsistent lock state 6.2.0-rc2-07010-ga9b9500ffaac-dirty #967 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. kworker/1:3/179 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff3ec4036ce098 (_xmit_ETHER#2){+.?.}-{3:3}, at: netif_freeze_queues+0x5c/0xc0 {IN-SOFTIRQ-W} state was registered at: _raw_spin_lock+0x5c/0xc0 sch_direct_xmit+0x148/0x37c __dev_queue_xmit+0x528/0x111c ip6_finish_output2+0x5ec/0xb7c ip6_finish_output+0x240/0x3f0 ip6_output+0x78/0x360 ndisc_send_skb+0x33c/0x85c ndisc_send_rs+0x54/0x12c addrconf_rs_timer+0x154/0x260 call_timer_fn+0xb8/0x3a0 __run_timers.part.0+0x214/0x26c run_timer_softirq+0x3c/0x74 __do_softirq+0x14c/0x5d8 ____do_softirq+0x10/0x20 call_on_irq_stack+0x2c/0x5c do_softirq_own_stack+0x1c/0x30 __irq_exit_rcu+0x168/0x1a0 irq_exit_rcu+0x10/0x40 el1_interrupt+0x38/0x64 irq event stamp: 7825 hardirqs last enabled at (7825): [<ffffdf1f7200cae4>] exit_to_kernel_mode+0x34/0x130 hardirqs last disabled at (7823): [<ffffdf1f708105f0>] __do_softirq+0x550/0x5d8 softirqs last enabled at (7824): [<ffffdf1f7081050c>] __do_softirq+0x46c/0x5d8 softirqs last disabled at (7811): [<ffffdf1f708166e0>] ____do_softirq+0x10/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(_xmit_ETHER#2); <Interrupt> lock(_xmit_ETHER#2); *** DEADLOCK *** 3 locks held by kworker/1:3/179: #0: ffff3ec400004748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0 #1: ffff80000a0bbdc8 ((work_completion)(&priv->tx_onestep_tstamp)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0 #2: ffff3ec4036cd438 (&dev->tx_global_lock){+.+.}-{3:3}, at: netif_tx_lock+0x1c/0x34 Workqueue: events enetc_tx_onestep_tstamp Call trace: print_usage_bug.part.0+0x208/0x22c mark_lock+0x7f0/0x8b0 __lock_acquire+0x7c4/0x1ce0 lock_acquire.part.0+0xe0/0x220 lock_acquire+0x68/0x84 _raw_spin_lock+0x5c/0xc0 netif_freeze_queues+0x5c/0xc0 netif_tx_lock+0x24/0x34 enetc_tx_onestep_tstamp+0x20/0x100 process_one_work+0x28c/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x11c but I'll say it anyway: the enetc_tx_onestep_tstamp() work item runs in process context, therefore with softirqs enabled (i.o.w., it can be interrupted by a softirq). If we hold the netif_tx_lock() when there is an interrupt, and the NET_TX softirq then gets scheduled, this will take the netif_tx_lock() a second time and deadlock the kernel. To solve this, use netif_tx_lock_bh(), which blocks softirqs from running. Fixes: 7294380c5211 ("enetc: support PTP Sync packet one-step timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230112105440.1786799-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-12net: remove redundant config PCI dependency for some network driver configsLukas Bulwahn
While reviewing dependencies in some Kconfig files, I noticed the redundant dependency "depends on PCI && PCI_MSI". The config PCI_MSI has always, since its introduction, been dependent on the config PCI. So, it is sufficient to just depend on PCI_MSI, and know that the dependency on PCI is implicitly implied. Reduce the dependencies of some network driver configs. No functional change and effective change of Kconfig dependendencies. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Dimitris Michailidis <dmichail@fungible.com> Link: https://lore.kernel.org/r/20230111125855.19020-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10net: fec: Separate C22 and C45 transactionsAndrew Lunn
The fec MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10net: mdio: xgmac_mdio: Separate C22 and C45 transactionsAndrew Lunn
The xgmac MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. While at it, remove the misleading comment. According to Vladimir Oltean: - miimcom is a register accessed by fsl_pq_mdio.c, not by xgmac_mdio.c - "device dev" doesn't really refer to anything (maybe "dev_addr"). - I don't understand what is meant by the comment "All PHY configuration has to be done through the TSEC1 MIIM regs". Or rather said, I think I understand, but it is irrelevant to the driver for 2 reasons: * TSEC devices use the fsl_pq_mdio.c driver, not this one * It doesn't matter to this driver whose TSEC registers are used for MDIO access. The driver just works with the registers it's given, which is a concern for the device tree. - barring the above, the rest just describes the MDIO bus API, which is superfluous Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-05net: ethernet: enetc: do not always access skb_shared_info in the XDP pathLorenzo Bianconi
Move XDP skb_shared_info structure initialization in from enetc_map_rx_buff_to_xdp() to enetc_add_rx_buff_to_xdp() and do not always access skb_shared_info in the xdp_buff/xdp_frame since it is located in a different cacheline with respect to hard_start and data xdp pointers. Rely on XDP_FLAGS_HAS_FRAGS flag to check if it really necessary to access non-linear part of the xdp_buff/xdp_frame. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-05net: ethernet: enetc: get rid of xdp_redirect_sg counterLorenzo Bianconi
Remove xdp_redirect_sg counter and the related ethtool entry since it is no longer used. Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-05net: ethernet: enetc: unlock XDP_REDIRECT for XDP non-linear buffersLorenzo Bianconi
Even if full XDP_REDIRECT is not supported yet for non-linear XDP buffers since we allow redirecting just into CPUMAPs, unlock XDP_REDIRECT for S/G XDP buffer and rely on XDP stack to properly take care of the frames. Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-03net: dpaa: Fix dtsec check for PCS availabilitySean Anderson
We want to fail if the PCS is not available, not if it is available. Fix this condition. Fixes: 5d93cfcf7360 ("net: dpaa: Convert to phylink") Reported-by: Christian Zigotzky <info@xenosoft.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-30net: ethernet: freescale: enetc: Drop empty platform remove functionUwe Kleine-König
A remove callback just returning 0 is equivalent to no remove callback at all. So drop the useless function. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-20net: fec: check the return value of build_skb()Wei Fang
The build_skb might return a null pointer but there is no check on the return value in the fec_enet_rx_queue(). So a null pointer dereference might occur. To avoid this, we check the return value of build_skb. If the return value is a null pointer, the driver will recycle the page and update the statistic of ndev. Then jump to rx_processing_done to clear the status flags of the BD so that the hardware can recycle the BD. Fixes: 95698ff6177b ("net: fec: using page pool to manage RX buffers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Shenwei Wang <Shenwei.wang@nxp.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20221219022755.1047573-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-14net: enetc: avoid buffer leaks on xdp_do_redirect() failureVladimir Oltean
Before enetc_clean_rx_ring_xdp() calls xdp_do_redirect(), each software BD in the RX ring between index orig_i and i can have one of 2 refcount values on its page. We are the owner of the current buffer that is being processed, so the refcount will be at least 1. If the current owner of the buffer at the diametrically opposed index in the RX ring (i.o.w, the other half of this page) has not yet called kfree(), this page's refcount could even be 2. enetc_page_reusable() in enetc_flip_rx_buff() tests for the page refcount against 1, and [ if it's 2 ] does not attempt to reuse it. But if enetc_flip_rx_buff() is put after the xdp_do_redirect() call, the page refcount can have one of 3 values. It can also be 0, if there is no owner of the other page half, and xdp_do_redirect() for this buffer ran so far that it triggered a flush of the devmap/cpumap bulk queue, and the consumers of those bulk queues also freed the buffer, all by the time xdp_do_redirect() returns the execution back to enetc. This is the reason why enetc_flip_rx_buff() is called before xdp_do_redirect(), but there is a big flaw with that reasoning: enetc_flip_rx_buff() will set rx_swbd->page = NULL on both sides of the enetc_page_reusable() branch, and if xdp_do_redirect() returns an error, we call enetc_xdp_free(), which does not deal gracefully with that. In fact, what happens is quite special. The page refcounts start as 1. enetc_flip_rx_buff() figures they're reusable, transfers these rx_swbd->page pointers to a different rx_swbd in enetc_reuse_page(), and bumps the refcount to 2. When xdp_do_redirect() later returns an error, we call the no-op enetc_xdp_free(), but we still haven't lost the reference to that page. A copy of it is still at rx_ring->next_to_alloc, but that has refcount 2 (and there are no concurrent owners of it in flight, to drop the refcount). What really kills the system is when we'll flip the rx_swbd->page the second time around. With an updated refcount of 2, the page will not be reusable and we'll really leak it. Then enetc_new_page() will have to allocate more pages, which will then eventually leak again on further errors from xdp_do_redirect(). The problem, summarized, is that we zeroize rx_swbd->page before we're completely done with it, and this makes it impossible for the error path to do something with it. Since the packet is potentially multi-buffer and therefore the rx_swbd->page is potentially an array, manual passing of the old pointers between enetc_flip_rx_buff() and enetc_xdp_free() is a bit difficult. For the sake of going with a simple solution, we accept the possibility of racing with xdp_do_redirect(), and we move the flip procedure to execute only on the redirect success path. By racing, I mean that the page may be deemed as not reusable by enetc (having a refcount of 0), but there will be no leak in that case, either. Once we accept that, we have something better to do with buffers on XDP_REDIRECT failure. Since we haven't performed half-page flipping yet, we won't, either (and this way, we can avoid enetc_xdp_free() completely, which gives the entire page to the slab allocator). Instead, we'll call enetc_xdp_drop(), which will recycle this half of the buffer back to the RX ring. Fixes: 9d2b68cc108d ("net: enetc: add support for XDP_REDIRECT") Suggested-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20221213001908.2347046-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and ↵Yuan Can
dpaa2_switch_acl_entry_remove() The cmd_buff needs to be freed when error happened in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove(). Fixes: 1110318d83e8 ("dpaa2-switch: add tc flower hardware offload on ingress traffic") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221205061515.115012-1-yuancan@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-06net: fec: properly guard irq coalesce setupRasmus Villemoes
Prior to the Fixes: commit, the initialization code went through the same fec_enet_set_coalesce() function as used by ethtool, and that function correctly checks whether the current variant has support for irq coalescing. Now that the initialization code instead calls fec_enet_itr_coal_set() directly, that call needs to be guarded by a check for the FEC_QUIRK_HAS_COALESCE bit. Fixes: df727d4547de (net: fec: don't reset irq coalesce settings to defaults on "ip link up") Reported-by: Greg Ungerer <gregungerer@westnet.com.au> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221205204604.869853-1-linux@rasmusvillemoes.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01net: dpaa2-mac: move rtnl_lock() only around phylink_{,dis}connect_phy()Vladimir Oltean
After the introduction of a private mac_lock that serializes access to priv->mac (and port_priv->mac in the switch), the only remaining purpose of rtnl_lock() is to satisfy the locking requirements of phylink_fwnode_phy_connect() and phylink_disconnect_phy(). But the functions these live in, dpaa2_mac_connect() and dpaa2_mac_disconnect(), have contradictory locking requirements. While phylink_fwnode_phy_connect() wants rtnl_lock() to be held, phylink_create() wants it to not be held. Move the rtnl_lock() from top-level (in the dpaa2-eth and dpaa2-switch drivers) to only surround the phylink calls that require it, in the dpaa2-mac library code. This is possible because dpaa2_mac_connect() and dpaa2_mac_disconnect() run unlocked, and there isn't any danger of an AB/BA deadlock between the rtnl_mutex and other private locks. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-switch: serialize changes to priv->mac with a mutexVladimir Oltean
The dpaa2-switch driver uses a DPMAC in the same way as the dpaa2-eth driver, so we need to duplicate the locking solution established by the previous change to the switch driver as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-eth: serialize changes to priv->mac with a mutexVladimir Oltean
The dpaa2 architecture permits dynamic connections between objects on the fsl-mc bus, specifically between a DPNI object (represented by a struct net_device) and a DPMAC object (represented by a struct phylink). The DPNI driver is notified when those connections are created/broken through the dpni_irq0_handler_thread() method. To ensure that ethtool operations, as well as netdev up/down operations serialize with the connection/disconnection of the DPNI with a DPMAC, dpni_irq0_handler_thread() takes the rtnl_lock() to block those other operations from taking place. There is code called by dpaa2_mac_connect() which wants to acquire the rtnl_mutex once again, see phylink_create() -> phylink_register_sfp() -> sfp_bus_add_upstream() -> rtnl_lock(). So the strategy doesn't quite work out, even though it's fairly simple. Create a different strategy, where all code paths in the dpaa2-eth driver access priv->mac only while they are holding priv->mac_lock. The phylink instance is not created or connected to the PHY under the priv->mac_lock, but only assigned to priv->mac then. This will eliminate the reliance on the rtnl_mutex. Add lockdep annotations and put comments where holding the lock is not necessary, and priv->mac can be dereferenced freely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-eth: connect to MAC before requesting the "endpoint changed" IRQVladimir Oltean
dpaa2_eth_connect_mac() is called both from dpaa2_eth_probe() and from dpni_irq0_handler_thread(). It could happen that the DPNI gets connected to a DPMAC on the fsl-mc bus exactly during probe, as soon as the "endpoint change" interrupt is requested in dpaa2_eth_setup_irqs(). This will cause the dpni_irq0_handler_thread() to register a phylink instance for that DPMAC. Then, the probing function will also try to register a phylink instance for the same DPMAC, operation which should fail (and this will fail the probing of the driver). Reorder dpaa2_eth_setup_irqs() and dpaa2_eth_connect_mac(), such that dpni_irq0_handler_thread() never races with the DPMAC-related portion of the probing path. Also reorder dpaa2_eth_disconnect_mac() to be in the mirror position of dpaa2_eth_connect_mac() in the teardown path. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-switch replace direct MAC access with dpaa2_switch_port_has_mac()Vladimir Oltean
The helper function will gain a lockdep annotation in a future patch. Make sure to benefit from it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2: publish MAC stringset to ethtool -S even if MAC is missingVladimir Oltean
DPNIs and DPSW objects can connect and disconnect at runtime from DPMAC objects on the same fsl-mc bus. The DPMAC object also holds "ethtool -S" unstructured counters. Those counters are only shown for the entity owning the netdev (DPNI, DPSW) if it's connected to a DPMAC. The ethtool stringset code path is split into multiple callbacks, but currently, connecting and disconnecting the DPMAC takes the rtnl_lock(). This blocks the entire ethtool code path from running, see ethnl_default_doit() -> rtnl_lock() -> ops->prepare_data() -> strset_prepare_data(). This is going to be a problem if we are going to no longer require rtnl_lock() when connecting/disconnecting the DPMAC, because the DPMAC could appear between ops->get_sset_count() and ops->get_strings(). If it appears out of the blue, we will provide a stringset into an array that was dimensioned thinking the DPMAC wouldn't be there => array accessed out of bounds. There isn't really a good way to work around that, and I don't want to put too much pressure on the ethtool framework by playing locking games. Just make the DPMAC counters be always available. They'll be zeroes if the DPNI or DPSW isn't connected to a DPMAC. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect() callVladimir Oltean
The dpaa2-switch has the exact same locking requirements when connected to a DPMAC, so it needs port_priv->mac to always point either to NULL, or to a DPMAC with a fully initialized phylink instance. Make the same preparatory change in the dpaa2-switch driver as in the dpaa2-eth one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-eth: assign priv->mac after dpaa2_mac_connect() callVladimir Oltean
There are 2 requirements for correct code: - Any time the driver accesses the priv->mac pointer at runtime, it either holds NULL to indicate a DPNI-DPNI connection (or unconnected DPNI), or a struct dpaa2_mac whose phylink instance was fully initialized (created and connected to the PHY). No changes are made to priv->mac while it is being used. Currently, rtnl_lock() watches over the call to dpaa2_eth_connect_mac(), so it serves the purpose of serializing this with all readers of priv->mac. - dpaa2_mac_connect() should run unlocked, because inside it are 2 phylink calls with incompatible locking requirements: phylink_create() requires that the rtnl_mutex isn't held, and phylink_fwnode_phy_connect() requires that the rtnl_mutex is held. The only way to solve those contradictory requirements is to let dpaa2_mac_connect() take rtnl_lock() when it needs to. To solve both requirements, we need to identify the writer side of the priv->mac pointer, which can be wrapped in a mutex private to the driver in a future patch. The dpaa2_mac_connect() cannot be part of the writer side critical section, because of an AB/BA deadlock with rtnl_lock(). So the strategy needs to be that where we prepare the DPMAC by calling dpaa2_mac_connect(), and only make priv->mac point to it once it's fully prepared. This ensures that the writer side critical section has the absolute minimum surface it can. The reverse strategy is adopted in the dpaa2_eth_disconnect_mac() code path. This makes sure that priv->mac is NULL when we start tearing down the DPMAC that we disconnected from, and concurrent code will simply not see it. No locking changes in this patch (concurrent code is still blocked by the rtnl_mutex). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()Vladimir Oltean
dpaa2_mac_disconnect() will only be called with a NULL mac->phylink if dpaa2_mac_connect() failed, or was never called. The callers are these: dpaa2_eth_disconnect_mac(): if (dpaa2_eth_is_type_phy(priv)) dpaa2_mac_disconnect(priv->mac); dpaa2_switch_port_disconnect_mac(): if (dpaa2_switch_port_is_type_phy(port_priv)) dpaa2_mac_disconnect(port_priv->mac); priv->mac can be NULL, but in that case, dpaa2_eth_is_type_phy() returns false, and dpaa2_mac_disconnect() is never called. Similar for dpaa2-switch. When priv->mac is non-NULL, it means that dpaa2_mac_connect() returned zero (success), and therefore, priv->mac->phylink is also a valid pointer. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-mac: absorb phylink_start() call into dpaa2_mac_start()Vladimir Oltean
The phylink handling is intended to be hidden inside the dpaa2_mac object. Move the phylink_start() call into dpaa2_mac_start(), and phylink_stop() into dpaa2_mac_stop(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2: replace dpaa2_mac_is_type_fixed() with dpaa2_mac_is_type_phy()Vladimir Oltean
dpaa2_mac_is_type_fixed() is a header with no implementation and no callers, which is referenced from the documentation though. It can be deleted. On the other hand, it would be useful to reuse the code between dpaa2_eth_is_type_phy() and dpaa2_switch_port_is_type_phy(). That common code should be called dpaa2_mac_is_type_phy(), so let's create that. The removal and the addition are merged into the same patch because, in fact, is_type_phy() is the logical opposite of is_type_fixed(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-eth: don't use -ENOTSUPP error codeVladimir Oltean
dpaa2_eth_setup_dpni() is called from the probe path and dpaa2_eth_set_link_ksettings() is propagated to user space. include/linux/errno.h says that ENOTSUPP is "Defined for the NFSv3 protocol". Conventional wisdom has it to not use it in networking drivers. Replace it with -EOPNOTSUPP. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-30net: devlink: let the core report the driver name instead of the driversVincent Mailhol
The driver name is available in device_driver::name. Right now, drivers still have to report this piece of information themselves in their devlink_ops::info_get callback function. In order to factorize code, make devlink_nl_info_fill() add the driver name attribute. Now that the core sets the driver name attribute, drivers are not supposed to call devlink_info_driver_name_put() anymore. Remove devlink_info_driver_name_put() and clean-up all the drivers using this function in their callback. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Tested-by: Ido Schimmel <idosch@nvidia.com> # mlxsw Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
tools/lib/bpf/ringbuf.c 927cbb478adf ("libbpf: Handle size overflow for ringbuf mmap") b486d19a0ab0 ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-25net: fec: don't reset irq coalesce settings to defaults on "ip link up"Rasmus Villemoes
Currently, when a FEC device is brought up, the irq coalesce settings are reset to their default values (1000us, 200 frames). That's unexpected, and breaks for example use of an appropriate .link file to make systemd-udev apply the desired settings (https://www.freedesktop.org/software/systemd/man/systemd.link.html), or any other method that would do a one-time setup during early boot. Refactor the code so that fec_restart() instead uses fec_enet_itr_coal_set(), which simply applies the settings that are stored in the private data, and initialize that private data with the default values. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-23net: enetc: preserve TX ring priority across reconfigurationVladimir Oltean
In the blamed commit, a rudimentary reallocation procedure for RX buffer descriptors was implemented, for the situation when their format changes between normal (no PTP) and extended (PTP). enetc_hwtstamp_set() calls enetc_close() and enetc_open() in a sequence, and this sequence loses information which was previously configured in the TX BDR Mode Register, specifically via the enetc_set_bdr_prio() call. The TX ring priority is configured by tc-mqprio and tc-taprio, and affects important things for TSN such as the TX time of packets. The issue manifests itself most visibly by the fact that isochron --txtime reports premature packet transmissions when PTP is first enabled on an enetc interface. Save the TX ring priority in a new field in struct enetc_bdr (occupies a 2 byte hole on arm64) in order to make this survive a ring reconfiguration. Fixes: 434cebabd3a2 ("enetc: Add dynamic allocation of extended Rx BD rings") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20221122130936.1704151-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>