Age  Commit message  Author
2023-01-11  bnxt: make sure we return pages to the pool  Jakub Kicinski
Before the commit under Fixes the page would have been released from the pool before the napi_alloc_skb() call, so normal page freeing was fine (released page == no longer in the pool). After the change we just mark the page for recycling so it's still in the pool if the skb alloc fails, we need to recycle. Same commit added the same bug in the new bnxt_rx_multi_page_skb(). Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff") Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Link: https://lore.kernel.org/r/20230111042547.987749-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
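The failure path described above can be modelled in userspace. This is only a rough sketch: `toy_pool`, `toy_page` and `toy_rx_build_skb()` are hypothetical stand-ins, not the bnxt driver or the kernel page_pool API. It shows that a page still accounted to the pool has to be handed back to the pool when the skb allocation fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a page pool and an RX page; not the kernel API. */
struct toy_pool {
	int in_flight;	/* pages handed out and still accounted to the pool */
	int recycled;	/* pages returned through the pool */
};

struct toy_page {
	struct toy_pool *pool;
};

static void toy_pool_get(struct toy_pool *pool, struct toy_page *page)
{
	page->pool = pool;
	pool->in_flight++;
}

/* Give up a page that is still owned by the pool. */
static void toy_pool_recycle(struct toy_page *page)
{
	assert(page->pool != NULL);
	page->pool->in_flight--;
	page->pool->recycled++;
	page->pool = NULL;
}

/*
 * Hypothetical RX step: wrap a pool page in an skb.
 * skb_alloc_ok simulates the outcome of the skb allocation.
 * Returns true if the packet would be delivered.
 */
static bool toy_rx_build_skb(struct toy_page *page, bool skb_alloc_ok)
{
	if (!skb_alloc_ok) {
		/* The page never left the pool, so it must go back to it. */
		toy_pool_recycle(page);
		return false;
	}
	/* On success the skb takes over the page, marked for recycling. */
	return true;
}
```

In this model a plain free on the failure path would leave `in_flight` stuck at 1, which is the toy equivalent of the leak the commit fixes.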
2023-01-11  net: hns3: fix wrong use of rss size during VF rss config  Jie Wang
Currently, the old rss size is used to get the current tc mode. As a result, the rss size is updated, but the tc mode is still configured based on the old rss size. Fix this by using the new rss size in both places.

Fixes: 93969dc14fcd ("net: hns3: refactor VF rss init APIs with new common rss init APIs")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230110115359.10163-1-lanhao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-11  r8169: disable ASPM in case of tx timeout  Heiner Kallweit
There are still single reports of systems where ASPM incompatibilities cause tx timeouts. It's not clear whom to blame, so let's disable ASPM in case of a tx timeout. v2: - add one-time warning for informing the user Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/92369a92-dc32-4529-0509-11459ba0e391@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-11  Merge tag 'perf-tools-fixes-for-v6.2-2-2023-01-11' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Make 'perf kmem' cope with the removal of some kmem:kmem_cache_alloc_node and kmem:kmalloc_node in the 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") commit, making sure it works with Linux >= 6.2 as well as with older kernels where those tracepoints are present.

 - Also make it handle the new "node" kmem:kmalloc and kmem:kmem_cache_alloc tracepoint field introduced in that same commit.

 - Fix hardware tracing PMU address filter duplicate symbol selection, that was preventing to match with static functions with the same name present in different object files.

 - Fix regression on what linux/types.h file gets used to build the "BPF prologue" 'perf test' entry, the system one lacks the fmode_t definition used in this test, so provide that type in the test itself.

 - Avoid build breakage with libbpf < 0.8.0 + LIBBPF_DYNAMIC=1. If the user asks for linking with the libbpf package provided by the distro, then it has to be >= 0.8.0. Using the libbpf supplied with the kernel would be a fallback in that case.

 - Fix the build when libbpf isn't available or explicitly disabled via NO_LIBBPF=1.

 - Don't try to install libtraceevent plugins as its not anymore in the kernel sources and will thus always fail.

* tag 'perf-tools-fixes-for-v6.2-2-2023-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf auxtrace: Fix address filter duplicate symbol selection
  perf bpf: Avoid build breakage with libbpf < 0.8.0 + LIBBPF_DYNAMIC=1
  perf build: Fix build error when NO_LIBBPF=1
  perf tools: Don't install libtraceevent plugins as its not anymore in the kernel sources
  perf kmem: Support field "node" in evsel__process_alloc_event() coping with recent tracepoint restructuring
  perf kmem: Support legacy tracepoints
  perf build: Properly guard libbpf includes
  perf tests bpf prologue: Fix bpf-script-test-prologue test compile issue with clang
2023-01-11  s390: update defconfigs  Heiko Carstens
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-11  Merge branch ↵  Jakub Kicinski
'dt-bindings-first-batch-of-dt-schema-conversions-for-amlogic-meson-bindings' Neil Armstrong says: ==================== dt-bindings: first batch of dt-schema conversions for Amlogic Meson bindings Batch conversion of the following bindings: [...] - mdio-mux-meson-g12a.txt ==================== Link: https://lore.kernel.org/r/20221117-b4-amlogic-bindings-convert-v2-0-36ad050bb625@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-11  dt-bindings: net: convert mdio-mux-meson-g12a.txt to dt-schema  Neil Armstrong
Convert MDIO bus multiplexer/glue of Amlogic G12a SoC family bindings to dt-schema. Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-11  perf auxtrace: Fix address filter duplicate symbol selection  Adrian Hunter
When a match has been made to the nth duplicate symbol, return success not error.

Example:

  Before:

    $ cat file.c
    cat: file.c: No such file or directory
    $ cat file1.c
    #include <stdio.h>

    static void func(void)
    {
            printf("First func\n");
    }

    void other(void);

    int main()
    {
            func();
            other();
            return 0;
    }
    $ cat file2.c
    #include <stdio.h>

    static void func(void)
    {
            printf("Second func\n");
    }

    void other(void)
    {
            func();
    }
    $ gcc -Wall -Wextra -o test file1.c file2.c
    $ perf record -e intel_pt//u --filter 'filter func @ ./test' -- ./test
    Multiple symbols with name 'func'
    #1 0x1149 l func
            which is near main
    #2 0x1179 l func
            which is near other
    Disambiguate symbol name by inserting #n after the name e.g. func #2
    Or select a global symbol by inserting #0 or #g or #G
    Failed to parse address filter: 'filter func @ ./test'
    Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
    Where multiple filters are separated by space or comma.
    $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test
    Failed to parse address filter: 'filter func #2 @ ./test'
    Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
    Where multiple filters are separated by space or comma.

  After:

    $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test
    First func
    Second func
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.016 MB perf.data ]
    $ perf script --itrace=b -Ftime,flags,ip,sym,addr --ns
    1231062.526977619: tr strt 0 [unknown] => 558495708179 func
    1231062.526977619: tr end call 558495708188 func => 558495708050 _init
    1231062.526979286: tr strt 0 [unknown] => 55849570818d func
    1231062.526979286: tr end return 55849570818f func => 55849570819d other

Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters")
Reported-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230110185659.15979-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
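The intended behaviour of the "#n" selector can be sketched as a small lookup. This is an illustration under assumptions, not the perf source: `toy_sym` and `toy_find_nth_sym()` are hypothetical names, and the only point is that reaching the nth duplicate returns success rather than an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy symbol table entry; not a perf data structure. */
struct toy_sym {
	const char *name;
	unsigned long addr;
};

/*
 * Find the idx-th (1-based) symbol called 'name'.
 * Returns 0 and stores its address on success, -1 if there are fewer
 * than idx symbols with that name.
 */
static int toy_find_nth_sym(const struct toy_sym *syms, size_t nr,
			    const char *name, unsigned int idx,
			    unsigned long *addr)
{
	unsigned int seen = 0;
	size_t i;

	assert(syms != NULL && name != NULL && addr != NULL);

	for (i = 0; i < nr; i++) {
		if (strcmp(syms[i].name, name) != 0)
			continue;
		if (++seen == idx) {
			*addr = syms[i].addr;
			return 0;	/* matching the nth duplicate is success */
		}
	}
	return -1;
}
```

With a table holding `func` at 0x1149 and 0x1179 (the two addresses from the example output), asking for index 2 yields 0x1179.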
2023-01-11  KVM: s390: interrupt: use READ_ONCE() before cmpxchg()  Heiko Carstens
Use READ_ONCE() before cmpxchg() to prevent the compiler from generating code that fetches the old value to be compared several times from memory.

Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20230109145456.2895385-1-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-11  s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple()  Heiko Carstens
Make sure that *ptr__ within arch_this_cpu_to_op_simple() is only dereferenced once by using READ_ONCE(). Otherwise the compiler could generate incorrect code. Cc: <stable@vger.kernel.org> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-11  s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops  Heiko Carstens
The current cmpxchg_double() loops within the perf hw sampling code do not have READ_ONCE() semantics to read the old value from memory. This allows the compiler to generate code which reads the "old" value several times from memory, which again allows for inconsistencies.

For example:

        /* Reset trailer (using compare-double-and-swap) */
        do {
                te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK;
                te_flags |= SDB_TE_ALERT_REQ_MASK;
        } while (!cmpxchg_double(&te->flags, &te->overflow,
                                 te->flags, te->overflow,
                                 te_flags, 0ULL));

The compiler could generate code where te->flags used within the cmpxchg_double() call may be refetched from memory and which is not necessarily identical to the previous read version which was used to generate te_flags. Which in turn means that an incorrect update could happen.

Fix this by adding READ_ONCE() semantics to all cmpxchg_double() loops. Given that READ_ONCE() cannot generate code on s390 which atomically reads 16 bytes, use a private compare-and-swap-double implementation to achieve that.

Also replace cmpxchg_double() with the private implementation to be able to re-use the old value within the loops.

As a side effect this converts the whole code to only use bit fields to read and modify bits within the hws trailer header.

Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/linux-s390/Y71QJBhNTIatvxUT@osiris/T/#ma14e2a5f7aa8ed4b94b6f9576799b3ad9c60f333
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
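The "read once, then compare and swap against that same snapshot" pattern can be illustrated in userspace with C11 atomics. This is a simplified sketch on a single 32-bit word, not the s390 16-byte compare-and-swap-double from the patch; `TOY_BUFFER_FULL`, `TOY_ALERT_REQ` and `toy_reset_trailer()` are made-up names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUFFER_FULL	0x1u	/* hypothetical flag bits */
#define TOY_ALERT_REQ	0x2u

/*
 * Atomically clear BUFFER_FULL and set ALERT_REQ.
 * The old value is loaded once, and that same snapshot is used both to
 * compute the new value and as the expected value of the compare-and-swap.
 * Returns the old value that was replaced.
 */
static uint32_t toy_reset_trailer(_Atomic uint32_t *flags)
{
	uint32_t old, new_flags;

	assert(flags != NULL);
	old = atomic_load_explicit(flags, memory_order_relaxed);
	do {
		new_flags = (old & ~TOY_BUFFER_FULL) | TOY_ALERT_REQ;
		/* On failure 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(flags, &old, new_flags));

	return old;
}
```

Because `old` is a local copy, the compiler has no opportunity to refetch the shared word between computing `new_flags` and the compare-and-swap, which is the property the commit is after.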
2023-01-11  spi: Merge rename of spi-cs-setup-ns DT property  Mark Brown
The newly added spi-cs-setup-ns doesn't really fit with the existing property names for delays, rename it so that it does before it makes it into a release and becomes ABI.
2023-01-11  spi: spidev: remove debug messages that access spidev->spi without locking  Bartosz Golaszewski
The two debug messages in spidev_open() dereference spidev->spi without taking the lock and without checking if it's not null. This can lead to a crash. Drop the messages as they're not needed - the user-space will get informed about ENOMEM with the syscall return value. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/20230106100719.196243-2-brgl@bgdev.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2023-01-11  spi: spidev: fix a race condition when accessing spidev->spi  Bartosz Golaszewski
There's a spinlock in place that is taken in file_operations callbacks whenever we check if spidev->spi is still alive (not null). It's also taken when spidev->spi is set to NULL in remove(). This however doesn't protect the code against driver unbind event while one of the syscalls is still in progress. To that end we need a lock taken continuously as long as we may still access spidev->spi. As both the file ops and the remove callback are never called from interrupt context, we can replace the spinlock with a mutex. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/20230106100719.196243-1-brgl@bgdev.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2023-01-11  Merge tag 'mlx5-fixes-2023-01-09' of ↵  David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-fixes-2023-01-09
2023-01-11  ipv6: raw: Deduct extension header length in rawv6_push_pending_frames  Herbert Xu
The total cork length created by ip6_append_data includes extension headers, so we must exclude them when comparing them against the IPV6_CHECKSUM offset which does not include extension headers. Reported-by: Kyle Zeng <zengyhkyle@gmail.com> Fixes: 357b40a18b04 ("[IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
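The length adjustment can be sketched as a standalone bounds check. This is a minimal model with assumed names (`toy_csum_offset_ok()`, `opt_flen` as the extension header length), not the code in rawv6_push_pending_frames(): the extension headers are deducted first, and only then is the checksum offset compared against what remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * total_len: cork length as accumulated while appending data, with the
 *            extension headers included
 * opt_flen:  length of the extension headers
 * offset:    checksum offset, counted without extension headers
 *
 * Returns true if a 16-bit checksum fits at 'offset'.
 */
static bool toy_csum_offset_ok(size_t total_len, size_t opt_flen, size_t offset)
{
	size_t payload_len;

	if (opt_flen > total_len)
		return false;

	/* Deduct the extension headers before comparing against offset. */
	payload_len = total_len - opt_flen;

	return payload_len >= 2 && offset <= payload_len - 2;
}
```

Without the deduction, an offset of 16 would pass for a 40-byte cork that contains 24 bytes of extension headers, even though only 16 bytes of payload exist.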
2023-01-11  Merge tag 'mlx5-updates-2023-01-10' of ↵  David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

mlx5-updates-2023-01-10

1) From Gal: Add debugfs entries for netdev nic driver
   - ktls, flow steering and hairpin info
   - useful for debug and performance analysis
   - e.g hairpin queue attributes, dump ktls tx pool size, etc

2) From Maher: Update shared buffer configuration on PFC commands
   2.1) For every change of buffer's headroom, recalculate the size of shared buffer to be equal to "total_buffer_size" - "new_headroom_size". The new shared buffer size will be split in ratio of 3:1 between lossy and lossless pools, respectively.
   2.2) For each port buffer change, count the number of lossless buffers. If there is only one lossless buffer, then set its lossless pool usage threshold to be infinite. Otherwise, if there is more than one lossless buffer, set a usage threshold for each lossless buffer.

   While at it, add more verbosity to debug prints when handling user commands, to assist in future debug.

3) From Tariq: Throttle high rate FW commands

4) From Shay: Properly initialize management PF

5) Various cleanup patches
2023-01-11  Merge branch 'NCN26000-PLCA-RS-support'  David S. Miller
Piergiorgio Beruto says:

====================
net: add PLCA RS support and onsemi NCN26000

This patchset adds support for getting/setting the Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) configuration and status on Ethernet PHYs that support it.

PLCA is a feature that provides improved media-access performance in terms of throughput, latency and fairness for multi-drop (P2MP) half-duplex PHYs. PLCA is defined in Clause 148 of the IEEE802.3 specifications as amended by 802.3cg-2019. Currently, PLCA is supported by the 10BASE-T1S single-pair Ethernet PHY defined in the same standard and related amendments. The OPEN Alliance SIG TC14 defines additional specifications for the 10BASE-T1S PHY, including a standard register map for PHYs that embed the PLCA RS (see PLCA management registers at https://www.opensig.org/about/specifications/).

The changes proposed herein add the appropriate ethtool netlink interface for configuring the PLCA RS on PHYs that support it. A separate patchset further modifies the ethtool userspace program to show and modify the configuration/status of the PLCA RS.

Additionally, this patchset adds support for the onsemi NCN26000 Industrial Ethernet 10BASE-T1S PHY that uses the newly added PLCA infrastructure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11  drivers/net/phy: add driver for the onsemi NCN26000 10BASE-T1S PHY  Piergiorgio Beruto
This patch adds support for the onsemi NCN26000 10BASE-T1S industrial Ethernet PHY. The driver supports Point-to-Multipoint operation without auto-negotiation and with link control handling. The PHY also features PLCA for improving performance in P2MP mode. Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11  drivers/net/phy: add helpers to get/set PLCA configuration  Piergiorgio Beruto
This patch adds support in phylib to read/write PLCA configuration for Ethernet PHYs that support the OPEN Alliance "10BASE-T1S PLCA Management Registers" specifications. These can be found at https://www.opensig.org/about/specifications/ Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11  drivers/net/phy: add connection between ethtool and phylib for PLCA  Piergiorgio Beruto
This patch adds the required connection between netlink ethtool and phylib to resolve PLCA get/set config and get status messages. Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11  drivers/net/phy: add the link modes for the 10BASE-T1S Ethernet PHY  Piergiorgio Beruto
This patch adds the link modes for the IEEE 802.3cg Clause 147 10BASE-T1S Ethernet PHY. According to the specifications, the 10BASE-T1S supports Point-To-Point Full-Duplex, Point-To-Point Half-Duplex and/or Point-To-Multipoint (AKA Multi-Drop) Half-Duplex operations. Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11  net/ethtool: add netlink interface for the PLCA RS  Piergiorgio Beruto
Add support for configuring the PLCA Reconciliation Sublayer on multi-drop PHYs that support IEEE802.3cg-2019 Clause 148 (e.g., 10BASE-T1S). This patch adds the appropriate netlink interface to ethtool. Signed-off-by: Piergiorgio Beruto <piergiorgio.beruto@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-11  net: lan966x: check for ptp to be enabled in lan966x_ptp_deinit()  Clément Léger
If ptp was not enabled due to missing IRQ for instance, lan966x_ptp_deinit() will dereference NULL pointers. Fixes: d096459494a8 ("net: lan966x: Add support for ptp clocks") Signed-off-by: Clément Léger <clement.leger@bootlin.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-10  net/mlx5e: Use kzalloc() in mlx5e_accel_fs_tcp_create()  YueHaibing
'accel_tcp' is allocated by kvzalloc() now, which is a small chunk. Use kzalloc() directly instead of kvzalloc().

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5: remove redundant ret variable  zhang songyi
Return value from mlx5dr_send_postsend_action() directly instead of taking this in another redundant variable. Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: Replace 0-length array with flexible array  Kees Cook
Zero-length arrays are deprecated[1]. Replace struct mlx5e_rx_wqe_cyc's "data" 0-length array with a flexible array. Detected with GCC 13, using -fstrict-flex-arrays=3:

drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_alloc_rq':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:827:42: warning: array subscript f is outside array bounds of 'struct mlx5_wqe_data_seg[0]' [-Warray-bounds=]
  827 | wqe->data[f].byte_count = 0;
      | ~~~~~~~~~^~~
In file included from drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h:11,
                 from drivers/net/ethernet/mellanox/mlx5/core/eswitch.h:48,
                 from drivers/net/ethernet/mellanox/mlx5/core/en_main.c:42:
drivers/net/ethernet/mellanox/mlx5/core/en.h:250:39: note: while referencing 'data'
  250 | struct mlx5_wqe_data_seg data[0];
      | ^~~~

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
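For readers unfamiliar with the conversion, here is a small userspace sketch of a C99 flexible array member. The struct and function names (`toy_wqe`, `toy_data_seg`, `toy_wqe_alloc()`) are illustrative only and are not the mlx5 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_data_seg {
	uint32_t byte_count;
};

struct toy_wqe {
	uint32_t next;
	struct toy_data_seg data[];	/* flexible array member, not data[0] */
};

/* Allocate a WQE with room for nr_segs trailing data segments. */
static struct toy_wqe *toy_wqe_alloc(size_t nr_segs)
{
	struct toy_wqe *wqe;
	size_t i;

	assert(nr_segs > 0);

	/* sizeof(*wqe) covers the header only; the segments are added on top. */
	wqe = malloc(sizeof(*wqe) + nr_segs * sizeof(wqe->data[0]));
	if (!wqe)
		return NULL;

	wqe->next = 0;
	for (i = 0; i < nr_segs; i++)
		wqe->data[i].byte_count = 0;

	return wqe;
}
```

The layout is the same as with `data[0]`, but the compiler now knows the array has an unspecified length, so bounds checkers stop treating every index as out of range.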
2023-01-10  net/mlx5e: Replace zero-length array with flexible-array member  Gustavo A. R. Silva
Zero-length arrays are deprecated[1] and we are moving towards adopting C99 flexible-array members instead. So, replace zero-length array declaration in struct mlx5e_flow_meter_aso_obj with flex-array member. This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [2]. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [2] Link: https://github.com/KSPP/linux/issues/78 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5: Prevent high-rate FW commands from populating all slots  Tariq Toukan
Certain connection-based device-offload protocols (like TLS) use per-connection HW objects to track the state, maintain the context, and perform the offload properly. Some of these objects are created, modified, and destroyed via FW commands. Under high connection rate, this type of FW commands might continuously populate all slots of the FW command interface and throttle it, while starving other critical control FW commands. Limit these throttle commands to using only up to a portion (half) of the FW command interface slots. FW commands maximal rate is not hit, and the same high rate is still reached when applying this limitation. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
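The "at most half of the slots" rule can be modelled with two counters. This is a sketch under assumptions: `toy_cmd_if` and `toy_cmd_try_get_slot()` are hypothetical, the model is single-threaded, and it does not reflect how the mlx5 command interface actually implements the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a command interface with a fixed number of slots. */
struct toy_cmd_if {
	unsigned int max_slots;
	unsigned int used;		/* all commands in flight */
	unsigned int throttled_used;	/* throttled commands in flight */
};

/*
 * Try to reserve a slot. Throttled commands may occupy at most half of
 * the slots, so other commands can always make progress.
 */
static bool toy_cmd_try_get_slot(struct toy_cmd_if *cmd, bool throttled)
{
	assert(cmd != NULL);

	if (cmd->used >= cmd->max_slots)
		return false;
	if (throttled && cmd->throttled_used >= cmd->max_slots / 2)
		return false;

	cmd->used++;
	if (throttled)
		cmd->throttled_used++;
	return true;
}
```

With 8 slots, a fifth concurrent throttled command is refused while a non-throttled one still gets a slot.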
2023-01-10  net/mlx5: Introduce and use opcode getter in command interface  Tariq Toukan
Introduce an opcode getter in the FW command interface, and use it. Initialize the entry's opcode field early in cmd_alloc_ent() and use it when possible. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5: Enable management PF initialization  Shay Drory
Enable initialization of DPU Management PF, which is a new loopback PF designed for communication with BMC. For now Management PF doesn't support nor require most upper layer protocols so avoid them. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: Add hairpin debugfs files  Gal Pressman
We refer to a TC NIC rule that involves forwarding as "hairpin". Hairpin queues are mlx5 hardware specific implementation for hardware forwarding of such packets.

For debug purposes, introduce debugfs files which:
* Expose the number of active hairpins
* Dump the hairpin table
* Allow control over the number and size of the hairpin queues instead of the hard-coded values.

This allows us to get visibility of the feature in order to improve it for next generation hardware.

Add debugfs files:
fs/tc/hairpin_num_active
fs/tc/hairpin_num_queues
fs/tc/hairpin_queue_size
fs/tc/hairpin_table_dump

Note that the new values will only take effect on the next queues creation, it does not affect existing queues.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: Add flow steering debugfs directory  Gal Pressman
Add a debugfs directory for flow steering related information. The directory is currently empty, and will hold the 'tc' subdirectory in a downstream patch. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: Add hairpin params structure  Gal Pressman
In preparation for downstream work to expose hairpin queues parameters, introduce a hairpin parameters struct as part of the tc structure. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: kTLS, Add debugfs  Tariq Toukan
Add TLS debugfs to improve observability by exposing the size of the tls TX pool. To observe the size of the TX pool: $ cat /sys/kernel/debug/mlx5/<pci>/nic/tls/tx/pool_size Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Co-developed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: Add Ethernet driver debugfs  Gal Pressman
Similar to the mlx5_core debugfs, lay the groundwork for mlx5e debugfs files under /sys/kernel/debug/mlx5/<pci>/nic/.. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5e: Update shared buffer along with device buffer changes  Maher Sanalla
Currently, the user can modify device's receive buffer size, modify the mapping between QoS priority groups to buffers and change the buffer state to become lossy/lossless via pfc command.

However, the shared receive buffer pool alignments, as a result of such commands, is performed only when the shared buffer is in FW ownership. When a user changes the mapping of priority groups or buffer size, the shared buffer is moved to SW ownership.

Therefore, for devices that support shared buffer, handle the shared buffer alignments in accordance to user's desired configurations.

Meaning, the following will be performed:
1. For every change of buffer's headroom, recalculate the size of shared buffer to be equal to "total_buffer_size" - "new_headroom_size". The new shared buffer size will be split in ratio of 3:1 between lossy and lossless pools, respectively.
2. For each port buffer change, count the number of lossless buffers. If there is only one lossless buffer, then set its lossless pool usage threshold to be infinite. Otherwise, if there is more than one lossless buffer, set a usage threshold for each lossless buffer.

While at it, add more verbosity to debug prints when handling user commands, to assist in future debug.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
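The arithmetic in step 1 can be written out as a short helper. This is a sketch only: `toy_shared_buf` and `toy_split_shared()` are hypothetical names, the units are arbitrary, and giving the rounding remainder to the lossy pool is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

struct toy_shared_buf {
	uint32_t lossy;
	uint32_t lossless;
};

/*
 * shared = total_buffer_size - new_headroom_size, then split 3:1
 * between the lossy and lossless pools.
 */
static struct toy_shared_buf toy_split_shared(uint32_t total_buffer_size,
					      uint32_t new_headroom_size)
{
	struct toy_shared_buf sb;
	uint32_t shared;

	assert(new_headroom_size <= total_buffer_size);
	shared = total_buffer_size - new_headroom_size;

	sb.lossless = shared / 4;		/* one part */
	sb.lossy = shared - sb.lossless;	/* three parts plus any remainder */
	return sb;
}
```

For a total of 1000 units and a headroom of 200, the shared buffer is 800 units: 600 lossy and 200 lossless.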
2023-01-10  net/mlx5e: Add API to query/modify SBPR and SBCM registers  Maher Sanalla
To allow users to configure shared receive buffer parameters through dcbnl callbacks, expose an API to query and modify SBPR and SBCM registers, which will be used in the upcoming patch. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  net/mlx5: Expose shared buffer registers bits and structs  Maher Sanalla
Add the shared receive buffer management and configuration registers: 1. SBPR - Shared Buffer Pools Register 2. SBCM - Shared Buffer Class Management Register Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-10  Merge branch '100GbE' of ↵  Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-01-09 (ice) This series contains updates to ice driver only. Jiasheng Jiang frees allocated cmd_buf if write_buf allocation failed to prevent memory leak. Yuan Can adds check, and proper cleanup, of gnss_tty_port allocation call to avoid memory leaks. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Add check for kzalloc ice: Fix potential memory leak in ice_gnss_tty_write() ==================== Link: https://lore.kernel.org/r/20230109225358.3478060-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: sched: disallow noqueue for qdisc classes  Frederick Lawler
While experimenting with applying noqueue to a classful queue discipline, we discovered a NULL pointer dereference in the __dev_queue_xmit() path that generates a kernel OOPS:

    # dev=enp0s5
    # tc qdisc replace dev $dev root handle 1: htb default 1
    # tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit
    # tc qdisc add dev $dev parent 1:1 handle 10: noqueue
    # ping -I $dev -w 1 -c 1 1.1.1.1

[ 2.172856] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2.173217] #PF: supervisor instruction fetch in kernel mode
...
[ 2.178451] Call Trace:
[ 2.178577] <TASK>
[ 2.178686] htb_enqueue+0x1c8/0x370
[ 2.178880] dev_qdisc_enqueue+0x15/0x90
[ 2.179093] __dev_queue_xmit+0x798/0xd00
[ 2.179305] ? _raw_write_lock_bh+0xe/0x30
[ 2.179522] ? __local_bh_enable_ip+0x32/0x70
[ 2.179759] ? ___neigh_create+0x610/0x840
[ 2.179968] ? eth_header+0x21/0xc0
[ 2.180144] ip_finish_output2+0x15e/0x4f0
[ 2.180348] ? dst_output+0x30/0x30
[ 2.180525] ip_push_pending_frames+0x9d/0xb0
[ 2.180739] raw_sendmsg+0x601/0xcb0
[ 2.180916] ? _raw_spin_trylock+0xe/0x50
[ 2.181112] ? _raw_spin_unlock_irqrestore+0x16/0x30
[ 2.181354] ? get_page_from_freelist+0xcd6/0xdf0
[ 2.181594] ? sock_sendmsg+0x56/0x60
[ 2.181781] sock_sendmsg+0x56/0x60
[ 2.181958] __sys_sendto+0xf7/0x160
[ 2.182139] ? handle_mm_fault+0x6e/0x1d0
[ 2.182366] ? do_user_addr_fault+0x1e1/0x660
[ 2.182627] __x64_sys_sendto+0x1b/0x30
[ 2.182881] do_syscall_64+0x38/0x90
[ 2.183085] entry_SYSCALL_64_after_hwframe+0x63/0xcd
...
[ 2.187402] </TASK>

Previously in commit d66d6c3152e8 ("net: sched: register noqueue qdisc"), NULL was set for the noqueue discipline on noqueue init so that __dev_queue_xmit() falls through for the noqueue case. This also sets a bypass of the enqueue NULL check in the register_qdisc() function for the struct noqueue_disc_ops.

Classful queue disciplines make it past the NULL check in __dev_queue_xmit() because the discipline is set to htb (in this case), and then in the call to __dev_xmit_skb(), it calls into htb_enqueue() which grabs a leaf node for a class and then calls qdisc_enqueue() by passing in a queue discipline which assumes ->enqueue() is not set to NULL.

Fix this by not allowing classes to be assigned to the noqueue discipline. Linux TC Notes states that classes cannot be set to the noqueue discipline. [1] Let's enforce that here.

Links:
1. https://linux-tc-notes.sourceforge.net/tc/doc/sch_noqueue.txt

Fixes: d66d6c3152e8 ("net: sched: register noqueue qdisc")
Cc: stable@vger.kernel.org
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20230109163906.706000-1-fred@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
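The idea of rejecting the assignment up front, instead of crashing later in the enqueue path, can be sketched as follows. This is a toy model, not the kernel's graft code: `toy_qdisc`, `toy_class_graft()` and the check on a NULL `enqueue` pointer are assumptions made for illustration, and the real patch may identify noqueue differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy queue discipline: only the enqueue hook matters here. */
struct toy_qdisc {
	const char *id;
	int (*enqueue)(void *pkt);	/* NULL models the noqueue discipline */
};

static int toy_fifo_enqueue(void *pkt)
{
	(void)pkt;
	return 0;
}

static struct toy_qdisc toy_noqueue = { "noqueue", NULL };
static struct toy_qdisc toy_fifo = { "pfifo", toy_fifo_enqueue };

/*
 * Attach new_q as the leaf of a class. A discipline without an enqueue
 * hook is refused, because the classful parent will call ->enqueue().
 */
static int toy_class_graft(struct toy_qdisc **leaf, struct toy_qdisc *new_q)
{
	assert(leaf != NULL);

	if (new_q && !new_q->enqueue)
		return -EINVAL;

	*leaf = new_q;
	return 0;
}
```

Refusing the graft keeps the existing leaf in place, so the parent never ends up calling through a NULL function pointer.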
2023-01-10  qed: fix a typo in comment  Dai Shixin
Fix a typo of "permision" which should be "permission". Signed-off-by: Dai Shixin <dai.shixin@zte.com.cn> Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Link: https://lore.kernel.org/r/202301091935262709751@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  Merge branch 'net-mdio-start-separating-c22-and-c45'  Jakub Kicinski
Michael Walle says: ==================== net: mdio: Start separating C22 and C45 This patch set starts the separation of C22 and C45 MDIO bus transactions at the API level to the MDIO Bus drivers. C45 read and write ops are added to the MDIO bus driver structure, and the MDIO core will try to use these ops if requested to perform a C45 transfer. If not available a fallback to the older API is made, to allow backwards compatibility until all drivers are converted. A few drivers are then converted to this new API. The core DSA patch was dropped for now as there is still an ongoing discussion. It will be picked up in a later series again. v2: https://lore.kernel.org/r/20221227-v6-2-rc1-c45-seperation-v2-0-ddb37710e5a7@walle.cc v1: https://lore.kernel.org/r/20220508153049.427227-1-andrew@lunn.ch ==================== Link: https://lore.kernel.org/r/20221227-v6-2-rc1-c45-seperation-v3-0-ade1deb438da@walle.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
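The "use the C45 op if the driver provides it, otherwise fall back to the older API" dispatch can be sketched like this. Everything here is a toy: `toy_mii_bus`, `toy_mdiobus_c45_read()`, the flag bit and the shift are assumptions for illustration and are not copied from the kernel headers or the series itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ADDR_C45	(1u << 30)	/* assumed "this is a C45 access" flag */
#define TOY_DEVAD_SHIFT	16		/* assumed position of the device address */

struct toy_mii_bus {
	/* Older API: C22 register, or a C45 access encoded into regnum. */
	int (*read)(struct toy_mii_bus *bus, int addr, uint32_t regnum);
	/* Newer API: dedicated C45 read op, optional. */
	int (*read_c45)(struct toy_mii_bus *bus, int addr, int devad, int regnum);
};

/* Core helper: prefer the C45 op, fall back to the encoded legacy call. */
static int toy_mdiobus_c45_read(struct toy_mii_bus *bus, int addr,
				int devad, int regnum)
{
	assert(bus != NULL);

	if (bus->read_c45)
		return bus->read_c45(bus, addr, devad, regnum);
	if (bus->read)
		return bus->read(bus, addr, TOY_ADDR_C45 |
				 ((uint32_t)devad << TOY_DEVAD_SHIFT) |
				 (uint32_t)regnum);
	return -EOPNOTSUPP;
}

/* Unconverted toy driver: decodes the C45 flag itself. */
static int toy_legacy_read(struct toy_mii_bus *bus, int addr, uint32_t regnum)
{
	(void)bus;
	(void)addr;
	if (regnum & TOY_ADDR_C45)
		return 0x1000 | (int)((regnum >> TOY_DEVAD_SHIFT) & 0x1f);
	return 0x0022;
}

/* Converted toy driver op: receives devad and regnum separately. */
static int toy_c45_read(struct toy_mii_bus *bus, int addr, int devad, int regnum)
{
	(void)bus;
	(void)addr;
	(void)regnum;
	return 0x2000 | devad;
}
```

A bus that only sets `read` keeps working through the fallback, while a converted bus is reached through `read_c45` and no longer needs to decode a flag out of the register number.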
2023-01-10  net: dsa: mv88e6xxx: Separate C22 and C45 transactions  Andrew Lunn
The global2 SMI MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. Update the SERDES code to make use of these new accessors. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: mdio: add mdiobus_c45_read/write_nested helpers  Andrew Lunn
Some DSA devices pass through PHY access to the MDIO bus the switch is on. Add C45 versions of the current C22 helpers for nested accesses to MDIO busses, so that C22 and C45 can be separated in these DSA drivers. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: fec: Separate C22 and C45 transactions  Andrew Lunn
The fec MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: mdio: xgmac_mdio: Separate C22 and C45 transactions  Andrew Lunn
The xgmac MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate.

While at it, remove the misleading comment. According to Vladimir Oltean:

- miimcom is a register accessed by fsl_pq_mdio.c, not by xgmac_mdio.c
- "device dev" doesn't really refer to anything (maybe "dev_addr").
- I don't understand what is meant by the comment "All PHY configuration has to be done through the TSEC1 MIIM regs". Or rather said, I think I understand, but it is irrelevant to the driver for 2 reasons:
  * TSEC devices use the fsl_pq_mdio.c driver, not this one
  * It doesn't matter to this driver whose TSEC registers are used for MDIO access. The driver just works with the registers it's given, which is a concern for the device tree.
- barring the above, the rest just describes the MDIO bus API, which is superfluous

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: mdio: mvmdio: Convert XSMI bus to new API  Andrew Lunn
The marvell MDIO driver supports two different hardware blocks. The XSMI block is C45 only. Convert this block to the new API, and only populate the c45 calls in the bus structure. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: mdio: mdio-bitbang: Separate C22 and C45 transactions  Andrew Lunn
The bitbanging bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new driver API calls. The SH Ethernet driver places wrappers around these functions. In order to not break boards which might be using C45, add similar wrappers for C45 operations. Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10  net: mdio: Move mdiobus_c45_addr() next to users  Andrew Lunn
Now that mdiobus_c45_addr() is only used within the MDIO code during fallback, move the function next to its only users. This function should not be used any more in drivers, the c45 helpers should be used in its place, so hiding it away will prevent any new users from being added. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
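For readers unfamiliar with why such a helper exists, here is a rough sketch of packing a C45 device address and register number into a single value that a legacy read or write op can carry, plus the matching unpacking. The layout (flag in bit 30, device address starting at bit 16, 16-bit register number) is an assumption made for this example, and all `sketch_*` and `SKETCH_*` names are hypothetical; this is not the body of mdiobus_c45_addr().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define SKETCH_C45_FLAG    (UINT32_C(1) << 30)   /* marks a packed C45 address */
#define SKETCH_DEVAD_SHIFT 16                    /* device address position */
#define SKETCH_DEVAD_MASK  UINT32_C(0x1f)        /* C45 device address is 5 bits */
#define SKETCH_REG_MASK    UINT32_C(0xffff)      /* C45 register number is 16 bits */

/* Pack a device address and register number into one 32-bit value. */
static uint32_t sketch_c45_pack(unsigned int devad, uint16_t regnum)
{
    assert(devad <= SKETCH_DEVAD_MASK);
    return SKETCH_C45_FLAG | ((uint32_t)devad << SKETCH_DEVAD_SHIFT) | regnum;
}

/* Non-zero when the value carries the C45 flag. */
static int sketch_is_c45(uint32_t packed)
{
    return (packed & SKETCH_C45_FLAG) != 0;
}

/* Recover the device address from a packed value. */
static unsigned int sketch_c45_devad(uint32_t packed)
{
    return (unsigned int)((packed >> SKETCH_DEVAD_SHIFT) & SKETCH_DEVAD_MASK);
}

/* Recover the register number from a packed value. */
static uint16_t sketch_c45_regnum(uint32_t packed)
{
    return (uint16_t)(packed & SKETCH_REG_MASK);
}
```

A driver with dedicated C45 ops receives the device address and register number as separate arguments and never needs this packing, which is why the commit confines the helper to the fallback path.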