Age | Commit message (Collapse) | Author |
|
We don't have to close and open the nic to make sure we have
valid rx timestamps. Once we have the timestamp filter applied to
the HW and the timestamp_fld_format bit is cleared in the rx
completion and the timestamp is non-zero, we can be sure that rx
timestamp is valid data.
Skip close/open when we set any timestamp filter.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-13-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The new 5760X chips supports UDP GSO. Tested using udpgso_bench_tx.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-12-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
rx_filter_miss counter is newly added to the rx_port_stats_ext
stats structure for newer chips. Newer firmware will return the
structure size that includes this counter. Add this entry to
the bnxt_port_stats_ext_arr array and the ethtool -S code will
pick up this counter if it is supported.
Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-11-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On the new P7 chips, TPA for tunnel packets can be independently
enabled for each VNIC. The default TPA configuration should not
include UDP tunnels because the UDP ports for these tunnels are not
known yet. The chip should not aggregate these UDP tunneled packets
using default UDP ports until the ports are known.
Add a new function bnxt_hwrm_vnic_update_tunl_tpa() to enable VXLAN
and Geneve TPA if the corresponding UDP ports are known.
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-10-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new bnxt_udp_tunnels_p7 struct to support the new P7 chips that
can parse VXLAN GPE packets. Add VXLAN GPE tunnel type handling to
the .set_port() and .unset_port() functions. .ndo_features_check()
is also enhanced to support VXLAN GPE which may encapsulate inner
IP packets instead of ethernet packets.
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-9-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In bnxt_udp_tunnel_set_port(), use the proper ALLOC commands instead
of the FREE commands for correctness. The ALLOC and FREE commands
happen to be identical so this is just a cosmetic fix for correctness.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-8-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Fast QP modify destroy RoCE feature requires additional QP entries
in QP context backing store. FW reports the extra count to be
allocated during backing store query. Use this value and allocate extra
memory. Note that this works for both the V1 and V1 backing store
FW APIs.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TX coalesced completions are supported on newer chips to provide
one TX completion record for multiple TX packets up to the
sq_cons_idx in the completion record. This method saves PCIe
bandwidth by reducing the number of TX completions.
Only very minor changes are now required to support this mode
with the new framework that handles TX completions based on
the consumer indices.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If xmit_more condition is true, the driver may set the
TX_BD_FLAGS_NO_CMPL flag. If after this packet, the TX ring can no
longer hold a packet with maximum fragments, we will stop the TX
queue. When this happens, we must clear the TX_BD_FLAGS_NO_CMPL flag
on the last packet or there will be no completion and cause TX
timeout.
Fixes: c1056a59aee1 ("bnxt_en: Optimize xmit_more TX path")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Two spots were missed when modifying the TX ring indexing logic.
The use of unmasked TX index in bnxt_tx_int() will cause unnecessary
__bnxt_tx_int() calls. The same issue in bnxt_tx_int_xdp() can
result in illegal array index.
Fixes: 6d1add95536b ("bnxt_en: Modify TX ring indexing logic.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
_bnxt_get_max_rings() that is invoked in bnxt_check_rings() already
accounts for the AGG ring(s) and gives a max value based on that.
Increasing for AGG rings before calling _bnxt_get_max_rings() will
result in checking for twice the number of rings than required and
it can fail. Fix it by adjusting for AGG rings after calling
_bnxt_get_max_rings().
Fixes: f5b29c6afe36 ("bnxt_en: Add helper to get the number of CP rings required for TX rings")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The recent commit to trim the RX and TX rings on P5 chips by assigning
each with max CP rings divided by 2 is not correct. Max CP rings
divided by 2 may be bigger than the original RX or TX and would
lead to failure. In other words, we may be checking for increased
RX/TX rings than required and it may fail.
Fix it by calling __bnxt_trim_rings() instead that would properly
trim RX and TX without the possibility of increasing their values.
Fixes: f5b29c6afe36 ("bnxt_en: Add helper to get the number of CP rings required for TX rings")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.
We expect new_bus->id to be NUL-terminated but not NUL-padded based on
its prior assignment through snprintf:
| snprintf(new_bus->id, MII_BUS_ID_SIZE, "gpio-%x", bus_id);
Due to this, a suitable replacement is `strscpy` [2] due to the fact
that it guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.
We can also use sizeof() instead of a length macro as this more closely
ties the maximum buffer size to the destination buffer. Do this for two
instances.
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231211-strncpy-drivers-net-mdio-mdio-gpio-c-v3-1-76dea53a1a52@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzkaller report:
kernel BUG at net/core/skbuff.c:3452!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty #135
RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
Call Trace:
icmp_glue_bits (net/ipv4/icmp.c:357)
__ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
icmp_push_reply (net/ipv4/icmp.c:370)
__icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
__ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
ip_output (net/ipv4/ip_output.c:427)
__ip_queue_xmit (net/ipv4/ip_output.c:535)
__tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
__tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)
The panic issue was trigered by tcp simultaneous initiation.
The initiation process is as follows:
TCP A TCP B
1. CLOSED CLOSED
2. SYN-SENT --> <SEQ=100><CTL=SYN> ...
3. SYN-RECEIVED <-- <SEQ=300><CTL=SYN> <-- SYN-SENT
4. ... <SEQ=100><CTL=SYN> --> SYN-RECEIVED
5. SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...
// TCP B: not send challenge ack for ack limit or packet loss
// TCP A: close
tcp_close
tcp_send_fin
if (!tskb && tcp_under_memory_pressure(sk))
tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN; // set FIN flag
6. FIN_WAIT_1 --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...
// TCP B: send challenge ack to SYN_FIN_ACK
7. ... <SEQ=301><ACK=101><CTL=ACK> <-- SYN-RECEIVED //challenge ack
// TCP A: <SND.UNA=101>
8. FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic
__tcp_retransmit_skb //skb->len=0
tcp_trim_head
len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
__pskb_trim_head
skb->data_len -= len // skb->len=-1, wrap around
... ...
ip_fragment
icmp_glue_bits //BUG_ON
If we use tcp_trim_head() to remove acked SYN from packet that contains data
or other flags, skb->len will be incorrectly decremented. We can remove SYN
flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
can fix the problem mentioned above.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Linus Walleij says:
====================
Immutable tag for the PEF2256 framer
* tag 'pef2256-framer' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
MAINTAINERS: Add the Lantiq PEF2256 driver entry
pinctrl: Add support for the Lantic PEF2256 pinmux
net: wan: framer: Add support for the Lantiq PEF2256 framer
dt-bindings: net: Add the Lantiq PEF2256 E1/T1/J1 framer
net: wan: Add framer framework support
====================
Link: https://lore.kernel.org/all/CACRpkdYT1J7noFUhObFgfA60XQAfL4rb=knEmWS__TKKtCMh7Q@mail.gmail.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After contributing the driver, add myself as the maintainer for the
Lantiq PEF2256 driver.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/r/20231128132534.258459-6-herve.codina@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The Lantiq PEF2256 is a framer and line interface component designed to
fulfill all required interfacing between an analog E1/T1/J1 line and the
digital PCM system highway/H.100 bus.
This kind of component can be found in old telecommunication system.
It was used to digital transmission of many simultaneous telephone calls
by time-division multiplexing. Also using HDLC protocol, WAN networks
can be reached through the framer.
This pinmux support handles the pin muxing part (pins RP(A..D) and pins
XP(A..D)) of the PEF2256.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20231128132534.258459-5-herve.codina@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The Lantiq PEF2256 is a framer and line interface component designed to
fulfill all required interfacing between an analog E1/T1/J1 line and the
digital PCM system highway/H.100 bus.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20231128132534.258459-4-herve.codina@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The Lantiq PEF2256 is a framer and line interface component designed to
fulfill all required interfacing between an analog E1/T1/J1 line and the
digital PCM system highway/H.100 bus.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20231128132534.258459-3-herve.codina@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
A framer is a component in charge of an E1/T1 line interface.
Connected usually to a TDM bus, it converts TDM frames to/from E1/T1
frames. It also provides information related to the E1/T1 line.
The framer framework provides a set of APIs for the framer drivers
(framer provider) to create/destroy a framer and APIs for the framer
users (framer consumer) to obtain a reference to the framer, and
use the framer.
This basic implementation provides a framer abstraction for:
- power on/off the framer
- get the framer status (line state)
- be notified on framer status changes
- get/set the framer configuration
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20231128132534.258459-2-herve.codina@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
qed_ilt_shadow_alloc() will call qed_ilt_shadow_free() to
free p_hwfn->p_cxt_mngr->ilt_shadow on error. However,
qed_cxt_tables_alloc() accesses the freed pointer on failure
of qed_ilt_shadow_alloc() through calling qed_cxt_mngr_free(),
which may lead to use-after-free. Fix this issue by setting
p_mngr->ilt_shadow to NULL in qed_ilt_shadow_free().
Fixes: fe56b9e6a8d9 ("qed: Add module with basic common support")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20231210045255.21383-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When compiling with gcc version 14.0.0 20231129 (experimental) and
CONFIG_FORTIFY_SOURCE=y, I've noticed the following warning:
...
In function 'fortify_memcpy_chk',
inlined from 'ax88796c_tx_fixup' at drivers/net/ethernet/asix/ax88796c_main.c:287:2:
./include/linux/fortify-string.h:588:25: warning: call to '__read_overflow2_field'
declared with attribute warning: detected read beyond size of field (2nd parameter);
maybe use struct_group()? [-Wattribute-warning]
588 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
This call to 'memcpy()' is interpreted as an attempt to copy TX_OVERHEAD
(which is 8) bytes from 4-byte 'sop' field of 'struct tx_pkt_info' and
thus overread warning is issued. Since we actually want to copy both
'sop' and 'seg' fields at once, use the convenient 'struct_group()' here.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Łukasz Stelmach <l.stelmach@samsung.com>
Link: https://lore.kernel.org/r/20231211090535.9730-1-dmantipov@yandex.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use pcie_capability_read_word() for reading LNKSTA and remove the
custom define that matches to PCI_EXP_LNKSTA.
As only single user for cap_offset remains, replace it with a call to
pci_pcie_cap(). Instead of e1000_adapter, make local variable out of
pci_dev because both users are interested in it.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Fix various bugs / regressions for ext4, including a soft lockup, a
WARN_ON, and a BUG"
* tag 'ext4_for_linus-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd2: fix soft lockup in journal_finish_inode_data_buffers()
ext4: fix warning in ext4_dio_write_end_io()
jbd2: increase the journal IO's priority
jbd2: correct the printing of write_flags in jbd2_write_superblock()
ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS
|
|
Make the flow for pci shutdown be the same to the pci remove.
iavf_shutdown was implementing an incomplete version
of iavf_remove. It misses several calls to the kernel like
iavf_free_misc_irq, iavf_reset_interrupt_capability, iounmap
that might break the system on reboot or hibernation.
Implement the call of iavf_remove directly in iavf_shutdown to
close this gap.
Fixes below error messages (dmesg) during shutdown stress tests -
[685814.900917] ice 0000:88:00.0: MAC 02:d0:5f:82:43:5d does not exist for
VF 0
[685814.900928] ice 0000:88:00.0: MAC 33:33:00:00:00:01 does not exist for
VF 0
Reproduction:
1. Create one VF interface:
echo 1 > /sys/class/net/<interface_name>/device/sriov_numvfs
2. Run live dmesg on the host:
dmesg -wH
3. On SUT, script below steps into vf_namespace_assignment.sh
<#!/bin/sh> // Remove <>. Git removes # line
if=<VF name> (edit this per VF name)
loop=0
while true; do
echo test round $loop
let loop++
ip netns add ns$loop
ip link set dev $if up
ip link set dev $if netns ns$loop
ip netns exec ns$loop ip link set dev $if up
ip netns exec ns$loop ip link set dev $if netns 1
ip netns delete ns$loop
done
4. Run the script for at least 1000 iterations on SUT:
./vf_namespace_assignment.sh
Expected result:
No errors in dmesg.
Fixes: 129cf89e5856 ("iavf: rename functions and structs to new name")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Co-developed-by: Ranganatha Rao <ranganatha.rao@intel.com>
Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
ntuple-filter feature on/off:
Default is on. If turned off, the filters will be removed from both
PF and iavf list. The removal is irrespective of current filter state.
Steps to reproduce:
-------------------
1. Ensure ntuple is on.
ethtool -K enp8s0 ntuple-filters on
2. Create a filter to receive the traffic into non-default rx-queue like 15
and ensure traffic is flowing into queue into 15.
Now, turn off ntuple. Traffic should not flow to configured queue 15.
It should flow to default RX queue.
Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters")
Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
New states introduced:
IAVF_FDIR_FLTR_DIS_REQUEST
IAVF_FDIR_FLTR_DIS_PENDING
IAVF_FDIR_FLTR_INACTIVE
Current FDIR state machines (SM) are not adequate to handle a few
scenarios in the link DOWN/UP event, reset event and ntuple-feature.
For example, when VF link goes DOWN and comes back UP administratively,
the expectation is that previously installed filters should also be
restored. But with current SM, filters are not restored.
So with new SM, during link DOWN filters are marked as INACTIVE in
the iavf list but removed from PF. After link UP, SM will transition
from INACTIVE to ADD_REQUEST to restore the filter.
Similarly, with VF reset, filters will be removed from the PF, but
marked as INACTIVE in the iavf list. Filters will be restored after
reset completion.
Steps to reproduce:
-------------------
1. Create a VF. Here VF is enp8s0.
2. Assign IP addresses to VF and link partner and ping continuously
from remote. Here remote IP is 1.1.1.1.
3. Check default RX Queue of traffic.
ethtool -S enp8s0 | grep -E "rx-[[:digit:]]+\.packets"
4. Add filter - change default RX Queue (to 15 here)
ethtool -U ens8s0 flow-type ip4 src-ip 1.1.1.1 action 15 loc 5
5. Ensure filter gets added and traffic is received on RX queue 15 now.
Link event testing:
-------------------
6. Bring VF link down and up. If traffic flows to configured queue 15,
test is success, otherwise it is a failure.
Reset event testing:
--------------------
7. Reset the VF. If traffic flows to configured queue 15, test is success,
otherwise it is a failure.
Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters")
Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
- Fix a couple of potential crashes, one introduced in 6.6 and one
in 5.10
- Fix misbehavior of virtiofs submounts on memory pressure
- Clarify naming in the uAPI for a recent feature
* tag 'fuse-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: disable FOPEN_PARALLEL_DIRECT_WRITES with FUSE_DIRECT_IO_ALLOW_MMAP
fuse: dax: set fc->dax to NULL in fuse_dax_conn_free()
fuse: share lookup state between submount and its parent
docs/fuse-io: Document the usage of DIRECT_IO_ALLOW_MMAP
fuse: Rename DIRECT_IO_RELAX to DIRECT_IO_ALLOW_MMAP
|
|
e1000e has own copy of PCI Negotiated Link Width field defines. Use the
ones from include/uapi/linux/pci_regs.h instead of the custom ones and
remove the custom ones and convert to FIELD_GET().
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of
custom masking and shifting.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Pull smb server fixes from Steve French:
- Memory leak fix (in lock error path)
- Two fixes for create with allocation size
- FIx for potential UAF in lease break error path
- Five directory lease (caching) fixes found during additional recent
testing
* tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix wrong name of SMB2_CREATE_ALLOCATION_SIZE
ksmbd: fix wrong allocation size update in smb2_open()
ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack()
ksmbd: lazy v2 lease break on smb2_write()
ksmbd: send v2 lease break notification for directory
ksmbd: downgrade RWH lease caching state to RH for directory
ksmbd: set v2 lease capability
ksmbd: set epoch in create context v2 lease
ksmbd: fix memory leak in smb2_lock()
|
|
There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE
Call trace:
dump_backtrace+0x0/0x1a0
show_stack+0x24/0x30
dump_stack+0xb0/0x100
watchdog_timer_fn+0x254/0x3f8
__hrtimer_run_queues+0x11c/0x380
hrtimer_interrupt+0xfc/0x2f8
arch_timer_handler_phys+0x38/0x58
handle_percpu_devid_irq+0x90/0x248
generic_handle_irq+0x3c/0x58
__handle_domain_irq+0x68/0xc0
gic_handle_irq+0x90/0x320
el1_irq+0xcc/0x180
queued_spin_lock_slowpath+0x1d8/0x320
jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
kjournald2+0xec/0x2f0 [jbd2]
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;
According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
filemap_fdatawait_range_keep_errors
__filemap_fdatawait_range
while (index <= end)
nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
if (!nr_pages)
break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
for (i = 0; i < nr_pages; i++)
wait_on_page_writeback(page);
cond_resched();
To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
JingZao(京造) WKB603 keyboard is a rebranded product of Jamesdonkey RS2
keyboard, identified as "hfd.cn WKB603" in wired mode, "WKB603" in bluetooth
mode. Adding them to the list of non-apple keyboards fixes function key.
Signed-off-by: Yan Jun <jerrysteve1101@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Commit 46a0a2c96f0f ("HID: lenovo: Detect quirk-free fw on cptkbd and
stop applying workaround") introduced a regression for ThinkPad
TrackPoint Keyboard II which has similar quirks to cptkbd (so it uses
the same workarounds) but slightly different so that there are
false-positives during detecting well-behaving firmware. This commit
restricts detecting well-behaving firmware to the only model which
known to have one and have stable enough quirks to not cause
false-positives.
Fixes: 46a0a2c96f0f ("HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround")
Link: https://lore.kernel.org/linux-input/ZXRiiPsBKNasioqH@jekhomev/
Link: https://bbs.archlinux.org/viewtopic.php?pid=2135468#p2135468
Signed-off-by: Mikhail Khvainitski <me@khvoinitsky.org>
Tested-by: Yauhen Kharuzhy <jekhor@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Linus Walleij says:
====================
net: dsa: realtek: Two RTL8366RB fixes
These minor fixes were found while digging into other
issues: a weirdly named variable and bogus MTU handling.
Fix it up.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
====================
Link: https://lore.kernel.org/r/20231209-rtl8366rb-mtu-fix-v1-0-df863e2b2b2a@linaro.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The MTU callbacks are in layer 1 size, so for example 1500
bytes is a normal setting. Cache this size, and only add
the layer 2 framing right before choosing the setting. On
the CPU port this will however include the DSA tag since
this is transmitted from the parent ethernet interface!
Add the layer 2 overhead such as ethernet and VLAN framing
and FCS before selecting the size in the register.
This will make the code easier to understand.
The rtl8366rb_max_mtu() callback returns a bogus MTU
just subtracting the CPU tag, which is the only thing
we should NOT subtract. Return the correct layer 1
max MTU after removing headers and checksum.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Rename the register name to RTL8366RB instead of the bogus
RTL8368S (internal product name?)
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Because rose_ioctl() accesses sk->sk_receive_queue
without holding a sk->sk_receive_queue.lock, it can
cause a race with rose_accept().
A use-after-free for skb occurs with the following flow.
```
rose_ioctl() -> skb_peek()
rose_accept() -> skb_dequeue() -> kfree_skb()
```
Add sk->sk_receive_queue.lock to rose_ioctl() to fix this issue.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Link: https://lore.kernel.org/r/20231209100538.GA407321@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Because do_vcc_ioctl() accesses sk->sk_receive_queue
without holding a sk->sk_receive_queue.lock, it can
cause a race with vcc_recvmsg().
A use-after-free for skb occurs with the following flow.
```
do_vcc_ioctl() -> skb_peek()
vcc_recvmsg() -> skb_recv_datagram() -> skb_free_datagram()
```
Add sk->sk_receive_queue.lock to do_vcc_ioctl() to fix this issue.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Link: https://lore.kernel.org/r/20231209094210.GA403126@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
$ modinfo dnsresolver dns_resolver | grep name
modinfo: ERROR: Module dnsresolver not found.
filename: /lib/modules/6.1.0-9-amd64/kernel/net/dns_resolver/dns_resolver.ko
name: dns_resolver
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Link: https://lore.kernel.org/r/gh4sxphjxbo56n2spgmc66vtazyxgiehpmv5f2gkvgicy6f4rs@tarta.nabijaczleweli.xyz
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The driver is using iowriteXX()/ioreadXX() APIs which are LE IO
accessors simplified as
1. Convert given value _from_ CPU _to_ LE
2. Write it to the device as is
The dev_addr is a byte stream, but because the driver uses 16-bit
IO accessors, it wants to perform double conversion on BE CPUs,
but it took it wrong, as it effectivelly does two times _from_ CPU
_to_ LE. What it has to do is to consider dev_addr as an array of
LE16 and hence do _from_ LE _to_ CPU conversion, followed by implied
_from_ CPU _to_ LE in the iowrite16().
To achieve that, use get_unaligned_le16(). This will make it correct
and allows to avoid sparse warning as reported by LKP.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312030058.hfZPTXd7-lkp@intel.com/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231208153327.3306798-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add some missing(not all) attributes in devlink.yaml.
Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Suggested-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231208182515.1206616-1-swarupkotikalapudi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pedro Tammela says:
====================
net/sched: conditional notification of events for cls and act
This is an optimization we have been leveraging on P4TC but we believe
it will benefit rtnl users in general.
It's common to allocate an skb, build a notification message and then
broadcast an event. In the absence of any user space listeners, these
resources (cpu and memory operations) are wasted. In cases where the subsystem
is lockless (such as in tc-flower) this waste is more prominent. For the
scenarios where the rtnl_lock is held it is not as prominent.
The idea is simple. Build and send the notification iif:
- The user requests via NLM_F_ECHO or
- Someone is listening to the rtnl group (tc mon)
On a simple test with tc-flower adding 1M entries, using just a single core,
there's already a noticeable difference in the cycles spent in tc_new_tfilter
with this patchset.
before:
- 43.68% tc_new_tfilter
+ 31.73% fl_change
+ 6.35% tfilter_notify
+ 1.62% nlmsg_notify
0.66% __tcf_qdisc_find.part.0
0.64% __tcf_chain_get
0.54% fl_get
+ 0.53% tcf_proto_lookup_ops
after:
- 39.20% tc_new_tfilter
+ 34.58% fl_change
0.69% __tcf_qdisc_find.part.0
0.67% __tcf_chain_get
+ 0.61% tcf_proto_lookup_ops
Note, the above test is using iproute2:tc which execs a shell.
We expect people using netlink directly to observe even greater
reductions.
The qdisc side needs some refactoring of the notification routines to fit in
this new model, so they will be sent in a later patchset.
====================
Link: https://lore.kernel.org/r/20231208192847.714940-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As of today tc-filter/chain events are unconditionally built and sent to
RTNLGRP_TC. As with the introduction of rtnl_notify_needed we can check
before-hand if they are really needed. This will help to alleviate
system pressure when filters are concurrently added without the rtnl
lock as in tc-flower.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231208192847.714940-8-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This argument is never called while set to true, so remove it as there's
no need for it.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231208192847.714940-7-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As of today tc-action events are unconditionally built and sent to
RTNLGRP_TC. As with the introduction of rtnl_notify_needed we can check
before-hand if they are really needed.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231208192847.714940-6-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use max() in a couple of places that are open coding it with the
ternary operator.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231208192847.714940-5-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a convenience helper for routines handling conditional rtnl
events, that is code that might send a notification depending on
rtnl_has_listeners/rtnl_notify_needed.
Instead of:
if (skb)
rtnetlink_send(...)
Use:
rtnetlink_maybe_send(...)
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231208192847.714940-4-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Building on the rtnl_has_listeners helper, add the rtnl_notify_needed
helper to check if we can bail out early in the notification routines.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231208192847.714940-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As of today, rtnl code creates a new skb and unconditionally fills and
broadcasts it to the relevant group. For most operations this is okay
and doesn't waste resources in general.
When operations are done without the rtnl_lock, as in tc-flower, such
skb allocation, message fill and no-op broadcasting can happen in all
cores of the system, which contributes to system pressure and wastes
precious cpu cycles when no one will receive the built message.
Introduce this helper so rtnetlink operations can simply check if someone
is listening and then proceed if necessary.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231208192847.714940-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|