https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
wireless features, notably
* stack
- free SKBTX_WIFI_STATUS flag
- fixes for VLAN multicast in multi-link
- improve codel parameters (revert some old twiddling)
* ath12k
- Enable AHB support for IPQ5332.
- Add monitor interface support to QCN9274.
- Add MLO support to WCN7850.
- Add 802.11d scan offload support to WCN7850.
* ath11k
- Restore hibernation support
* iwlwifi
- EMLSR on two 5 GHz links
* mwifiex
- cleanups/refactoring
along with many other small features/cleanups
* tag 'wireless-next-2025-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (177 commits)
Revert "wifi: iwlwifi: clean up config macro"
wifi: iwlwifi: move phy_filters to fw_runtime
wifi: iwlwifi: pcie: make sure to lock rxq->read
wifi: iwlwifi: add definitions for iwl_mac_power_cmd version 2
wifi: iwlwifi: clean up config macro
wifi: iwlwifi: mld: simplify iwl_mld_rx_fill_status()
wifi: iwlwifi: mld: rx: simplify channel handling
wifi: iwlwifi: clean up band in RX metadata
wifi: iwlwifi: mld: skip unknown FW channel load values
wifi: iwlwifi: define API for external FSEQ images
wifi: iwlwifi: mld: allow EMLSR on separated 5 GHz subbands
wifi: iwlwifi: mld: use cfg80211_chandef_get_width()
wifi: iwlwifi: mld: fix iwl_mld_emlsr_disallowed_with_link() return
wifi: iwlwifi: mld: clarify variable type
wifi: iwlwifi: pcie: add support for the reset handshake in MSI
wifi: mac80211_hwsim: Prevent tsf from setting if beacon is disabled
wifi: mac80211: restructure tx profile retrieval for MLO MBSSID
wifi: nl80211: add link id of transmitted profile for MLO MBSSID
wifi: ieee80211: Add helpers to fetch EMLSR delay and timeout values
wifi: mac80211: update ML STA with EML capabilities
...
====================
Link: https://patch.msgid.link/20250506174656.119970-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jiri Pirko says:
====================
devlink: sanitize variable typed attributes
This is a continuation based on the first two patches of
https://lore.kernel.org/20250425214808.507732-1-saeed@kernel.org
Better to take it as a separate patchset, as the rest of the original
patchset is probably settled.
This patchset is taking care of incorrect usage of internal NLA_* values
in uapi, introduces new enum (in patch #2) that shadows NLA_* values and
makes that part of UAPI.
The last two patches remove unnecessary translations while maintaining
a clear devlink param driver API.
====================
Link: https://patch.msgid.link/20250505114513.53370-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use newly introduced DEVLINK_VAR_ATTR_TYPE_* enum values instead of
internal NLA_* in fmsg health reporter code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Assign DEVLINK_PARAM_TYPE_* enum values to DEVLINK_VAR_ATTR_TYPE_* to
ensure the same values are used internally and in UAPI. Benefit from
that by removing the value translations.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-4-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Devlink param and health reporter fmsg use attributes with dynamic type
which is determined according to a different type. Currently used values
are NLA_*. The problem is, they are not part of UAPI. They may change
which would cause a break.
To make this future safe, introduce an enum that shadows NLA_* values
and is part of UAPI.
This also makes it possible to carry types that are unrelated to NLA_*
values.
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-3-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In case the enum has holes, instead of a hard stop, generate a
validation callback to check for valid enum values.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
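
As a rough illustration of the idea in the commit above (not the code
generator's actual output), the Python sketch below detects whether an
enum's values leave holes and, if so, builds a validation callback that
accepts only the defined values. The function names and the sample values
are hypothetical, and the value list is assumed to be non-empty.

```python
def has_holes(values):
    """Return True if the enum values do not form one contiguous range."""
    ordered = sorted(set(values))
    return ordered != list(range(ordered[0], ordered[-1] + 1))


def make_validator(values):
    """Build a callback that accepts only values defined by the enum.

    A contiguous enum only needs a min/max range check. An enum with
    holes needs a per-value check, which is what the callback provides.
    """
    ordered = sorted(set(values))
    if not has_holes(ordered):
        lo, hi = ordered[0], ordered[-1]
        return lambda v: lo <= v <= hi
    allowed = frozenset(ordered)
    return lambda v: v in allowed


# Hypothetical enum with a hole at 4.
validate = make_validator([1, 2, 3, 5, 6])
```

With these sample values, `validate(5)` accepts the value while
`validate(4)` rejects it, which a plain range check could not do.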
|
|
Fix ethtool syntax for setting ntuple rule into rss. It should be
`context' instead of `action'.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250503043007.857215-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Michael Klein says:
====================
net: phy: realtek: Add support for PHY LEDs
Changes in V7:
- Remove some unused macros (patch 1)
- Add more register defines for RTL8211F (patch 3)
- Revise macro definition order once more (patch 4)
Changes in V6:
- fix macro definition order (patch 1)
- introduce two more register defines (patch 2)
Changes in V5:
- Split cleanup patch and improve code formatting
Changes in V4:
- Change (!ret) to (ret == 0)
- Replace set_bit() by __set_bit()
Changes in V3:
- move definition of rtl8211e_read_ext_page() to patch 2
- Wrap overlong lines
Changes in V2:
- Designate to net-next
- Add ExtPage access cleanup patch as suggested by Andrew Lunn
====================
Link: https://patch.msgid.link/20250504172916.243185-1-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Like the RTL8211F, the RTL8211E PHY supports up to three LEDs.
Add netdev trigger support for them, too.
Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-7-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
rtl8211f_led_hw_control_get() does not need atomic bit operations,
replace set_bit() by __set_bit().
Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-6-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Group macro definitions by PHY in lexicographic order. Within each PHY
block, definitions are ordered by page number and then register number.
Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-5-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add some more defines for RTL8211F page and register numbers.
Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-4-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Factor out RTL8211E extension page access code to
rtl821x_modify_ext_page() and clean up rtl8211e_config_init()
Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-3-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
These macros have been there since the first revision but were never
used, so let's just remove them.
Signed-off-by: Michael Klein <michael@fossekall.de>
Link: https://patch.msgid.link/20250504172916.243185-2-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Apparently, nf_conntrack_bridge changes the way in which fragments
are handled, leading to packet drop. From Huajian Yang.
2) Add a selftest to stress the conntrack subsystem, from Florian Westphal.
3) nft_quota depletion is off-by-one byte, Zhongqiu Duan.
4) Rewrites the procfs to read the conntrack table to speed it up,
from Florian Westphal.
5) Two patches to prevent overflow in nft_pipapo lookup table and to
clamp the maximum bucket size.
6) Update nft_fib selftest to check for loopback packet bypass.
From Florian Westphal.
netfilter pull request 25-05-06
* tag 'nf-next-25-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
selftests: netfilter: nft_fib.sh: check lo packets bypass fib lookup
netfilter: nft_set_pipapo: clamp maximum map bucket size to INT_MAX
netfilter: nft_set_pipapo: prevent overflow in lookup table allocation
netfilter: nf_conntrack: speed up reads from nf_conntrack proc file
netfilter: nft_quota: match correctly when the quota just depleted
selftests: netfilter: add conntrack stress test
netfilter: bridge: Move specific fragmented packet to slow_path instead of dropping it
====================
Link: https://patch.msgid.link/20250505234151.228057-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix the tracking of rtnl_link_stats.tx_dropped. The counter
`tmi.drop.frames` is being double counted, whereas the counter
`tti.cm_drop.frames` is being skipped.
Fixes: f2957147ae7a ("eth: fbnic: add support for TTI HW stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250503020145.1868252-1-mohsin.bashr@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ksft runner sends 2 SIGTERMs in a row if a test runs out of time.
Handle this in a similar way we handle SIGINT - cleanup and stop
running further tests.
Because we get 2 signals, we need a bit of logic to ignore
the second one; they come immediately one after the other
(due to commit 9616cb34b08e ("kselftest/runner.sh: Propagate SIGTERM
to runner child")).
This change makes sure we run cleanup (scheduled defer()s)
and also print a stack trace on SIGTERM, which doesn't happen
by default. Tests occasionally hang in NIPA and it's impossible
to tell what they are waiting for or doing.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250503011856.46308-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
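
The handling described in the commit above can be sketched roughly as
follows. This is a minimal standalone model, not the selftest library's
code: `TerminateRequested`, `defer`, `flush_defers`, `on_sigterm`,
`install_handler` and `run_tests` are all names made up for the
illustration.

```python
import signal
import traceback


class TerminateRequested(Exception):
    """Raised by the first SIGTERM so the test loop can unwind."""


_deferred = []     # cleanup callbacks, run in reverse order
_term_count = 0    # number of SIGTERMs seen so far


def defer(func, *args):
    """Schedule a cleanup callback."""
    _deferred.append((func, args))


def flush_defers():
    """Run the scheduled cleanups, most recent first."""
    while _deferred:
        func, args = _deferred.pop()
        func(*args)


def on_sigterm(signum, frame):
    """Act on the first SIGTERM only; ignore the immediate repeat."""
    global _term_count
    _term_count += 1
    if _term_count == 1:
        traceback.print_stack(frame)   # show where the test was stuck
        raise TerminateRequested()


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, on_sigterm)


def run_tests(tests):
    """Run tests in order; on SIGTERM clean up and skip the rest."""
    completed = []
    try:
        for test in tests:
            test()
            completed.append(test.__name__)
    except TerminateRequested:
        pass
    finally:
        flush_defers()
    return completed
```

Counting the signals rather than restoring the default handler keeps the
second SIGTERM from interrupting the cleanups that the first one
triggered.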
|
|
'net-ibmveth-make-ibmveth-use-new-reset-function-and-new-kunit-testsg'
Dave Marquardt says:
====================
net: ibmveth: Make ibmveth use new reset function and new KUnit tests
- Fixed struct ibmveth_adapter indentation
- Made ibmveth driver use WARN_ON with recovery rather than BUG_ON. Some
recovery code schedules a reset through new function ibmveth_reset. Also
removed a conflicting and unneeded forward declaration.
- Added KUnit tests for some areas changed by the WARN_ON changes.
====================
Link: https://patch.msgid.link/20250501194944.283729-1-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Added KUnit tests for ibmveth_remove_buffer_from_pool and
ibmveth_rxq_get_buffer under new IBMVETH_KUNIT_TEST config option.
Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501194944.283729-4-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Reset the adapter through new function ibmveth_reset, called in
WARN_ON situations. Removed conflicting and unneeded forward
declaration.
Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Link: https://patch.msgid.link/20250501194944.283729-3-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Made struct ibmveth_adapter follow indentation rules
Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501194944.283729-2-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In handle_tx_copy, TX batching processes packets below ~PAGE_SIZE and
batches up to 64 messages before calling sock->sendmsg.
Currently, when there are no more messages on the ring to dequeue,
handle_tx_copy re-enables kicks on the ring *before* firing off the
batch sendmsg. However, sock->sendmsg incurs a non-zero delay,
especially if it needs to wake up a thread (e.g., another vhost worker).
If the guest submits additional messages immediately after the last ring
check and disablement, it triggers an EPT_MISCONFIG vmexit to attempt to
kick the vhost worker. This may happen while the worker is still
processing the sendmsg, leading to wasteful exit(s).
This is particularly problematic for single-threaded guest submission
threads, as they must exit, wait for the exit to be processed
(potentially involving a TTWU), and then resume.
In scenarios like a constant stream of UDP messages, this results in a
sawtooth pattern where the submitter frequently vmexits, and the
vhost-net worker alternates between sleeping and waking.
A common solution is to configure vhost-net busy polling via userspace
(e.g., qemu poll-us). However, treating the sendmsg as the "busy"
period by keeping kicks disabled during the final sendmsg and
performing one additional ring check afterward provides a significant
performance improvement without any excess busy poll cycles.
If messages are found in the ring after the final sendmsg, requeue the
TX handler. This ensures fairness for the RX handler and allows
vhost_run_work_list to cond_resched() as needed.
Test Case
TX VM: taskset -c 2 iperf3 -c rx-ip-here -t 60 -p 5200 -b 0 -u -i 5
RX VM: taskset -c 2 iperf3 -s -p 5200 -D
6.12.0, each worker backed by tun interface with IFF_NAPI setup.
Note: TCP side is largely unchanged as that was copy bound
6.12.0 unpatched
EPT_MISCONFIG/second: 5411
Datagrams/second: ~382k
Interval Transfer Bitrate Lost/Total Datagrams
0.00-30.00 sec 15.5 GBytes 4.43 Gbits/sec 0/11481630 (0%) sender
6.12.0 patched
EPT_MISCONFIG/second: 58 (~93x reduction)
Datagrams/second: ~650k (~1.7x increase)
Interval Transfer Bitrate Lost/Total Datagrams
0.00-30.00 sec 26.4 GBytes 7.55 Gbits/sec 0/19554720 (0%) sender
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20250501020428.1889162-1-jon@nutanix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
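
The ordering change described in the commit above can be modelled with a
small toy simulation. This is only a sketch under simplifying assumptions:
the ring holds fewer messages than the batch limit, a "kick" stands in for
an EPT_MISCONFIG exit, and `Ring`, `guest_submit` and `handle_tx` are
hypothetical names rather than vhost-net functions.

```python
from collections import deque


class Ring:
    """Toy TX ring: the guest only kicks (exits) while kicks are enabled."""

    def __init__(self, messages):
        self.queue = deque(messages)
        self.kicks_enabled = False
        self.kicks = 0

    def guest_submit(self, msg):
        self.queue.append(msg)
        if self.kicks_enabled:
            self.kicks += 1      # models one wasteful vmexit


def handle_tx(ring, sendmsg, patched, batch_limit=64):
    """Drain the ring into one batch; return True to requeue the handler."""
    batch = []
    while ring.queue and len(batch) < batch_limit:
        batch.append(ring.queue.popleft())

    if not patched:
        # Old order: re-enable kicks first, then do the slow send.
        ring.kicks_enabled = True
        sendmsg(batch)
        return False

    # New order: send with kicks still disabled, then check the ring again.
    sendmsg(batch)
    ring.kicks_enabled = True
    if ring.queue:
        ring.kicks_enabled = False
        return True              # work arrived during the send
    return False
```

If the `sendmsg` callback calls `ring.guest_submit()` to imitate a guest
that submits while the send is in flight, the unpatched order records a
kick, while the patched order records none and asks for a requeue instead.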
|
|
Use of strcpy is deprecated; replace the use of strcpy with strscpy as
recommended.
strscpy was chosen as it requires a NUL-terminated non-padded string,
which is the case here.
I am aware there is an explicit bounds check above the second instance,
however using strscpy protects against buffer overflows in any future
code, and there is no good reason I can see to not use it.
I have also replaced the strscpy above that had 3 params with the
version using 2 params. These are functionally equivalent, but it is
cleaner to have both using 2 params.
Signed-off-by: Ruben Wauters <rubenru09@aol.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501202935.46318-1-rubenru09@aol.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Maxime Chevallier says:
====================
net: ethtool: Introduce ethnl dump helpers
This is V8 for per-phy DUMP helpers, improving support for ->dumpit()
operations for PHY targeting commands.
This V8 fixes some issues spotted by Jakub (thanks!) on the multi-part
DUMP sequence. The netdev reftracking was reworked to make sure that
during a filtered DUMP, we only keep a ref on the netdev during
individual .dumpit() calls.
v1: https://lore.kernel.org/20250305141938.319282-1-maxime.chevallier@bootlin.com
v2: https://lore.kernel.org/20250308155440.267782-1-maxime.chevallier@bootlin.com
v3: https://lore.kernel.org/20250313182647.250007-1-maxime.chevallier@bootlin.com
v4: https://lore.kernel.org/20250324104012.367366-1-maxime.chevallier@bootlin.com
v5: https://lore.kernel.org/20250410123350.174105-1-maxime.chevallier@bootlin.com
v6: https://lore.kernel.org/20250415085155.132963-1-maxime.chevallier@bootlin.com
v7: https://lore.kernel.org/20250422161717.164440-1-maxime.chevallier@bootlin.com
====================
Link: https://patch.msgid.link/20250502085242.248645-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move away from dev_hold and use netdev_hold with a local reftracker when
performing a DUMP on each netdev.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250502085242.248645-4-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that we have an infrastructure in ethnl for per-PHY DUMPs, we can get
rid of the custom ->doit and ->dumpit to deal with PHY listing commands.
As most of the code was custom, this basically means re-writing how we
deal with PHY listing.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250502085242.248645-3-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ethnl commands that target a phy_device need a DUMP implementation that
will fill the reply for every PHY behind a netdev. We therefore need to
iterate over the dev->topo to list them.
When multiple PHYs are behind the same netdev, it's also useful to
perform DUMP with a filter on a given netdev, to get the capability of
every PHY.
Implement dedicated genl ->start(), ->dumpit() and ->done() operations
for PHY-targeting commands, allowing filtered dumps and using a dump
context that keeps track of the PHY iteration for multi-message dumps.
PSE-PD and PLCA are converted to this new set of ops along the way.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250502085242.248645-2-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
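
To illustrate the dump context mentioned in the commit above, here is a
minimal sketch of a resumable dump over the PHYs behind one netdev. It
only models the bookkeeping: `DumpContext`, `dumpit`, `dump_all`, the
reply layout and the `max_replies` limit are assumptions made for the
example, not the ethnl implementation.

```python
class DumpContext:
    """State kept between dumpit calls (set up by the start step)."""

    def __init__(self, phys):
        self.phys = list(phys)
        self.next_slot = 0       # where the next dumpit call resumes


def dumpit(ctx, max_replies):
    """Fill at most max_replies replies, resuming after the last one sent."""
    replies = []
    while ctx.next_slot < len(ctx.phys) and len(replies) < max_replies:
        replies.append({"slot": ctx.next_slot, "phy": ctx.phys[ctx.next_slot]})
        ctx.next_slot += 1
    return replies


def dump_all(phys, max_replies):
    """Drive a multi-message dump: start, repeated dumpit, then done."""
    ctx = DumpContext(phys)                  # start
    messages = []
    while True:
        replies = dumpit(ctx, max_replies)   # one message per call
        if not replies:
            break                            # done
        messages.append(replies)
    return messages
```

Because the position lives in the context, each call picks up exactly
where the previous message ended, so no PHY is skipped or repeated.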
|
|
Clean up two build warnings:
[1]
iou-zcrx.c: In function ‘process_recvzc’:
iou-zcrx.c:263:37: warning: too many arguments for format [-Wformat-extra-args]
263 | error(1, 0, "payload mismatch at ", i);
| ^~~~~~~~~~~~~~~~~~~~~~
[2] Use "%zd" for the ssize_t type, which is the better fit
iou-zcrx.c: In function ‘run_client’:
iou-zcrx.c:357:47: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘ssize_t’ {aka ‘long int’} [-Wformat=]
357 | error(1, 0, "send(): %d", sent);
| ~^ ~~~~
| | |
| int ssize_t {aka long int}
| %ld
Signed-off-by: Haiyue Wang <haiyuewa@163.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250502175136.1122-1-haiyuewa@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
selftests: mptcp: increase code coverage
Here are various patches slightly improving MPTCP code coverage:
- Patch 1: avoid a harmless 'grep: write error' warning.
- Patch 2: use getaddrinfo() with IPPROTO_MPTCP in more places.
- Patch 3-6: prepare and add support to get info for a specific subflow
when giving the 5-tuple.
- Patch 7: validate the previous patch and cover "subflow_get_info_size"
in the kernel code (mptcp_diag.c).
====================
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-0-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds chk_dump_subflow to diag.sh. The subflow's info can be
obtained through "ss -tin"; 'mptcp_diag' is then used to verify the
token in subflow_info.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/524
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-7-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds 'get_subflow_info' in 'mptcp_diag', which can check whether
a TCP connection is an MPTCP subflow based on the "INET_ULP_INFO_MPTCP"
with tcp_diag method.
The helper 'print_subflow_info' in 'mptcp_diag' can print the subflow_filed
of an MPTCP subflow for further checking the 'subflow_info' through
inet_diag method.
The example of the whole output should be:
$ ./mptcp_diag -s "127.0.0.1:10000 127.0.0.1:38984"
127.0.0.1:10000 -> 127.0.0.1:38984
It's a mptcp subflow, the subflow info:
flags:Mec token:0000(id:0)/4278e77e(id:0) seq:9288466187236176036 \
sfseq:1 ssnoff:2317083055 maplen:215
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-6-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch introduces the '__u32 proto' variable to the 'send_query' and
'recv_nlmsg' functions so that they can be extended further.
In the 'send_query' function, the inclusion of this variable makes the
structure clearer and more readable.
In the 'recv_nlmsg' function, the '__u32 proto' variable ensures that
the 'diag_info' field remains unmodified when processing IPPROTO_TCP data,
thereby preventing unintended transformation into 'mptcp_info' format.
While at it, increment iovlen directly when an item is added to simplify
this portion of the code and improve its readability.
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-5-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch uses 'inet_diag_req_v2' instead of 'token' as the parameter of
send_query, and constructs the req in 'get_mptcpinfo'.
This modification enhances the clarity of the code, and prepares for
dump_subflow_info.
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-4-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds a struct named 'params' to save 'target_token' and other
future parameters. This structure facilitates future function expansions.
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-3-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
getaddrinfo() support for MPTCP was recently added to glibc, and
IPPROTO_MPTCP is already used for getaddrinfo() in mptcp_connect.c. But
mptcp_sockopt.c and mptcp_inq.c still use IPPROTO_TCP for getaddrinfo(),
so this patch updates them.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-2-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mptcp_lib_get_info_value() will only print the first entry that matches
the filter because of the ';q' at the end. As a consequence, the 'sed'
command could finish before the previous 'grep' one and print a 'write
error' warning because it is trying to write data to the closed pipe.
Such warnings are not interesting, they can be hidden by muting stderr
here for grep.
While at it, clearly indicate that mptcp_lib_get_info_value() will only
print the first matched entry to avoid confusion later on.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-1-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sctp_assoc_del_peer() last use was removed in 2015 by
commit 73e6742027f5 ("sctp: Do not try to search for the transport twice")
which now uses rm_peer instead of del_peer.
sctp_chunk_iif() last use was removed in 2016 by
commit 1f45f78f8e51 ("sctp: allow GSO frags to access the chunk too")
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250501233815.99832-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor to check if the fwnode we got is correct and return if so,
otherwise do additional checks. Using the same pattern in all conditionals
makes it slightly easier to read and understand.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20250430143802.3714405-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The last use of __strp_unpause() was removed in 2022 by
commit 84c61fe1a75b ("tls: rx: do not use the standard strparser")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501002402.308843-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-05-02
We've added 14 non-merge commits during the last 10 day(s) which contain
a total of 13 files changed, 740 insertions(+), 121 deletions(-).
The main changes are:
1) Avoid skipping or repeating a sk when using a UDP bpf_iter,
from Jordan Rife.
2) Fixed a crash when a bpf qdisc is set in
the net.core.default_qdisc, from Amery Hung.
3) A few other fixes in the bpf qdisc, from Amery Hung.
- Always call qdisc_watchdog_init() in the .init prologue such that
the .reset/.destroy epilogue can always call qdisc_watchdog_cancel()
without issue.
- bpf_qdisc_init_prologue() was incorrectly returning an error
when the bpf qdisc is set as the default_qdisc and the mq is creating
the default_qdisc. It is now fixed.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: Cleanup bpf qdisc selftests
selftests/bpf: Test attaching a bpf qdisc with incomplete operators
bpf: net_sched: Make some Qdisc_ops ops mandatory
selftests/bpf: Test setting and creating bpf qdisc as default qdisc
bpf: net_sched: Fix bpf qdisc init prologue when set as default qdisc
selftests/bpf: Add tests for bucket resume logic in UDP socket iterators
selftests/bpf: Return socket cookies from sock_iter_batch progs
bpf: udp: Avoid socket skips and repeats during iteration
bpf: udp: Use bpf_udp_iter_batch_item for bpf_udp_iter_state batch items
bpf: udp: Get rid of st_bucket_done
bpf: udp: Make sure iter->batch always contains a full bucket snapshot
bpf: udp: Make mem flags configurable through bpf_iter_udp_realloc_batch
bpf: net_sched: Fix using bpf qdisc as default qdisc
selftests/bpf: Fix compilation errors
====================
Link: https://patch.msgid.link/20250503010755.4030524-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With reverted fix:
PASS: fib expression did not cause unwanted packet drops
[ 37.285169] ns1-KK76Kt nft_rpfilter: IN=lo OUT= MAC=00:00:00:00:00:00:00:00:00:00:00:00:08:00 SRC=127.0.0.1 DST=127.0.0.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=32287 DF PROTO=ICMP TYPE=8 CODE=0 ID=1818 SEQ=1
FAIL: rpfilter did drop packets
FAIL: ns1-KK76Kt cannot reach 127.0.0.1, ret 0
Check for this.
Link: https://lore.kernel.org/netfilter/20250422114352.GA2092@breakpoint.cc/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Otherwise, it is possible to hit WARN_ON_ONCE in __kvmalloc_node_noprof()
when resizing hashtable because __GFP_NOWARN is unset.
Similar to:
b541ba7d1f5a ("netfilter: conntrack: clamp maximum hashtable size to INT_MAX")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When calculating the lookup table size, ensure the following
multiplication does not overflow:
- desc->field_len[] maximum value is U8_MAX multiplied by
NFT_PIPAPO_GROUPS_PER_BYTE(f) that can be 2, worst case.
- NFT_PIPAPO_BUCKETS(f->bb) is 2^8, worst case.
- sizeof(unsigned long), from sizeof(*f->lt), lt in
struct nft_pipapo_field.
Then, use check_mul_overflow() to multiply by the bucket size and
check_add_overflow() to add the alignment for avx2 (if needed). Finally,
add the lt_size_check_overflow() helper and use it to consolidate this.
While at it, switch the leftover allocation in pipapo_resize() from
GFP_KERNEL to GFP_KERNEL_ACCOUNT for consistency.
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
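
The size calculation described in the commit above can be sketched as
follows. Python integers do not wrap, so the overflow helpers here compare
against an explicit limit; the 64-bit `SIZE_MAX`, the tuple return
convention, and the `lookup_table_size` name and parameters are
assumptions for the illustration, not the kernel helper's signature.

```python
SIZE_MAX = 2**64 - 1    # assumes a 64-bit size type


def check_mul_overflow(a, b, limit=SIZE_MAX):
    """Return (overflowed, product)."""
    product = a * b
    return product > limit, product


def check_add_overflow(a, b, limit=SIZE_MAX):
    """Return (overflowed, total)."""
    total = a + b
    return total > limit, total


def lookup_table_size(groups, buckets, bucket_size, entry_size=8, align_extra=0):
    """Compute groups * buckets * entry_size * bucket_size + align_extra.

    Returns None instead of a wrapped value if a step would overflow.
    """
    # Bounded by construction: at most 510 groups, 256 buckets, 8 bytes.
    base = groups * buckets * entry_size
    overflow, size = check_mul_overflow(base, bucket_size)
    if overflow:
        return None
    overflow, size = check_add_overflow(size, align_extra)
    if overflow:
        return None
    return size
```

Only the multiplication by the bucket size and the final addition are
checked, since the first three factors are bounded as listed in the
commit message.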
|
|
Dumping all conntrack entries via the proc interface can take hours due
to the linear search needed to skip the entries dumped so far in each
cycle.
Apply the same strategy used to speed up ipvs proc reading, done in
commit 178883fd039d ("ipvs: speed up reads from ip_vs_conn proc file"),
to nf_conntrack.
Note that the ctnetlink interface doesn't suffer from this problem, but
many scripts depend on the nf_conntrack proc interface.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
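
The cost difference described in the commit above can be shown with a toy
hash table. This sketch compares a dump that restarts from the first
bucket on every read with one that keeps its position between reads. The
function names, the list-of-lists table and the step counting are
illustrative assumptions, and the model ignores concurrent table changes.

```python
from itertools import chain


def dump_with_linear_skip(table, chunk):
    """Each read restarts at bucket 0 and skips what was already dumped."""
    out, steps = [], 0
    total = sum(len(bucket) for bucket in table)
    while len(out) < total:
        skip = len(out)
        taken = 0
        for entry in chain.from_iterable(table):
            steps += 1
            if skip:
                skip -= 1
                continue
            out.append(entry)
            taken += 1
            if taken == chunk:
                break
    return out, steps


def dump_with_saved_position(table, chunk):
    """Each read resumes from the bucket and offset saved by the last one."""
    out, steps = [], 0
    bucket_idx, offset = 0, 0    # iterator state kept across reads
    while bucket_idx < len(table):
        taken = 0
        while bucket_idx < len(table) and taken < chunk:
            bucket = table[bucket_idx]
            if offset < len(bucket):
                steps += 1
                out.append(bucket[offset])
                offset += 1
                taken += 1
            else:
                bucket_idx, offset = bucket_idx + 1, 0
    return out, steps
```

Both functions return the same entries, but the first visits a number of
entries that grows quadratically with the table size, while the second
visits each entry once.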
|
|
xt_quota compares the skb length with the remaining quota, but nft_quota
compares it with the consumed bytes.
xt_quota can match consumed bytes up to and including the quota, but
nft_quota stops matching when the consumed bytes equal the quota,
i.e., nft_quota matches consumed bytes in [0, quota - 1], not [0, quota].
Fixes: 795595f68d6c ("netfilter: nft_quota: dump consumed quota")
Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
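
The off-by-one described in the commit above comes down to a strict versus
non-strict comparison. The sketch below models the three behaviours with
plain integers; the function names are hypothetical and the real code
works on atomic counters.

```python
def xt_quota_match(remaining, length):
    """xt_quota style: match while the packet fits in the remaining quota."""
    if remaining >= length:
        return True, remaining - length
    return False, 0


def nft_quota_match_old(consumed, quota, length):
    """Old nft_quota: stops matching once consumed bytes reach the quota."""
    consumed += length
    return consumed < quota, consumed


def nft_quota_match_fixed(consumed, quota, length):
    """Fixed nft_quota: still matches when consumed bytes equal the quota."""
    consumed += length
    return consumed <= quota, consumed
```

For a quota of 100 bytes and a single 100-byte packet, `xt_quota_match`
and `nft_quota_match_fixed` both report a match, while
`nft_quota_match_old` does not.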
|
|
Add a new test case to check:
- conntrack_max limit is effective
- conntrack_max limit cannot be exceeded from within a netns
- resizing the hash table while packets are inflight works
- removal of all conntrack rules disables conntrack in netns
- conntrack tool dump (conntrack -L) returns expected number
of (unique) entries
- procfs interface - if available - has same number of entries
as conntrack -L dump
Expected output with selftest framework:
selftests: net/netfilter: conntrack_resize.sh
PASS: got 1 connections: netns conntrack_max is pernet bound
PASS: got 100 connections: netns conntrack_max is init_net bound
PASS: dump in netns had same entry count (-C 1778, -L 1778, -p 1778, /proc 0)
PASS: dump in netns had same entry count (-C 2000, -L 2000, -p 2000, /proc 0)
PASS: test parallel conntrack dumps
PASS: resize+flood
PASS: got 0 connections: conntrack disabled
PASS: got 1 connections: conntrack enabled
ok 1 selftests: net/netfilter: conntrack_resize.sh
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The NF_CONNTRACK_BRIDGE config changes bridge forwarding for
fragmented packets.
The original bridge does not know that a packet is fragmented and
forwards it directly. Once NF_CONNTRACK_BRIDGE is enabled, the functions
nf_br_ip_fragment and br_ip6_fragment check the headroom.
In the original br_forward, an skb with insufficient headroom may indeed
exist, but there is still a way to save the skb in the device driver
after dev_queue_xmit. So dropping the skb changes the original bridge
forwarding behavior in some cases.
Fixes: 3c171f496ef5 ("netfilter: bridge: add connection tracking system")
Signed-off-by: Huajian Yang <huajianyang@asrmicro.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Commit 32607a332cfe ("ipv4: prefer multipath nexthop that matches source
address") changed IPv4 nexthop selection to prefer a nexthop whose
nexthop device is assigned the specified source address for locally
generated traffic.
While the selection honors the "fib_multipath_use_neigh" sysctl and will
not choose a nexthop with an invalid neighbour, it does not honor the
"ignore_routes_with_linkdown" sysctl and can choose a nexthop without a
carrier:
$ sysctl net.ipv4.conf.all.ignore_routes_with_linkdown
net.ipv4.conf.all.ignore_routes_with_linkdown = 1
$ ip route show 198.51.100.0/24
198.51.100.0/24
nexthop via 192.0.2.2 dev dummy1 weight 1
nexthop via 192.0.2.18 dev dummy2 weight 1 dead linkdown
$ ip route get 198.51.100.1 from 192.0.2.17
198.51.100.1 from 192.0.2.17 via 192.0.2.18 dev dummy2 uid 0
Solve this by skipping over nexthops whose assigned hash upper bound is
minus one, which is the value assigned to nexthops that do not have a
carrier when the "ignore_routes_with_linkdown" sysctl is set.
In practice, this probably does not matter a lot as the initial route
lookup for the source address would not choose a nexthop that does not
have a carrier in the first place, but the change does make the code
clearer.
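The selection rule described above can be sketched in userspace as
follows. This is a simplified model and not the kernel code: the struct,
the function name and the IP4() macro are hypothetical, the neighbour
validity check is left out, and the address of dummy1 used in the usage
note is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, simplified stand-in for a multipath nexthop. */
struct model_nexthop {
	int upper_bound;  /* hash upper bound, -1 means "no carrier, ignore" */
	uint32_t saddr;   /* address assigned to the nexthop device */
};

/*
 * Return the index of the chosen nexthop, or -1 if none is usable.
 * A nexthop whose hash range covers 'hash' and whose device owns 'saddr'
 * is preferred. Nexthops with an upper bound of -1 are never considered.
 * A 'saddr' of 0 means that no source address was specified.
 */
int model_select_nexthop(const struct model_nexthop *nh, size_t n,
			 int hash, uint32_t saddr)
{
	int fallback = -1;
	int found = 0;

	assert(nh != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Skip nexthops without a carrier. */
		if (nh[i].upper_bound == -1)
			continue;

		/* Remember a candidate until one matches the source address. */
		if (!found) {
			fallback = (int)i;
			found = !saddr || nh[i].saddr == saddr;
		}

		/* The hash falls outside this nexthop's range. */
		if (hash > nh[i].upper_bound)
			continue;

		if (!saddr || nh[i].saddr == saddr)
			return (int)i;

		if (found)
			return fallback;
	}

	return fallback;
}
```

With the route from the example (dummy1 up and assumed to own 192.0.2.1,
dummy2 owning 192.0.2.17 but marked with an upper bound of -1), a lookup
from 192.0.2.17 returns index 0, that is dummy1, instead of the nexthop
without a carrier.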
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzkaller reported an out-of-bounds read in ipv6_addr_prefix(),
where the prefix length was over 128.
The cited commit accidentally removed some fib6_config
validation from the ioctl path.
Let's restore the validation.
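As an illustration of why the check matters, here is a minimal userspace
sketch. It is not the kernel code: the struct and both function names are
hypothetical, and the prefix helper only models the general idea of
copying the first plen bits of a 16-byte address.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MODEL_IPV6_ADDR_BITS 128

/* Hypothetical subset of a route config: only the two prefix lengths. */
struct model_fib6_config {
	int dst_len;
	int src_len;
};

/* Reject prefix lengths that cannot describe a 128-bit address. */
int model_fib6_config_validate(const struct model_fib6_config *cfg)
{
	assert(cfg != NULL);

	if (cfg->dst_len < 0 || cfg->dst_len > MODEL_IPV6_ADDR_BITS)
		return -EINVAL;
	if (cfg->src_len < 0 || cfg->src_len > MODEL_IPV6_ADDR_BITS)
		return -EINVAL;

	return 0;
}

/*
 * Copy the first 'plen' bits of 'addr' into 'pfx' and zero the rest.
 * The caller must guarantee 0 <= plen <= 128. With a larger value,
 * 'whole' would be 16 or more and the accesses below would run past
 * the end of the 16-byte address.
 */
void model_addr_prefix(uint8_t pfx[16], const uint8_t addr[16], int plen)
{
	int whole = plen >> 3;  /* number of complete bytes */
	int bits = plen & 0x7;  /* remaining bits in the next byte */

	assert(plen >= 0 && plen <= MODEL_IPV6_ADDR_BITS);

	memset(pfx, 0, 16);
	memcpy(pfx, addr, (size_t)whole);
	if (bits != 0)
		pfx[whole] = addr[whole] & (uint8_t)(0xff00 >> bits);
}
```

Running the validation before any prefix computation keeps the helper
within its contract, so a length such as 129 is rejected with -EINVAL
instead of indexing past the address.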
[0]:
BUG: KASAN: slab-out-of-bounds in ip6_route_info_create (./include/net/ipv6.h:616 net/ipv6/route.c:3814)
Read of size 1 at addr ff11000138020ad4 by task repro/261
CPU: 3 UID: 0 PID: 261 Comm: repro Not tainted 6.15.0-rc3-00614-g0d15a26b247d #87 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:123)
print_report (mm/kasan/report.c:409 mm/kasan/report.c:521)
kasan_report (mm/kasan/report.c:636)
ip6_route_info_create (./include/net/ipv6.h:616 net/ipv6/route.c:3814)
ip6_route_add (net/ipv6/route.c:3902)
ipv6_route_ioctl (net/ipv6/route.c:4523)
inet6_ioctl (net/ipv6/af_inet6.c:577)
sock_do_ioctl (net/socket.c:1190)
sock_ioctl (net/socket.c:1314)
__x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:906 fs/ioctl.c:892 fs/ioctl.c:892)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f518fb2de5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007fff14f38d18 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f518fb2de5d
RDX: 00000000200015c0 RSI: 000000000000890b RDI: 0000000000000003
RBP: 00007fff14f38d30 R08: 0000000000000800 R09: 0000000000000800
R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff14f38e48
R13: 0000000000401136 R14: 0000000000403df0 R15: 00007f518fd3c000
</TASK>
Fixes: fa76c1674f2e ("ipv6: Move some validation from ip6_route_info_create() to rtm_to_fib6_config().")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Reported-by: Yi Lai <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/netdev/aBAcKDEFoN%2FLntBF@ly-workstation/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250501005335.53683-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ever since commit f5f80e32de12 ("ipv6: remove hard coded limitation on
ipv6_pinfo"), protocols have stopped using the old "obj_size -
sizeof(struct ipv6_pinfo)" way of grabbing ipv6_pinfo, which severely
restricted struct layout and caused fun, hard to see issues.
However, mptcp_inet6_sk wasn't fixed (unlike tcp_inet6_sk). Do so.
The non-cloned sockets already do the right thing using
ipv6_pinfo_offset + the generic IPv6 code.
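The difference between the two ways of locating ipv6_pinfo can be
sketched with plain C structs. This is an illustrative model only: all
struct and function names below are hypothetical and the layouts are not
those of the real mptcp or tcp sockets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ipv6_pinfo. */
struct model_ipv6_pinfo {
	uint32_t flow_label;
	uint8_t hop_limit;
};

/* Layout where the pinfo happens to be the last member. */
struct model_sock6_last {
	int base;
	struct model_ipv6_pinfo np;
};

/* Layout where another member follows the pinfo. */
struct model_sock6_middle {
	int base;
	struct model_ipv6_pinfo np;
	uint64_t trailing;
};

/*
 * Old scheme: assume the pinfo sits at the very end of the object.
 * Only correct as long as nothing is ever placed after it.
 */
size_t model_pinfo_offset_by_size(size_t obj_size)
{
	assert(obj_size >= sizeof(struct model_ipv6_pinfo));
	return obj_size - sizeof(struct model_ipv6_pinfo);
}

/* New scheme: ask the compiler where the member actually lives. */
size_t model_pinfo_offset_middle(void)
{
	return offsetof(struct model_sock6_middle, np);
}
```

For model_sock6_last both computations agree, but for model_sock6_middle
the size-based offset points into the trailing member while offsetof()
still yields the position of np.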
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250430154541.1038561-1-pfalcato@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|