Age | Commit message (Collapse) | Author |
|
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now there are tracepoints, that cover all functionality of
tcp_hash_fail(), but also wire up missing places
They are also faster, can be disabled and provide filtering.
This potentially may create a regression if a userspace depends on dmesg
logs. Fingers crossed, let's see if anyone complains in reality.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of forcing userspace to parse dmesg (that's what currently is
happening, at least in codebase of my current company), provide a better
way, that can be enabled/disabled in runtime.
Currently, there are already tcp events, add hashing related ones there,
too. Rasdaemon currently exercises net_dev_xmit_timeout,
devlink_health_report, but it'll be trivial to teach it to deal with
failed hashes. Otherwise, BGP may trace/log them itself. Especially
exciting for possible investigations is key rotation (RNext_key
requests).
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Two reasons:
1. It's grown up enough
2. In order to not do header spaghetti by including
<trace/events/tcp.h>, which is necessary for TCP tracepoints.
While at it, unexport and make static tcp_inbound_ao_hash().
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's going to be used more in TCP-AO tracepoints.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's possible to clean-up some ifdefs by hiding that
tcp_{md5,ao}_needed static branch is defined and compiled only
under related configs, since commit 4c8530dc7d7d ("net/tcp: Only produce
AO/MD5 logs if there are any keys").
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
parameters
The current cbs parameter depends on speed after uplinking,
which is not needed and will report a configuration error
if the port is not initially connected. The UAPI exposed by
tc-cbs requires userspace to recalculate the send slope anyway,
because the formula depends on port_transmit_rate (see man tc-cbs),
which is not an invariant from tc's perspective. Therefore, we
use offload->sendslope and offload->idleslope to derive the
original port_transmit_rate from the CBS formula.
Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TSO currently fails when the skb's gso_type field has more than one bit
set.
TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a
few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes
virtualization, such as QEMU, a real use-case.
The gso_type and gso_size fields as passed from userspace in
virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type
|= SKB_GSO_DODGY to force the packet to enter the software GSO stack
for verification.
This issue might similarly come up when the CWR bit is set in the TCP
header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit
to be set.
Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
v2 - Remove unnecessary comments, remove line break between fixes tag
and signoffs.
v3 - Add back unrelated empty line removal.
Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_sync: fix not using correct handle
- L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
- L2CAP: fix connection setup in l2cap_connect
* tag 'for-net-2024-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: fix connection setup in l2cap_connect
Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
Bluetooth: hci_sync: Fix not using correct handle
====================
Link: https://lore.kernel.org/r/20240610135803.920662-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP as reported by
checkpatch script.
Fixes: 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://lore.kernel.org/r/20240610083426.740660-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
selftests: mptcp: use net/lib.sh to manage netns
The goal of this series is to use helpers from net/lib.sh with MPTCP
selftests.
- Patches 1 to 4 are some clean-ups and preparation in net/lib.sh:
- Patch 1 simplifies the code handling errexit by ignoring possible
errors instead of disabling errexit temporary.
- Patch 2 removes the netns from the list after having cleaned it, not
to try to clean it twice.
- Patch 3 removes the 'readonly' attribute for the netns variable, to
allow using the same name in local variables.
- Patch 4 removes the local 'ns' var, not to conflict with the global
one it needs to setup.
- Patch 5 uses helpers from net/lib.sh to create and delete netns in
MPTCP selftests.
- Patch 6 uses wait_local_port_listen helper from net/net_helper.sh.
====================
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-0-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch includes net_helper.sh into mptcp_lib.sh, uses the helper
wait_local_port_listen() defined in it to implement the similar mptcp
helper. This can drop some duplicate code.
It looks like this helper from net_helper.sh was originally coming from
MPTCP, but MPTCP selftests have not been updated to use it from this
shared place.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-6-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch includes lib.sh into mptcp_lib.sh, uses setup_ns helper
defined in lib.sh to set up namespaces in mptcp_lib_ns_init(), and
uses cleanup_ns to delete namespaces in mptcp_lib_ns_exit().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-5-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The helper setup_ns() doesn't work when a net namespace named "ns" is
passed to it.
For example, in net/mptcp/diag.sh, the name of the namespace is "ns". If
"setup_ns ns" is used in it, diag.sh fails with errors:
Invalid netns name "./mptcp_connect"
Cannot open network namespace "10000": No such file or directory
Cannot open network namespace "10000": No such file or directory
That is because "ns" is also a local variable in setup_ns, and it will
not set the value for the global variable that has been giving in
argument. To solve this, we could rename the variable, but it sounds
better to drop it, as we can resolve the name using the variable passed
in argument instead.
The other local variables -- "ns_list" and "ns_name" -- are more
unlikely to conflict with existing global variables. They don't seem to
be currently used in any other net selftests.
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-4-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It sounds good to mark the global netns variable as 'readonly', but Bash
doesn't allow the creation of local variables with the same name.
Because it looks like 'readonly' is mainly used here to check if a netns
with that name has already been set, it sounds fine to check if a
variable with this name has already been set instead. By doing that, we
avoid having to modify helpers from MPTCP selftests using the same
variable name as the one used to store the created netns name.
While at it, also avoid an unnecessary call to 'eval' to set a local
variable.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-3-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of only appending items to the list, removing them when the
netns has been deleted.
By doing that, we can make sure 'cleanup_all_ns()' is not trying to
remove already deleted netns.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-2-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
No need to disable errexit temporary, simply ignore the only possible
and not handled error.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-next-20240607-selftests-mptcp-net-lib-v1-1-e36986faac94@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'net-core-unify-dstats-with-tstats-and-lstats-implement-generic-dstats-collection'
Jeremy Kerr says:
====================
net: core: Unify dstats with tstats and lstats, implement generic dstats collection
The struct pcpu_dstats ("dstats") has a few variations from the other
two stats types (struct pcpu_sw_netstats and struct pcpu_lstats), and
doesn't have generic helpers for collecting the per-cpu stats into a
struct rtnl_link_stats64.
This change unifies dstats with the other types, adds a collection
implementation to the core, and updates the single driver (vrf) to use
this generic implementation.
Of course, questions/comments/etc are most welcome!
v3: https://lore.kernel.org/r/20240605-dstats-v2-0-7fae03f813f3@codeconstruct.com.au
v2: v1: https://lore.kernel.org/r/20240605-dstats-v1-0-1024396e1670@codeconstruct.com.au
====================
Link: https://lore.kernel.org/r/20240607-dstats-v3-0-cc781fe116f7@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The vrf driver has its own dstats-to-rtnl_link_stats64 collection, but
we now have a generic implementation for dstats collection, so switch to
this.
In doing so, we fix a minor issue where the (non-percpu)
dev->stats->tx_errors value was never collected into rtnl_link_stats64,
as the generic dev_get_dstats64() consumes the starting values from
dev->stats.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240607-dstats-v3-3-cc781fe116f7@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We currently have dev_get_tstats64() for collecting per-cpu stats of
type pcpu_sw_netstats ("tstats"). However, tstats doesn't allow for
accounting tx/rx drops. We do have a stats variant that does have stats
for dropped packets: struct pcpu_dstats, but there are no core helpers
for using those stats.
The VRF driver uses dstats, by providing its own collation/fetch
functions to do so.
This change adds a common implementation for dstats-type collection,
used when pcpu_stat_type == NETDEV_PCPU_STAT_DSTAT. This is based on the
VRF driver's existing stats collator (plus the unused tx_drops stat from
there). We will switch the VRF driver to use this in the next change.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240607-dstats-v3-2-cc781fe116f7@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The pcpu_sw_netstats and pcpu_lstats structs both contain a set of
u64_stats_t fields for individual stats, but pcpu_dstats uses u64s
instead.
Make this consistent by using u64_stats_t across all stats types.
The per-cpu dstats are only used by the vrf driver at present, so update
that driver as part of this change.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240607-dstats-v3-1-cc781fe116f7@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core instead
of this driver.
With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.
Move ip_tunnel driver to leverage the core allocation.
All the ip_tunnel_init() users call ip_tunnel_init() as part of their
.ndo_init callback. The .ndo_init callback is called before the stats
allocation in netdev_register(), thus, the allocation will happen before
the netdev is visible.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240607084420.3932875-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a minor typo in the help text for the NET_DSA_TAG_RTL4_A config
option.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Link: https://lore.kernel.org/r/20240607020843.1380735-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"Misc:
- Restore debugfs behavior of ignoring unknown mount options
- Fix kernel doc for netfs_wait_for_oustanding_io()
- Fix struct statx comment after new addition for this cycle
- Fix a check in find_next_fd()
iomap:
- Fix data zeroing behavior when an extent spans the block that
contains i_size
- Restore i_size increasing in iomap_write_end() for now to avoid
stale data exposure on xfs with a realtime device
Cachefiles:
- Remove unneeded fdtable.h include
- Improve trace output for cachefiles_obj_{get,put}_ondemand_fd()
- Remove requests from the request list to prevent accessing already
freed requests
- Fix UAF when issuing restore command while the daemon is still
alive by adding an additional reference count to requests
- Fix UAF by grabbing a reference during xarray lookup with xa_lock()
held
- Simplify error handling in cachefiles_ondemand_daemon_read()
- Add consistency checks read and open requests to avoid crashes
- Add a spinlock to protect ondemand_id variable which is used to
determine whether an anonymous cachefiles fd has already been
closed
- Make on-demand reads killable allowing to handle broken cachefiles
daemon better
- Flush all requests after the kernel has been marked dead via
CACHEFILES_DEAD to avoid hung-tasks
- Ensure that closed requests are marked as such to avoid reusing
them with a reopen request
- Defer fd_install() until after copy_to_user() succeeded and thereby
get rid of having to use close_fd()
- Ensure that anonymous cachefiles on-demand fds are reused while
they are valid to avoid pinning already freed cookies"
* tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iomap: Fix iomap_adjust_read_range for plen calculation
iomap: keep on increasing i_size in iomap_write_end()
cachefiles: remove unneeded include of <linux/fdtable.h>
fs/file: fix the check in find_next_fd()
cachefiles: make on-demand read killable
cachefiles: flush all requests after setting CACHEFILES_DEAD
cachefiles: Set object to close if ondemand_id < 0 in copen
cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
cachefiles: never get a new anonymous fd if ondemand_id is valid
cachefiles: add spin_lock for cachefiles_ondemand_info
cachefiles: add consistency check for copen/cread
cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
cachefiles: remove requests from xarray during flushing requests
cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
statx: Update offset commentary for struct statx
netfs: fix kernel doc for nets_wait_for_outstanding_io()
debugfs: continue to ignore unknown mount options
|
|
'ip6 dscp set $v' in an nftables outpute route chain has no effect.
While nftables does detect the dscp change and calls the reroute hook.
But ip6_route_me_harder never sets the dscp/flowlabel:
flowlabel/dsfield routing rules are ignored and no reroute takes place.
Thanks to Yi Chen for an excellent reproducer script that I used
to validate this change.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Lion Ackermann reported that there is a race condition between namespace cleanup
in ipset and the garbage collection of the list:set type. The namespace
cleanup can destroy the list:set type of sets while the gc of the set type is
waiting to run in rcu cleanup. The latter uses data from the destroyed set which
thus leads use after free. The patch contains the following parts:
- When destroying all sets, first remove the garbage collectors, then wait
if needed and then destroy the sets.
- Fix the badly ordered "wait then remove gc" for the destroy a single set
case.
- Fix the missing rcu locking in the list:set type in the userspace test
case.
- Use proper RCU list handlings in the list:set type.
The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list flush to cancel_gc).
Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression in swap operation)
Reported-by: Lion Ackermann <nnamrec@gmail.com>
Tested-by: Lion Ackermann <nnamrec@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Check for mandatory netlink attributes in payload and meta expression
when used embedded from the inner expression, otherwise NULL pointer
dereference is possible from userspace.
Fixes: a150d122b6bd ("netfilter: nft_meta: add inner match support")
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Jacob Keller says:
====================
Intel Wired LAN Driver Updates 2024-06-03
This series includes miscellaneous improvements for the ice as well as a
cleanup to the Makefiles for all Intel net drivers.
Andy fixes all of the Intel net driver Makefiles to use the documented
'*-y' syntax for specifying object files to link into kernel driver
modules, rather than the '*-objs' syntax which works but is documented as
reserved for user-space host programs.
Jacob has a cleanup to refactor rounding logic in the ice driver into a
common roundup_u64 helper function.
Michal Schmidt replaces irq_set_affinity_hint() to use
irq_update_affinity_hint() which behaves better with user-applied affinity
settings.
v2: https://lore.kernel.org/r/20240605-next-2024-06-03-intel-next-batch-v2-0-39c23963fa78@intel.com
v1: https://lore.kernel.org/r/20240603-next-2024-06-03-intel-next-batch-v1-0-e0523b28f325@intel.com
====================
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-0-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
irq_set_affinity_hint() is deprecated. Use irq_update_affinity_hint()
instead. This removes the side-effect of actually applying the affinity.
The driver does not really need to worry about spreading its IRQs across
CPUs. The core code already takes care of that.
On the contrary, when the driver applies affinities by itself, it breaks
the users' expectations:
1. The user configures irqbalance with IRQBALANCE_BANNED_CPULIST in
order to prevent IRQs from being moved to certain CPUs that run a
real-time workload.
2. ice reconfigures VSIs at runtime due to a MIB change
(ice_dcb_process_lldp_set_mib_change). Reopening a VSI resets the
affinity in ice_vsi_req_irq_msix().
3. ice has no idea about irqbalance's config, so it may move an IRQ to
a banned CPU. The real-time workload suffers unacceptable latency.
I am not sure if updating the affinity hints is at all useful, because
irqbalance ignores them since 2016 ([1]), but at least it's harmless.
This ice change is similar to i40e commit d34c54d1739c ("i40e: Use
irq_update_affinity_hint()").
[1] https://github.com/Irqbalance/irqbalance/commit/dcc411e7bfdd
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-3-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In ice_ptp_cfg_clkout(), the ice driver needs to calculate the nearest next
second of a current time value specified in nanoseconds. It implements this
using div64_u64, because the time value is a u64. It could use div_u64
since NSEC_PER_SEC is smaller than 32-bits.
Ideally this would be implemented directly with roundup(), but that can't
work on all platforms due to a division which requires using the specific
macros and functions due to platform restrictions, and to ensure that the
most appropriate and fast instructions are used.
The kernel doesn't currently provide any 64-bit equivalents for doing
roundup. Attempting to use roundup() on a 32-bit platform will result in a
link failure due to not having a direct 64-bit division.
The closest equivalent for this is DIV64_U64_ROUND_UP, which does a
division always rounding up. However, this only computes the division, and
forces use of the div64_u64 in cases where the divisor is a 32bit value and
could make use of div_u64.
Introduce DIV_U64_ROUND_UP based on div_u64, and then use it to implement
roundup_u64 which takes a u64 input value and a u32 rounding value.
The name roundup_u64 matches the naming scheme of div_u64, and future
patches could implement roundup64_u64 if they need to round by a multiple
that is greater than 32-bits.
Replace the logic in ice_ptp.c which does this equivalent with the newly
added roundup_u64.
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-2-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
*-objs suffix is reserved rather for (user-space) host programs while
usually *-y suffix is used for kernel drivers (although *-objs works
for that purpose for now).
Let's correct the old usages of *-objs in Makefiles.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-1-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Due to timer wheel implementation, a timer will usually fire
after its schedule.
For instance, for HZ=1000, a timeout between 512ms and 4s
has a granularity of 64ms.
For this range of values, the extra delay could be up to 63ms.
For TCP, this means that tp->rcv_tstamp may be after
inet_csk(sk)->icsk_timeout whenever the timer interrupt
finally triggers, if one packet came during the extra delay.
We need to make sure tcp_rtx_probe0_timed_out() handles this case.
Fixes: e89688e3e978 ("net: tcp: fix unexcepted socket die when snd_wnd is 0")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Menglong Dong <imagedong@tencent.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20240607125652.1472540-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
mptcp: various fixes
The different patches here are some unrelated fixes for MPTCP:
- Patch 1 ensures 'snd_una' is initialised on connect in case of MPTCP
fallback to TCP followed by retransmissions before the processing of
any other incoming packets. A fix for v5.9+.
- Patch 2 makes sure the RmAddr MIB counter is incremented, and only
once per ID, upon the reception of a RM_ADDR. A fix for v5.10+.
- Patch 3 doesn't update 'add addr' related counters if the connect()
was not possible. A fix for v5.7+.
- Patch 4 updates the mailmap file to add Geliang's new email address.
====================
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-0-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Just like my other email addresses, map my new one to kernel.org
account too.
My new email address uses "last name, first name" format, which is
different from my other email addresses. This mailmap is also used
to indicate that it is actually the same person.
Suggested-by: Mat Martineau <martineau@kernel.org>
Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-4-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The creation of new subflows can fail for different reasons. If no
subflow have been created using the received ADD_ADDR, the related
counters should not be updated, otherwise they will never be decremented
for events related to this ID later on.
For the moment, the number of accepted ADD_ADDR is only decremented upon
the reception of a related RM_ADDR, and only if the remote address ID is
currently being used by at least one subflow. In other words, if no
subflow can be created with the received address, the counter will not
be decremented. In this case, it is then important not to increment
pm.add_addr_accepted counter, and not to modify pm.accept_addr bit.
Note that this patch does not modify the behaviour in case of failures
later on, e.g. if the MP Join is dropped or rejected.
The "remove invalid addresses" MP Join subtest has been modified to
validate this case. The broadcast IP address is added before the "valid"
address that will be used to successfully create a subflow, and the
limit is decreased by one: without this patch, it was not possible to
create the last subflow, because:
- the broadcast address would have been accepted even if it was not
usable: the creation of a subflow to this address results in an error,
- the limit of 2 accepted ADD_ADDR would have then been reached.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The RmAddr MIB counter is supposed to be incremented once when a valid
RM_ADDR has been received. Before this patch, it could have been
incremented as many times as the number of subflows connected to the
linked address ID, so it could have been 0, 1 or more than 1.
The "RmSubflow" is incremented after a local operation. In this case,
it is normal to tied it with the number of subflows that have been
actually removed.
The "remove invalid addresses" MP Join subtest has been modified to
validate this case. A broadcast IP address is now used instead: the
client will not be able to create a subflow to this address. The
consequence is that when receiving the RM_ADDR with the ID attached to
this broadcast IP address, no subflow linked to this ID will be found.
Fixes: 7a7e52e38a40 ("mptcp: add RM_ADDR related mibs")
Cc: stable@vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is strictly related to commit fb7a0d334894 ("mptcp: ensure snd_nxt
is properly initialized on connect"). It turns out that syzkaller can
trigger the retransmit after fallback and before processing any other
incoming packet - so that snd_una is still left uninitialized.
Address the issue explicitly initializing snd_una together with snd_nxt
and write_seq.
Suggested-by: Mat Martineau <martineau@kernel.org>
Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the noop_qdisc owner isn't initialized, then it will be 0,
so packets will erroneously be regarded as having been subject
to recursion as long as only CPU 0 queues them. For non-SMP,
that's all packets, of course. This causes a change in what's
reported to userspace, normally noop_qdisc would drop packets
silently, but with this change the syscall returns -ENOBUFS if
RECVERR is also set on the socket.
Fix this by initializing the owner field to -1, just like it
would be for dynamically allocated qdiscs by qdisc_alloc().
Fixes: 0f022d32c3ec ("net/sched: Fix mirred deadlock on device recursion")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240607175340.786bfb938803.I493bf8422e36be4454c08880a8d3703cea8e421a@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/hfcpci.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/hfcmulti.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/hfcsusb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/avmfritz.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/speedfax.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/mISDNinfineon.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/w6692.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/netjet.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/mISDNipac.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/hardware/mISDN/mISDNisar.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/mISDN/mISDN_core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/mISDN/mISDN_dsp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/isdn/mISDN/l1oip.o
Add the missing invocations of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240607-md-drivers-isdn-v1-1-81fb7001bc3a@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-06-06
We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).
The main changes are:
1) Add a user space notification mechanism via epoll when a struct_ops
object is getting detached/unregistered, from Kui-Feng Lee.
2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
tests, from Geliang Tang.
3) Add BTF field (type and string fields, right now) iterator support
to libbpf instead of using existing callback-based approaches,
from Andrii Nakryiko.
4) Extend BPF selftests for the latter with a new btf_field_iter
selftest, from Alan Maguire.
5) Add new kfuncs for a generic, open-coded bits iterator,
from Yafang Shao.
6) Fix BPF selftests' kallsyms_find() helper under kernels configured
with CONFIG_LTO_CLANG_THIN, from Yonghong Song.
7) Remove a bunch of unused structs in BPF selftests,
from David Alan Gilbert.
8) Convert test_sockmap section names into names understood by libbpf
so it can deduce program type and attach type, from Jakub Sitnicki.
9) Extend libbpf with the ability to configure log verbosity
via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.
10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
in nested VMs, from Song Liu.
11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
optimization, from Xiao Wang.
12) Enable BPF programs to declare arrays and struct fields with kptr,
bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
selftests/bpf: Add start_test helper in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
selftests/bpf: Add btf_field_iter selftests
selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
libbpf: Remove callback-based type/string BTF field visitor helpers
bpftool: Use BTF field iterator in btfgen
libbpf: Make use of BTF field iterator in BTF handling code
libbpf: Make use of BTF field iterator in BPF linker code
libbpf: Add BTF field iterator
selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
selftests/bpf: Fix bpf_cookie and find_vma in nested VM
selftests/bpf: Test global bpf_list_head arrays.
selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
bpf: limit the number of levels of a nested struct type.
bpf: look into the types of the fields of a struct type recursively.
...
====================
Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.11
The first "new features" pull request for v6.11 with changes both in
stack and in drivers. Nothing out of ordinary, except that we have
two conflicts this time:
net/mac80211/cfg.c
https://lore.kernel.org/all/20240531124415.05b25e7a@canb.auug.org.au
drivers/net/wireless/microchip/wilc1000/netdev.c
https://lore.kernel.org/all/20240603110023.23572803@canb.auug.org.au
Major changes:
cfg80211/mac80211
* parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers
wilc1000
* read MAC address during probe to make it visible to user space
iwlwifi
* bump FW API to 91 for BZ/SC devices
* report 64-bit radiotap timestamp
* enable P2P low latency by default
* handle Transmit Power Envelope (TPE) advertised by AP
* start using guard()
rtlwifi
* RTL8192DU support
ath12k
* remove unsupported tx monitor handling
* channel 2 in 6 GHz band support
* Spatial Multiplexing Power Save (SMPS) in 6 GHz band support
* multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA)
support
* dynamic VLAN support
* add panic handler for resetting the firmware state
ath10k
* add qcom,no-msa-ready-indicator Device Tree property
* LED support for various chipsets
* tag 'wireless-next-2024-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (194 commits)
wifi: ath12k: add hw_link_id in ath12k_pdev
wifi: ath12k: add panic handler
wifi: rtw89: chan: Use swap() in rtw89_swap_sub_entity()
wifi: brcm80211: remove unused structs
wifi: brcm80211: use sizeof(*pointer) instead of sizeof(type)
wifi: ath12k: do not process consecutive RDDM event
dt-bindings: net: wireless: ath11k: Drop "qcom,ipq8074-wcss-pil" from example
wifi: ath12k: fix memory leak in ath12k_dp_rx_peer_frag_setup()
wifi: rtlwifi: handle return value of usb init TX/RX
wifi: rtlwifi: Enable the new rtl8192du driver
wifi: rtlwifi: Add rtl8192du/sw.c
wifi: rtlwifi: Constify rtl_hal_cfg.{ops,usb_interface_cfg} and rtl_priv.cfg
wifi: rtlwifi: Add rtl8192du/dm.{c,h}
wifi: rtlwifi: Add rtl8192du/fw.{c,h} and rtl8192du/led.{c,h}
wifi: rtlwifi: Add rtl8192du/rf.{c,h}
wifi: rtlwifi: Add rtl8192du/trx.{c,h}
wifi: rtlwifi: Add rtl8192du/phy.{c,h}
wifi: rtlwifi: Add rtl8192du/hw.{c,h}
wifi: rtlwifi: Add new members to struct rtl_priv for RTL8192DU
wifi: rtlwifi: Add rtl8192du/table.{c,h}
...
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Link: https://lore.kernel.org/r/20240607093517.41394C2BBFC@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We use the polling interface to srcu for tracking pending frees; when
shutting down we don't need to wait for an srcu barrier to free them,
but SRCU still gets confused if we shutdown with an outstanding grace
period.
Reported-by: syzbot+6a038377f0a594d7d44e@syzkaller.appspotmail.com
Reported-by: syzbot+0ece6edfd05ed20e32d9@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Turn more asserts into proper recoverable error paths.
Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The bucket_gens array and gc_buckets array known their own size; we
should be using those members, and returning an error.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc2-ktest-00018-gebd1d148b278 #144 Not tainted
------------------------------------------------------
fio/1345 is trying to acquire lock:
ffff88813e200ab8 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_truncate+0x76/0xf0
but task is already holding lock:
ffff888105a1fa38 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: do_truncate+0x7b/0xc0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}:
down_write+0x3d/0xd0
bch2_write_iter+0x1c0/0x10f0
vfs_write+0x24a/0x560
__x64_sys_pwrite64+0x77/0xb0
x64_sys_call+0x17e5/0x1ab0
do_syscall_64+0x68/0x130
entry_SYSCALL_64_after_hwframe+0x4b/0x53
-> #1 (sb_writers#10){.+.+}-{0:0}:
mnt_want_write+0x4a/0x1d0
filename_create+0x69/0x1a0
user_path_create+0x38/0x50
bch2_fs_file_ioctl+0x315/0xbf0
__x64_sys_ioctl+0x297/0xaf0
x64_sys_call+0x10cb/0x1ab0
do_syscall_64+0x68/0x130
entry_SYSCALL_64_after_hwframe+0x4b/0x53
-> #0 (&c->snapshot_create_lock){++++}-{3:3}:
__lock_acquire+0x1445/0x25b0
lock_acquire+0xbd/0x2b0
down_read+0x40/0x180
bch2_truncate+0x76/0xf0
bchfs_truncate+0x240/0x3f0
bch2_setattr+0x7b/0xb0
notify_change+0x322/0x4b0
do_truncate+0x8b/0xc0
do_ftruncate+0x110/0x270
__x64_sys_ftruncate+0x43/0x80
x64_sys_call+0x1373/0x1ab0
do_syscall_64+0x68/0x130
entry_SYSCALL_64_after_hwframe+0x4b/0x53
other info that might help us debug this:
Chain exists of:
&c->snapshot_create_lock --> sb_writers#10 --> &sb->s_type->i_mutex_key#13
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sb->s_type->i_mutex_key#13);
lock(sb_writers#10);
lock(&sb->s_type->i_mutex_key#13);
rlock(&c->snapshot_create_lock);
*** DEADLOCK ***
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
fsck_err() does a goto fsck_err on error; factor out check_fix_ptr() so
that our error label can drop our device ref.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We count objects as freed when we move them to the srcu-pending lists
because we're doing the equivalent of a kfree_srcu(); the only
difference is managing the pending list ourself means we can allocate
from the pending list.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|