Age | Commit message (Collapse) | Author |
|
net is not used in ip6_route_info_append() after commit 36f19d5b4f99
("net/ipv6: Remove extra call to ip6_convert_metrics for multipath case").
Let's remove the argument.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20250418000443.43734-10-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ip6_route_info_create_nh() will be called under RCU.
It calls fib_nh_common_init() and allocates nhc->nhc_pcpu_rth_output.
As with the reason for rt->fib6_nh->rt6i_pcpu, we want to avoid
GFP_ATOMIC allocation for nhc->nhc_pcpu_rth_output under RCU.
Let's preallocate it in ip6_route_info_create().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250418000443.43734-9-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ip6_route_info_create_nh() will be called under RCU.
Then, fib6_nh_init() is also under RCU, but per-cpu memory allocation
is very likely to fail with GFP_ATOMIC while bulk-adding IPv6 routes
and we would see a bunch of this message in dmesg.
percpu: allocation failed, size=8 align=8 atomic=1, atomic alloc failed, no space left
percpu: allocation failed, size=8 align=8 atomic=1, atomic alloc failed, no space left
Let's preallocate rt->fib6_nh->rt6i_pcpu in ip6_route_info_create().
If something fails before the original memory allocation in
fib6_nh_init(), ip6_route_info_create_nh() calls fib6_info_release(),
which releases the preallocated per-cpu memory.
Note that rt->fib6_nh->rt6i_pcpu is not preallocated when called via
ipv6_stub, so we still need alloc_percpu_gfp() in fib6_nh_init().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250418000443.43734-8-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT and rely
on RCU to guarantee dev and nexthop lifetime.
Then, we want to allocate as much as possible before entering
the RCU section.
The RCU section will start in the middle of ip6_route_info_create(),
and this is problematic for ip6_route_multipath_add() that calls
ip6_route_info_create() multiple times.
Let's split ip6_route_info_create() into two parts; one for memory
allocation and another for nexthop setup.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20250418000443.43734-7-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT.
Then, we must perform two lookups for nexthop and dev under RCU
to guarantee their lifetime.
ip6_route_info_create() calls nexthop_find_by_id() first if
RTA_NH_ID is specified, and then allocates struct fib6_info.
nexthop_find_by_id() must be called under RCU, but we do not want
to use GFP_ATOMIC for memory allocation here, which will be likely
to fail in ip6_route_multipath_add().
Let's move nexthop_find_by_id() after the memory allocation so
that we can later split ip6_route_info_create() into two parts:
the sleepable part and the RCU part.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20250418000443.43734-6-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In ip6_route_multipath_add(), we call rt6_qualify_for_ecmp() for each
entry. If it returns false, the request fails.
rt6_qualify_for_ecmp() returns false if either of the conditions below
is true:
1. f6i->fib6_flags has RTF_ADDRCONF
2. f6i->nh is not NULL
3. f6i->fib6_nh->fib_nh_gw_family is AF_UNSPEC
1 is unnecessary because rtm_to_fib6_config() never sets RTF_ADDRCONF
to cfg->fc_flags.
2. is equivalent with cfg->fc_nh_id.
3. can be replaced by checking RTF_GATEWAY in the base and each multipath
entry because AF_INET6 is set to f6i->fib6_nh->fib_nh_gw_family only when
cfg.fc_is_fdb is true or RTF_GATEWAY is set, but the former is always
false.
These checks do not require RCU and can be done earlier.
Let's perform the equivalent checks in rtm_to_fib6_multipath_config().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250418000443.43734-5-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ip6_route_info_create() is called from 3 functions:
* ip6_route_add()
* ip6_route_multipath_add()
* addrconf_f6i_alloc()
addrconf_f6i_alloc() does not need validation for struct fib6_config in
ip6_route_info_create().
ip6_route_multipath_add() calls ip6_route_info_create() for multiple
routes with slightly different fib6_config instances, which is copied
from the base config passed from userspace. So, we need not validate
the same config repeatedly.
Let's move such validation into rtm_to_fib6_config().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20250418000443.43734-4-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Basically, removing an IPv6 route does not require RTNL because
the IPv6 routing tables are protected by per table lock.
inet6_rtm_delroute() calls nexthop_find_by_id() to check if the
nexthop specified by RTA_NH_ID exists. nexthop uses rbtree and
the top-down walk can be safely performed under RCU.
ip6_route_del() already relies on RCU and the table lock, but we
need to extend the RCU critical section a bit more to cover
__ip6_del_rt(). For example, nexthop_for_each_fib6_nh() and
inet6_rt_notify() needs RCU.
Let's call nexthop_find_by_id() and __ip6_del_rt() under RCU and
get rid of RTNL from inet6_rtm_delroute() and SIOCDELRT.
Even if the nexthop is removed after rcu_read_unlock() in
inet6_rtm_delroute(), __remove_nexthop_fib() cleans up the routes
tied to the nexthop, and ip6_route_del() returns -ESRCH. So the
request was at least valid as of nexthop_find_by_id(), and it's just
a matter of timing.
Note that we need to pass false to lwtunnel_valid_encap_type_attr().
The following patches also use the newroute bool.
Note also that fib6_get_table() does not require RCU because once
allocated fib6_table is not freed until netns dismantle. I will
post a follow-up series to convert such callers to RCU-lockless
version. [0]
Link: https://lore.kernel.org/netdev/20250417174557.65721-1-kuniyu@amazon.com/ #[0]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250418000443.43734-3-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We will perform RTM_NEWROUTE and RTM_DELROUTE under RCU, and then
we want to perform some validation out of the RCU scope.
When creating / removing an IPv6 route with RTA_MULTIPATH,
inet6_rtm_newroute() / inet6_rtm_delroute() validates RTA_GATEWAY
in each multipath entry.
Let's do that in rtm_to_fib6_config().
Note that now RTM_DELROUTE returns an error for RTA_MULTIPATH with
0 entries, which was accepted but should result in -EINVAL as
RTM_NEWROUTE.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250418000443.43734-2-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Shannon Nelson says:
====================
pds_core: updates and fixes
This patchset has fixes for issues seen in recent internal testing
of error conditions and stress handling.
Note that the first patch in this series is a leftover from an
earlier patchset that was abandoned:
Link: https://lore.kernel.org/netdev/20250129004337.36898-2-shannon.nelson@amd.com/
====================
Link: https://patch.msgid.link/20250421174606.3892-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make the wait_context a full part of the q_info struct rather
than a stack variable that goes away after pdsc_adminq_post()
is done so that the context is still available after the wait
loop has given up.
There was a case where a slow development firmware caused
the adminq request to time out, but then later the FW finally
finished the request and sent the interrupt. The handler tried
to complete_all() the completion context that had been created
on the stack in pdsc_adminq_post() but no longer existed.
This caused bad pointer usage, kernel crashes, and much wailing
and gnashing of teeth.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250421174606.3892-5-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the pds_core driver was first created there were some race
conditions around using the adminq, especially for client drivers.
To reduce the possibility of a race condition there's a check
against pf->state in pds_client_adminq_cmd(). This is problematic
for a couple of reasons:
1. The PDSC_S_INITING_DRIVER bit is set during probe, but not
cleared until after everything in probe is complete, which
includes creating the auxiliary devices. For pds_fwctl this
means it can't make any adminq commands until after pds_core's
probe is complete even though the adminq is fully up by the
time pds_fwctl's auxiliary device is created.
2. The race conditions around using the adminq have been fixed
and this path is already protected against client drivers
calling pds_client_adminq_cmd() if the adminq isn't ready,
i.e. see pdsc_adminq_post() -> pdsc_adminq_inc_if_up().
Fix this by removing the pf->state check in pds_client_adminq_cmd()
because invalid accesses to pds_core's adminq is already handled by
pdsc_adminq_post()->pdsc_adminq_inc_if_up().
Fixes: 10659034c622 ("pds_core: add the aux client API")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250421174606.3892-4-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the FW doesn't support the PDS_CORE_CMD_FW_CONTROL command
the driver might at the least print garbage and at the worst
crash when the user runs the "devlink dev info" devlink command.
This happens because the stack variable fw_list is not 0
initialized which results in fw_list.num_fw_slots being a
garbage value from the stack. Then the driver tries to access
fw_list.fw_names[i] with i >= ARRAY_SIZE and runs off the end
of the array.
Fix this by initializing the fw_list and by not failing
completely if the devcmd fails because other useful information
is printed via devlink dev info even if the devcmd fails.
Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250421174606.3892-3-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The pds_core's adminq is protected by the adminq_lock, which prevents
more than 1 command to be posted onto it at any one time. This makes it
so the client drivers cannot simultaneously post adminq commands.
However, the completions happen in a different context, which means
multiple adminq commands can be posted sequentially and all waiting
on completion.
On the FW side, the backing adminq request queue is only 16 entries
long and the retry mechanism and/or overflow/stuck prevention is
lacking. This can cause the adminq to get stuck, so commands are no
longer processed and completions are no longer sent by the FW.
As an initial fix, prevent more than 16 outstanding adminq commands so
there's no way to cause the adminq from getting stuck. This works
because the backing adminq request queue will never have more than 16
pending adminq commands, so it will never overflow. This is done by
reducing the adminq depth to 16.
Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250421174606.3892-2-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Mark Bloch says:
====================
net/mlx5: HWS, Improve IP version handling
This small series hardens our checks against a single matcher containing
rules that match on IPv4 and IPv6. This scenario is not supported by
hardware steering and the implementation now signals this instead of
failing silently.
Patches:
* Patch 1 forbids a single definer to match on mixed IP versions for
source and destination address.
* Patch 2 reproduces a couple of firmware checks: it forbids creating
a definer that matches on IP address without matching on IP version,
and also disallows matching on IPv6 addresses and the IPv4 IHL fields
in the same definer.
* Patch 3 forbids mixing rules that match on IPv4 and IPv6 addresses in
the same matcher. The underlying definer mechanism does not support
that.
====================
Link: https://patch.msgid.link/20250422092540.182091-1-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signal clearly to the user, via an error, that mixing IPv4 and IPv6
rules in the same matcher is not supported. Previously such cases
silently failed by adding a rule that did not work correctly.
Rules can specify an IP version by one of two fields: IP version or
ethertype. At matcher creation, store whether the template matches on
any of these two fields. If yes, inspect each rule for its corresponding
match value and store the IP version inside the matcher to guard against
inconsistencies with subsequent rules.
Furthermore, also check rules for internal consistency, i.e. verify that
the ethertype and IP version match values do not contradict each other.
The logic applies to inner and outer headers independently, to account
for tunneling.
Rules that do not match on IP addresses are not affected.
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250422092540.182091-4-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replicate some sanity checks that firmware does, since hardware steering
does not go through firmware.
When creating a definer, disallow matching on IP addresses without also
matching on IP version. The latter can be satisfied by matching either
on the version field in the IP header, or on the ethertype field.
Also refuse to match IPv4 IHL alongside IPv6.
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250422092540.182091-3-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Unify the check for IP version when creating a definer. A given matcher
is deemed to match on IPv6 if any of the higher order (>31) bits of
source or destination address mask are set.
A single packet cannot mix IP versions between source and destination
addresses, so it makes no sense that they would be decided on
independently.
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250422092540.182091-2-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MT7531 standalone and MMIO variants found in MT7988 and EN7581 share
most basic properties. Despite that, assisted_learning_on_cpu_port and
mtu_enforcement_ingress were only applied for MT7531 but not for MT7988
or EN7581, causing the expected issues on MMIO devices.
Apply both settings equally also for MT7988 and EN7581 by moving both
assignments form mt7531_setup() to mt7531_setup_common().
This fixes unwanted flooding of packets due to unknown unicast
during DA lookup, as well as issues with heterogenous MTU settings.
Fixes: 7f54cc9772ce ("net: dsa: mt7530: split-off common parts from mt7531_setup")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com>
Link: https://patch.msgid.link/89ed7ec6d4fa0395ac53ad2809742bb1ce61ed12.1745290867.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current API, otx2_xdp_sq_append_pkt, verifies the number of available
descriptors before sending packets to the hardware.
However, for AF_XDP, this check is unnecessary because the batch value
is already determined based on the free descriptors.
This patch introduces a new API, "otx2_xsk_sq_append_pkt" to address this.
Remove the logic for releasing the TX buffers, as it is implicitly handled
by xsk_tx_peek_release_desc_batch
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250420032350.4047706-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
igc: Add support for Frame Preemption
Faizal Rahim says:
Introduce support for the FPE feature in the IGC driver.
The patches aligns with the upstream FPE API:
https://patchwork.kernel.org/project/netdevbpf/cover/20230220122343.1156614-1-vladimir.oltean@nxp.com/
https://patchwork.kernel.org/project/netdevbpf/cover/20230119122705.73054-1-vladimir.oltean@nxp.com/
It builds upon earlier work:
https://patchwork.kernel.org/project/netdevbpf/cover/20220520011538.1098888-1-vinicius.gomes@intel.com/
The patch series adds the following functionalities to the IGC driver:
a) Configure FPE using `ethtool --set-mm`.
b) Display FPE settings via `ethtool --show-mm`.
c) View FPE statistics using `ethtool --include-statistics --show-mm'.
e) Block setting preemptible tc in taprio since it is not supported yet.
Existing code already blocks it in mqprio.
Tested:
Enabled CONFIG_PROVE_LOCKING, CONFIG_DEBUG_ATOMIC_SLEEP, CONFIG_DMA_API_DEBUG, and CONFIG_KASAN
1) selftests
2) netdev down/up cycles
3) suspend/resume cycles
4) fpe verification
No bugs or unusual dmesg logs were observed.
Ran 1), 2) and 3) with and without the patch series, compared dmesg and selftest logs - no differences found.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
igc: add support to get frame preemption statistics via ethtool
igc: add support to get MAC Merge data via ethtool
igc: block setting preemptible traffic class in taprio
igc: add support to set tx-min-frag-size
igc: add support for frame preemption verification
igc: set the RX packet buffer size for TSN mode
igc: use FIELD_PREP and GENMASK for existing RX packet buffer size
igc: optimize TX packet buffer utilization for TSN mode
igc: use FIELD_PREP and GENMASK for existing TX packet buffer size
igc: rename I225_RXPBSIZE_DEFAULT and I225_TXPBSIZE_DEFAULT
igc: rename xdp_get_tx_ring() for non-xdp usage
net: ethtool: mm: reset verification status when link is down
net: ethtool: mm: extract stmmac verification logic into common library
net: stmmac: move frag_size handling out of spin_lock
====================
Link: https://patch.msgid.link/20250418163822.3519810-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cong Wang says:
====================
net_sched: Fix UAF vulnerability in HFSC qdisc
This patchset contains two bug fixes and a selftest for the first one
which we have a reliable reproducer, please check each patch
description for details.
====================
Link: https://patch.msgid.link/20250417184732.943057-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a selftest to exercise the condition where qdisc implementations
like netem or codel might empty the queue during a peek operation.
This tests the defensive code path in HFSC that checks the queue length
again after peeking to handle this case.
Based on the reproducer from Gerrard, improved by Jamal.
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20250417184732.943057-4-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Similarly to the previous patch, we need to safe guard hfsc_dequeue()
too. But for this one, we don't have a reliable reproducer.
Fixes: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 ("Linux-2.6.12-rc2")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20250417184732.943057-3-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch fixes a Use-After-Free vulnerability in the HFSC qdisc class
handling. The issue occurs due to a time-of-check/time-of-use condition
in hfsc_change_class() when working with certain child qdiscs like netem
or codel.
The vulnerability works as follows:
1. hfsc_change_class() checks if a class has packets (q.qlen != 0)
2. It then calls qdisc_peek_len(), which for certain qdiscs (e.g.,
codel, netem) might drop packets and empty the queue
3. The code continues assuming the queue is still non-empty, adding
the class to vttree
4. This breaks HFSC scheduler assumptions that only non-empty classes
are in vttree
5. Later, when the class is destroyed, this can lead to a Use-After-Free
The fix adds a second queue length check after qdisc_peek_len() to verify
the queue wasn't emptied.
Fixes: 21f4d5cc25ec ("net_sched/hfsc: fix curve activation in hfsc_change_class()")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20250417184732.943057-2-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Lorenzo Bianconi says:
====================
Enable multiple IRQ lines support in airoha_eth driver
EN7581 ethernet SoC supports 4 programmable IRQ lines each one composed
by 4 IRQ configuration registers to map Tx/Rx queues. Enable multiple
IRQ lines support.
====================
Link: https://patch.msgid.link/20250418-airoha-eth-multi-irq-v1-0-1ab0083ca3c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
EN7581 ethernet SoC supports 4 programmable IRQ lines for Tx and Rx
interrupts. Enable multiple IRQ lines support. Map Rx/Tx queues to the
available IRQ lines using the default scheme used in the vendor SDK:
- IRQ0: rx queues [0-4],[7-9],15
- IRQ1: rx queues [21-30]
- IRQ2: rx queues 5
- IRQ3: rx queues 6
Tx queues interrupts are managed by IRQ0.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250418-airoha-eth-multi-irq-v1-2-1ab0083ca3c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
EN7581 ethernet SoC supports 4 programmable IRQ lines each one composed
by 4 IRQ configuration registers. Add airoha_irq_bank struct as a
container for independent IRQ lines info (e.g. IRQ number, enabled source
interrupts, ecc). This is a preliminary patch to support multiple IRQ lines
in airoha_eth driver.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250418-airoha-eth-multi-irq-v1-1-1ab0083ca3c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Heiner Kallweit says:
====================
r8169: merge chip versions
After 2b065c098c37 ("r8169: refactor chip version detection") we can
merge handling of few chip versions.
====================
Link: https://patch.msgid.link/5e1e14ea-d60f-4608-88eb-3104b6bbace8@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Handling of both chip versions is the same, only difference is
the firmware. So we can merge handling of both chip versions.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/ae866b71-c904-434e-befb-848c831e33ff@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Handling of both chip versions is the same, only difference is
the firmware. So we can merge handling of both chip versions.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/0baad123-c679-4154-923f-fdc12783e900@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Handling of both chip versions is the same, only difference is
the firmware. So we can merge handling of both chip versions.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/97d7ae79-d021-4b6b-b424-89e5e305b029@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All callers of these functions depend on PHYLIB or select it directly
or indirectly by selecting PHYLINK. Stubs make sense for optional
functionality, but that's not the case here.
MDIO_XGENE usually is selected by NET_XGENE which also selects PHYLIB.
Add a dependency to PHYLIB nevertheless, in order not to break
randconfig builds.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/f7a69a1f-60e9-4ac0-8b7c-481e0cc850e7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
mptcp: pm: Defer freeing userspace pm entries
Here are two unrelated fixes for MPTCP:
- Patch 1: free userspace PM entry with RCU helpers. A fix for v6.14.
- Patch 2: avoid a warning when running diag.sh selftest. A fix for
v6.15-rc1.
====================
Link: https://patch.msgid.link/20250421-net-mptcp-pm-defer-freeing-v1-0-e731dc6e86b9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When running diag.sh in a loop, chk_dump_one will report the following
"grep: write error":
13 ....chk 2 cestab [ OK ]
grep: write error
14 ....chk dump_one [ OK ]
15 ....chk 2->0 msk in use after flush [ OK ]
16 ....chk 2->0 cestab after flush [ OK ]
This error is caused by a broken pipe. When the output of 'ss' is processed
by grep, 'head -n 1' will exit immediately after getting the first line,
causing the subsequent pipe to close. At this time, if 'grep' is still
trying to write data to the closed pipe, it will trigger a SIGPIPE signal,
causing a write error.
One solution is not to use this problematic "head -n 1" command, but to use
mptcp_lib_get_info_value() helper defined in mptcp_lib.sh to get the value
of 'token'.
Fixes: ba2400166570 ("selftests: mptcp: add a test for mptcp_diag_dump_one")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Tested-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250421-net-mptcp-pm-defer-freeing-v1-2-e731dc6e86b9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When path manager entries are deleted from the local address list, they
are first unlinked from the address list using list_del_rcu(). The
entries must not be freed until after the RCU grace period, but the
existing code immediately frees the entry.
Use kfree_rcu_mightsleep() and adjust sk_omem_alloc in open code instead
of using the sock_kfree_s() helper. This code path is only called in a
netlink handler, so the "might sleep" function is preferable to adding
a rarely-used rcu_head member to struct mptcp_pm_addr_entry.
Fixes: 88d097316371 ("mptcp: drop free_list for deleting entries")
Cc: stable@vger.kernel.org
Signed-off-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250421-net-mptcp-pm-defer-freeing-v1-1-e731dc6e86b9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
CONFIG_NET_NS=n.
kernel test robot reported the splat below. [0]
Before commit fed176bf3143 ("net: Add ops_undo_single for module
load/unload."), if CONFIG_NET_NS=n, ops was linked to pernet_list
only when init_net had not been initialised, and ops was unlinked
from pernet_list only under the same condition.
Let's say an ops is loaded before the init_net setup but unloaded
after that. Then, the ops remains in pernet_list, which seems odd.
The cited commit added ops_undo_single(), which calls list_add() for
ops to link it to a temporary list, so a minor change was added to
__register_pernet_operations() and __unregister_pernet_operations()
under CONFIG_NET_NS=n to avoid the pernet_list corruption.
However, the corruption must have been left as is.
When CONFIG_NET_NS=n, pernet_list was used to keep ops registered
before the init_net setup, and after that, pernet_list was not used
at all.
This was because some ops annotated with __net_initdata are cleared
out of memory at some point during boot.
Then, such ops is initialised by POISON_FREE_INITMEM (0xcc), resulting
in that ops->list.{next,prev} suddenly switches from a valid pointer
to a weird value, 0xcccccccccccccccc.
To avoid such wild memory access, let's allow the pernet_list
corruption for CONFIG_NET_NS=n.
[0]:
Oops: general protection fault, probably for non-canonical address 0xf999959999999999: 0000 [#1] SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xccccccccccccccc8-0xcccccccccccccccf]
CPU: 2 UID: 0 PID: 346 Comm: modprobe Not tainted 6.15.0-rc1-00294-ga4cba7e98e35 #85 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__list_add_valid_or_report (lib/list_debug.c:32)
Code: 48 c1 ea 03 80 3c 02 00 0f 85 5a 01 00 00 49 39 74 24 08 0f 85 83 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 1f 01 00 00 4c 39 26 0f 85 ab 00 00 00 4c 39 ee
RSP: 0018:ff11000135b87830 EFLAGS: 00010a07
RAX: dffffc0000000000 RBX: ffffffffc02223c0 RCX: ffffffff8406fcc2
RDX: 1999999999999999 RSI: cccccccccccccccc RDI: ffffffffc02223c0
RBP: ffffffff86064e40 R08: 0000000000000001 R09: fffffbfff0a9f5b5
R10: ffffffff854fadaf R11: 676552203a54454e R12: ffffffff86064e40
R13: ffffffffc02223c0 R14: ffffffff86064e48 R15: 0000000000000021
FS: 00007f6fb0d9e1c0(0000) GS:ff11000858ea0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6fb0eda580 CR3: 0000000122fec005 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
register_pernet_operations (./include/linux/list.h:150 (discriminator 5) ./include/linux/list.h:183 (discriminator 5) net/core/net_namespace.c:1315 (discriminator 5) net/core/net_namespace.c:1359 (discriminator 5))
register_pernet_subsys (net/core/net_namespace.c:1401)
inet6_init (net/ipv6/af_inet6.c:535) ipv6
do_one_initcall (init/main.c:1257)
do_init_module (kernel/module/main.c:2942)
load_module (kernel/module/main.c:3409)
init_module_from_file (kernel/module/main.c:3599)
idempotent_init_module (kernel/module/main.c:3611)
__x64_sys_finit_module (./include/linux/file.h:62 ./include/linux/file.h:83 kernel/module/main.c:3634 kernel/module/main.c:3621 kernel/module/main.c:3621)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f6fb0df7e5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007fffdc6a8968 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000055b535721b70 RCX: 00007f6fb0df7e5d
RDX: 0000000000000000 RSI: 000055b51e44aa2a RDI: 0000000000000004
RBP: 0000000000040000 R08: 0000000000000000 R09: 000055b535721b30
R10: 0000000000000004 R11: 0000000000000246 R12: 000055b51e44aa2a
R13: 000055b535721bf0 R14: 000055b5357220b0 R15: 0000000000000000
</TASK>
Modules linked in: ipv6(+) crc_ccitt
Fixes: fed176bf3143 ("net: Add ops_undo_single for module load/unload.")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202504181511.1c3f23e4-lkp@intel.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250418215025.87871-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
netlink: specs: rtnetlink: adjust specs for C codegen
The first patch brings a schema extension allowing specifying
"header" (as in .h file) properties in attribute sets.
This is used for rare cases where we carry attributes from
another family in a nest - we need to include the extra
headers. If we were to generate kernel code we'd also
need to skip it in the uAPI output.
The remaining 11 patches are pretty boring schema adjustments.
====================
Link: https://patch.msgid.link/20250418021706.1967583-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add properties needed for C codegen to match names with uAPI headers.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-13-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The notify property should point at the object the notifications
carry, usually the get object, not the cmd which triggers
the notification:
notify:
description: Name of the command sharing the reply type with
this notification.
Not treating this as a fix, I think that only C codegen cares.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-12-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The consistency check complains replies to do and dump don't match
because dump has no value. It doesn't have to by the schema... but
fixing this in code gen would be more code than adjusting the spec.
This is rare.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add properties needed for C codegen to match names with uAPI headers.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a notification entry for netlink so that we can test ntf handling
in classic netlink and C.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ns-ip6-target is an indexed-array. Codegen for variable size binary
array would be a bit tedious, tell C that we know the size of these
attributes, since they are IPv6 addrs.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The AF nest is indexed by AF ID, so it's a bit strange,
but with minor adjustments C codegen deals with it just fine.
Entirely unclear why the names have been in quotes here.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add properties needed for C codegen to match names with uAPI headers.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
group is listed twice for newlink.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
if-netnsid an alias to target-netnsid:
IFLA_TARGET_NETNSID = IFLA_IF_NETNSID, /* new alias */
We don't have a definition for this attr.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The purpose of the attribute list is to list the attributes
which will be included in a given message to shrink the objects
for families with huge attr spaces. Fixed headers are always
present in their entirety (between netlink header and the attrs)
so there's no point in listing their members. Current C codegen
doesn't expect them and tries to look them up in the attribute space.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
rt-link has a number of disjoint headers, plus it uses attributes
of other families (e.g. DPLL). Allow declaring a attribute set
as "foreign" by specifying which header its definition is coming
from.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250418021706.1967583-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|