Age | Commit message (Collapse) | Author |
|
Pull cifs client fixes from Steve French:
"Seven cifs/smb3 client fixes, all also for stable:
- four DFS fixes
- multichannel reconnect fix
- fix smb1 stats for cancel command
- fix for set file size error path"
* tag '6.3-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: use DFS root session instead of tcon ses
cifs: return DFS root session id in DebugData
cifs: fix use-after-free bug in refresh_cache_worker()
cifs: set DFS root session in cifs_get_smb_ses()
cifs: generate signkey for the channel that's reconnecting
cifs: Fix smb2_set_path_size()
cifs: Move the in_send statistic to __smb_send_rqst()
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM64:
- Address a rather annoying bug w.r.t. guest timer offsetting. The
synchronization of timer offsets between vCPUs was broken, leading
to inconsistent timer reads within the VM.
x86:
- New tests for the slow path of the EVTCHNOP_send Xen hypercall
- Add missing nVMX consistency checks for CR0 and CR4
- Fix bug that broke AMD GATag on 512 vCPU machines
Selftests:
- Skip hugetlb tests if huge pages are not available
- Sync KVM exit reasons"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: selftests: Sync KVM exit reasons in selftests
KVM: selftests: Add macro to generate KVM exit reason strings
KVM: selftests: Print expected and actual exit reason in KVM exit reason assert
KVM: selftests: Make vCPU exit reason test assertion common
KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
KVM: selftests: Use enum for test numbers in xen_shinfo_test
KVM: selftests: Add helpers to make Xen-style VMCALL/VMMCALL hypercalls
KVM: selftests: Move the guts of kvm_hypercall() to a separate macro
KVM: SVM: WARN if GATag generation drops VM or vCPU ID information
KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
selftests: KVM: skip hugetlb tests if huge pages are not available
KVM: VMX: Use tabs instead of spaces for indentation
KVM: VMX: Fix indentation coding style issue
KVM: nVMX: remove unnecessary #ifdef
KVM: nVMX: add missing consistency checks for CR0 and CR4
KVM: arm64: timers: Convert per-vcpu virtual offset to a global value
|
|
Xuan Zhuo says:
====================
virtio_net: fix two bugs related to XDP
This patch set fixes two bugs related to XDP.
These two patch is not associated.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
build_skb_from_xdp_buff() may return NULL, in this case
we need to free the frags of xdp shinfo.
Fixes: fab89bafa95b ("virtio-net: support multi-buffer xdp")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Because headroom is not passed to page_to_skb(), this causes the shinfo
exceeds the range. Then the frags of shinfo are changed by other process.
[ 157.724634] stack segment: 0000 [#1] PREEMPT SMP NOPTI
[ 157.725358] CPU: 3 PID: 679 Comm: xdp_pass_user_f Tainted: G E 6.2.0+ #150
[ 157.726401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/4
[ 157.727820] RIP: 0010:skb_release_data+0x11b/0x180
[ 157.728449] Code: 44 24 02 48 83 c3 01 39 d8 7e be 48 89 d8 48 c1 e0 04 41 80 7d 7e 00 49 8b 6c 04 30 79 0c 48 89 ef e8 89 b
[ 157.730751] RSP: 0018:ffffc90000178b48 EFLAGS: 00010202
[ 157.731383] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
[ 157.732270] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff888100dd0b00
[ 157.733117] RBP: 5d5d76010f6e2408 R08: ffff888100dd0b2c R09: 0000000000000000
[ 157.734013] R10: ffffffff82effd30 R11: 000000000000a14e R12: ffff88810981ffc0
[ 157.734904] R13: ffff888100dd0b00 R14: 0000000000000002 R15: 0000000000002310
[ 157.735793] FS: 00007f06121d9740(0000) GS:ffff88842fcc0000(0000) knlGS:0000000000000000
[ 157.736794] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 157.737522] CR2: 00007ffd9a56c084 CR3: 0000000104bda001 CR4: 0000000000770ee0
[ 157.738420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 157.739283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 157.740146] PKRU: 55555554
[ 157.740502] Call Trace:
[ 157.740843] <IRQ>
[ 157.741117] kfree_skb_reason+0x50/0x120
[ 157.741613] __udp4_lib_rcv+0x52b/0x5e0
[ 157.742132] ip_protocol_deliver_rcu+0xaf/0x190
[ 157.742715] ip_local_deliver_finish+0x77/0xa0
[ 157.743280] ip_sublist_rcv_finish+0x80/0x90
[ 157.743834] ip_list_rcv_finish.constprop.0+0x16f/0x190
[ 157.744493] ip_list_rcv+0x126/0x140
[ 157.744952] __netif_receive_skb_list_core+0x29b/0x2c0
[ 157.745602] __netif_receive_skb_list+0xed/0x160
[ 157.746190] ? udp4_gro_receive+0x275/0x350
[ 157.746732] netif_receive_skb_list_internal+0xf2/0x1b0
[ 157.747398] napi_gro_receive+0xd1/0x210
[ 157.747911] virtnet_receive+0x75/0x1c0
[ 157.748422] virtnet_poll+0x48/0x1b0
[ 157.748878] __napi_poll+0x29/0x1b0
[ 157.749330] net_rx_action+0x27a/0x340
[ 157.749812] __do_softirq+0xf3/0x2fb
[ 157.750298] do_softirq+0xa2/0xd0
[ 157.750745] </IRQ>
[ 157.751563] <TASK>
[ 157.752329] __local_bh_enable_ip+0x6d/0x80
[ 157.753178] virtnet_xdp_set+0x482/0x860
[ 157.754159] ? __pfx_virtnet_xdp+0x10/0x10
[ 157.755129] dev_xdp_install+0xa4/0xe0
[ 157.756033] dev_xdp_attach+0x20b/0x5e0
[ 157.756933] do_setlink+0x82e/0xc90
[ 157.757777] ? __nla_validate_parse+0x12b/0x1e0
[ 157.758744] rtnl_setlink+0xd8/0x170
[ 157.759549] ? mod_objcg_state+0xcb/0x320
[ 157.760328] ? security_capable+0x37/0x60
[ 157.761209] ? security_capable+0x37/0x60
[ 157.762072] rtnetlink_rcv_msg+0x145/0x3d0
[ 157.762929] ? ___slab_alloc+0x327/0x610
[ 157.763754] ? __alloc_skb+0x141/0x170
[ 157.764533] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 157.765422] netlink_rcv_skb+0x58/0x110
[ 157.766229] netlink_unicast+0x21f/0x330
[ 157.766951] netlink_sendmsg+0x240/0x4a0
[ 157.767654] sock_sendmsg+0x93/0xa0
[ 157.768434] ? sockfd_lookup_light+0x12/0x70
[ 157.769245] __sys_sendto+0xfe/0x170
[ 157.770079] ? handle_mm_fault+0xe9/0x2d0
[ 157.770859] ? preempt_count_add+0x51/0xa0
[ 157.771645] ? up_read+0x3c/0x80
[ 157.772340] ? do_user_addr_fault+0x1e9/0x710
[ 157.773166] ? kvm_read_and_reset_apf_flags+0x49/0x60
[ 157.774087] __x64_sys_sendto+0x29/0x30
[ 157.774856] do_syscall_64+0x3c/0x90
[ 157.775518] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 157.776382] RIP: 0033:0x7f06122def70
Fixes: 18117a842ab0 ("virtio-net: remove xdp related info from page_to_skb()")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties.
Convert reading boolean properties to of_property_read_bool().
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are no users of nfcmrvl platform_data struct outside of the
driver and none will be added, so move it into the driver.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties.
Convert reading boolean properties to of_property_read_bool().
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can
Acked-by: Kalle Valo <kvalo@kernel.org>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vladimir Oltean says:
====================
Fix MTU reporting for Marvell DSA switches where we can't change it
As explained in patch 2, the driver doesn't know how to change the MTU
on MV88E6165, MV88E6191, MV88E6220, MV88E6250 and MV88E6290, and there
is a regression where it actually reports an MTU value below the
Ethernet standard (1500).
Fixing that shows another issue where DSA is unprepared to be told that
a switch supports an MTU of only 1500, and still errors out. That is
addressed by patch 1.
Testing was not done on "real" hardware, but on a different Marvell DSA
switch, with code modified such that the driver doesn't know how to
change the MTU on that, either.
A key assumption is that these switches don't need any MTU configuration
to pass full MTU-sized, DSA-tagged packets, which seems like a
reasonable assumption to make. My 6390 and 6190 switches, with
.port_set_jumbo_size commented out, certainly don't seem to have any
problem passing MTU-sized traffic, as can be seen in this iperf3 session
captured with tcpdump on the DSA master:
$MAC > $MAC, Marvell DSA mode Forward, dev 2, port 8, untagged, VID 1000,
FPri 0, ethertype IPv4 (0x0800), length 1518:
10.0.0.69.49590 > 10.0.0.1.5201: Flags [.], seq 81088:82536,
ack 1, win 502, options [nop,nop,TS val 2221498829 ecr 3012859850],
length 1448
I don't want to go all the way and say that the adjustment made by
commit b9c587fed61c ("dsa: mv88e6xxx: Include tagger overhead when
setting MTU for DSA and CPU ports") is completely unnecessary, just that
there's an equally good chance that the switches with unknown MTU
configuration procedure "just work".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are 3 classes of switch families that the driver is aware of, as
far as mv88e6xxx_change_mtu() is concerned:
- MTU configuration is available per port. Here, the
chip->info->ops->port_set_jumbo_size() method will be present.
- MTU configuration is global to the switch. Here, the
chip->info->ops->set_max_frame_size() method will be present.
- We don't know how to change the MTU. Here, none of the above methods
will be present.
Switch families MV88E6165, MV88E6191, MV88E6220, MV88E6250 and MV88E6290
fall in category 3.
The blamed commit has adjusted the MTU for all 3 categories by EDSA_HLEN
(8 bytes), resulting in a new maximum MTU of 1492 being reported by the
driver for these switches.
I don't have the hardware to test, but I do have a MV88E6390 switch on
which I can simulate this by commenting out its .port_set_jumbo_size
definition from mv88e6390_ops. The result is this set of messages at
probe time:
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 1
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 2
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 3
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 4
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 5
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 6
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 7
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 8
It is highly implausible that there exist Ethernet switches which don't
support the standard MTU of 1500 octets, and this is what the DSA
framework says as well - the error comes from dsa_slave_create() ->
dsa_slave_change_mtu(slave_dev, ETH_DATA_LEN).
But the error messages are alarming, and it would be good to suppress
them.
As a consequence of this unlikeliness, we reimplement mv88e6xxx_get_max_mtu()
and mv88e6xxx_change_mtu() on switches from the 3rd category as follows:
the maximum supported MTU is 1500, and any request to set the MTU to a
value larger than that fails in dev_validate_mtu().
Fixes: b9c587fed61c ("dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, when dsa_slave_change_mtu() is called on a user port where
dev->max_mtu is 1500 (as returned by ds->ops->port_max_mtu()), the code
will stumble upon this check:
if (new_master_mtu > mtu_limit)
return -ERANGE;
because new_master_mtu is adjusted for the tagger overhead but mtu_limit
is not.
But it would be good if the logic went through, for example if the DSA
master really depends on an MTU adjustment to accept DSA-tagged frames.
To make the code pass through the check, we need to adjust mtu_limit for
the overhead as well, if the minimum restriction was caused by the DSA
user port's MTU (dev->max_mtu). A DSA user port MTU and a DSA master MTU
are always offset by the protocol overhead.
Currently no drivers return 1500 .port_max_mtu(), but this is only
temporary and a bug in itself - mv88e6xxx should have done that, but
since commit b9c587fed61c ("dsa: mv88e6xxx: Include tagger overhead when
setting MTU for DSA and CPU ports") it no longer does. This is a
preparation for fixing that.
Fixes: bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ice_qp_dis() intends to stop a given queue pair that is a target of xsk
pool attach/detach. One of the steps is to disable interrupts on these
queues. It currently is broken in a way that txq irq is turned off
*after* HW flush which in turn takes no effect.
ice_qp_dis():
-> ice_qvec_dis_irq()
--> disable rxq irq
--> flush hw
-> ice_vsi_stop_tx_ring()
-->disable txq irq
Below splat can be triggered by following steps:
- start xdpsock WITHOUT loading xdp prog
- run xdp_rxq_info with XDP_TX action on this interface
- start traffic
- terminate xdpsock
[ 256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018
[ 256.319560] #PF: supervisor read access in kernel mode
[ 256.324775] #PF: error_code(0x0000) - not-present page
[ 256.329994] PGD 0 P4D 0
[ 256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G OE 6.2.0-rc5+ #51
[ 256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[ 256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice]
[ 256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44
[ 256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206
[ 256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f
[ 256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80
[ 256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000
[ 256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000
[ 256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600
[ 256.421990] FS: 0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000
[ 256.430207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0
[ 256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 256.457770] PKRU: 55555554
[ 256.460529] Call Trace:
[ 256.463015] <TASK>
[ 256.465157] ? ice_xmit_zc+0x6e/0x150 [ice]
[ 256.469437] ice_napi_poll+0x46d/0x680 [ice]
[ 256.473815] ? _raw_spin_unlock_irqrestore+0x1b/0x40
[ 256.478863] __napi_poll+0x29/0x160
[ 256.482409] net_rx_action+0x136/0x260
[ 256.486222] __do_softirq+0xe8/0x2e5
[ 256.489853] ? smpboot_thread_fn+0x2c/0x270
[ 256.494108] run_ksoftirqd+0x2a/0x50
[ 256.497747] smpboot_thread_fn+0x1c1/0x270
[ 256.501907] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 256.506594] kthread+0xea/0x120
[ 256.509785] ? __pfx_kthread+0x10/0x10
[ 256.513597] ret_from_fork+0x29/0x50
[ 256.517238] </TASK>
In fact, irqs were not disabled and napi managed to be scheduled and run
while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers
was already freed.
To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also
while at it, remove redundant ice_clean_rx_ring() call - this is handled
in ice_qp_clean_rings().
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GPY2xx devices need 3 seconds to fully switch out of loopback mode
before it can safely re-enter loopback mode. Implement timeout mechanism
to guarantee 3 seconds waited before re-enter loopback mode.
Signed-off-by: Xu Liang <lxu@maxlinear.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
This adds SOCK_STREAM and SOCK_SEQPACKET tests for invalid buffer case.
It tries to read data to NULL buffer (data already presents in socket's
queue), then uses valid buffer. For SOCK_STREAM second read must return
data, because skbuff is not dropped, but for SOCK_SEQPACKET skbuff will
be dropped by kernel, and 'recv()' will return EAGAIN.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This returns behaviour of SOCK_STREAM read as before skbuff usage. When
copying to user fails current skbuff won't be dropped, but returned to
sockets's queue. Technically instead of 'skb_dequeue()', 'skb_peek()' is
called and when skbuff becomes empty, it is removed from queue by
'__skb_unlink()'.
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we now no longer use 'skb->len' to update credit, there is no sense
to update skbuff state, because it is used only once after dequeue to
copy data and then will be released.
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'skb->len' can vary when we partially read the data, this complicates the
calculation of credit to be updated in 'virtio_transport_inc_rx_pkt()/
virtio_transport_dec_rx_pkt()'.
Also in 'virtio_transport_dec_rx_pkt()' we were miscalculating the
credit since 'skb->len' was redundant.
For these reasons, let's replace the use of skbuff state to calculate new
'rx_bytes'/'fwd_cnt' values with explicit value as input argument. This
makes code more simple, because it is not needed to change skbuff state
before each call to update 'rx_bytes'/'fwd_cnt'.
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before commit bdc1ddad3e5f ("compat_ioctl: block: move
blkdev_compat_ioctl() into ioctl.c"), the config BLOCK_COMPAT was used to
include compat_ioctl.c into the kernel build. With this commit, the code
is moved into ioctl.c and included with the config COMPAT. So, since then,
the config BLOCK_COMPAT has no effect and any further purpose.
Remove this obsolete config BLOCK_COMPAT.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230316111630.4897-1-lukas.bulwahn@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
| BUG: Bad page state in process kworker/u8:0 pfn:5c001
| page:00000000bfda61c8 refcount:0 mapcount:0 mapping:0000000000000000 index:0x20001 pfn:0x5c001
| head:0000000011409842 order:9 entire_mapcount:0 nr_pages_mapped:0 pincount:1
| anon flags: 0x3fffc00000b0004(uptodate|head|mappedtodisk|swapbacked|node=0|zone=0|lastcpupid=0xffff)
| raw: 03fffc0000000000 fffffc0000700001 ffffffff00700903 0000000100000000
| raw: 0000000000000200 0000000000000000 00000000ffffffff 0000000000000000
| head: 03fffc00000b0004 dead000000000100 dead000000000122 ffff00000a809dc1
| head: 0000000000020000 0000000000000000 00000000ffffffff 0000000000000000
| page dumped because: nonzero pincount
| CPU: 3 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc2-00001-gc6811bf0cd87 #1
| Hardware name: linux,dummy-virt (DT)
| Workqueue: events_unbound io_ring_exit_work
| Call trace:
| dump_backtrace+0x13c/0x208
| show_stack+0x34/0x58
| dump_stack_lvl+0x150/0x1a8
| dump_stack+0x20/0x30
| bad_page+0xec/0x238
| free_tail_pages_check+0x280/0x350
| free_pcp_prepare+0x60c/0x830
| free_unref_page+0x50/0x498
| free_compound_page+0xcc/0x100
| free_transhuge_page+0x1f0/0x2b8
| destroy_large_folio+0x80/0xc8
| __folio_put+0xc4/0xf8
| gup_put_folio+0xd0/0x250
| unpin_user_page+0xcc/0x128
| io_buffer_unmap+0xec/0x2c0
| __io_sqe_buffers_unregister+0xa4/0x1e0
| io_ring_exit_work+0x68c/0x1188
| process_one_work+0x91c/0x1a58
| worker_thread+0x48c/0xe30
| kthread+0x278/0x2f0
| ret_from_fork+0x10/0x20
Mark reports an issue with the recent patches coalescing compound pages
while registering them in io_uring. The reason is that we try to drop
excessive references with folio_put_refs(), but pages were acquired
with pin_user_pages(), which has extra accounting and so should be put
down with matching unpin_user_pages() or at least gup_put_folio().
As a fix unpin_user_pages() all but first page instead, and let's figure
out a better API after.
Fixes: 57bebf807e2abcf8 ("io_uring/rsrc: optimise registered huge pages")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/10efd5507d6d1f05ea0f3c601830e08767e189bd.1678980230.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
msg_ring requests transferring files support auto index selection via
IORING_FILE_INDEX_ALLOC, however they don't return the selected index
to the target ring and there is no other good way for the userspace to
know where is the receieved file.
Return the index for allocated slots and 0 otherwise, which is
consistent with other fixed file installing requests.
Cc: stable@vger.kernel.org # v6.0+
Fixes: e6130eba8a848 ("io_uring: add support for passing fixed file descriptors")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://github.com/axboe/liburing/issues/809
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.3
- avoid potential UAF in nvmet_req_complete (Damien Le Moal)
- more quirks (Elmer Miroslav Mosher Golovin, Philipp Geulen)
- fix a memory leak in the nvme-pci probe teardown path (Irvin Cote)
- repair the MAINTAINERS entry (Lukas Bulwahn)
- fix handling single range discard request (Ming Lei)
- show more opcode names in trace events (Minwoo Im)
- fix nvme-tcp timeout reporting (Sagi Grimberg)"
* tag 'nvme-6.3-2022-03-16' of git://git.infradead.org/nvme:
nvmet: avoid potential UAF in nvmet_req_complete()
nvme-trace: show more opcode names
nvme-tcp: add nvme-tcp pdu size build protection
nvme-tcp: fix opcode reporting in the timeout handler
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
nvme-pci: fixing memory leak in probe teardown path
nvme: fix handling single range discard request
MAINTAINERS: repair malformed T: entries in NVM EXPRESS DRIVERS
|
|
Pointer variables of void * type do not require type cast.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20230316083954.4223-1-yuzhe@nfschina.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
There is a spelling mistake in a pr_warn_ratelimited message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20230314082315.26532-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Louis Peens says:
====================
nfp: flower: add support for multi-zone conntrack
This series add changes to support offload of connection tracking across
multiple zones. Previously the driver only supported offloading of a
single goto_chain, spanning a single zone. This was implemented by
merging a pre_ct rule, post_ct rule and the nft rule. This series
provides updates to let the original post_ct rule act as the new pre_ct
rule for a next set of merges if it contains another goto and
conntrack action. In pseudo-tc rule format this adds support for:
ingress chain 0 proto ip flower
action ct zone 1 pipe action goto 1
ingress chain 1 proto ip flower ct_state +tr+new ct_zone 1
action ct_clear pipe action ct zone 2 pipe action goto 2
ingress chain 1 proto ip flower ct_state +tr+est ct_zone 1
action ct_clear pipe action ct zone 2 pipe action goto 2
ingress chain 2 proto ip flower ct_state +tr+new ct_zone 2
action mirred egress redirect dev ...
ingress chain 2 proto ip flower ct_state +tr+est ct_zone 2
action mirred egress redirect dev ...
This can continue for up to a maximum of 4 zone recirculations.
The first few patches are some smaller preparation patches while the
last one introduces the functionality.
====================
Link: https://lore.kernel.org/r/20230314063610.10544-1-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If goto_chain action present in the post ct flow rule, merge flow rules
in this ct-zone, create a new pre_ct entry as the pre ct flow rule of
next ct-zone, but do not offload merged flow rules to firmware. Repeat
the process in the next ct-zone until no goto_chain action present in
the post ct flow rule in a certain ct-zone, merged all the flow rules.
Offload to firmware finally.
Signed-off-by: Wentao Jia <wentao.jia@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The fixed number of offload flow rule is only supported scenario of one
ct zone, in the scenario of multiple ct zones, dynamic number and more
number of offload flow rules are required. In order to support scenario
of multiple ct zones, parameter num_rules is added for to offload flow
rules
Signed-off-by: Wentao Jia <wentao.jia@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The chain_index has different means in pre ct entry and post ct entry.
In pre ct entry, it means chain index, but in post ct entry, it means
goto chain index, it is confused.
chain_index and goto_chain_index may be present in one flow rule, It
cannot be distinguished by one field chain_index, both chain_index
and goto_chain_index are required in the follow-up patch to support
multiple ct zones
Another field goto_chain_index is added to record the goto chain index.
If no goto action in post ct entry, goto_chain_index is 0.
Signed-off-by: Wentao Jia <wentao.jia@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'ct_clear' action only or no ct action is supported for 'post_ct_flow'.
But in scenario of multiple ct zones, one non 'ct_clear' ct action or
more ct actions, including 'ct_clear action', may be present in one flow
rule. If ct state match key is 'ct_established', the flow rule is still
expected to be classified as 'post_ct_flow'. Check ct status first in
function "is_post_ct_flow" to achieve this.
Signed-off-by: Wentao Jia <wentao.jia@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the scenario of multiple ct zones, ct state key match and ct action
is present in one flow rule, the flow rule is classified to post_ct_flow
in design.
There is no ct state key match for pre ct flow, the judging condition
is added to function "is_pre_ct_flow".
Chain_index is another field for judging which flows are pre ct flow
If chain_index not 0, the flow is not pre ct flow.
Signed-off-by: Wentao Jia <wentao.jia@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
CT action is a special case different from other actions, CT clear action
is not required when get ct action, but this case is not considered.
If CT clear action in the flow rule, skip the CT clear action when get ct
action, return the first ct action that is not a CT clear action
Signed-off-by: Wentao Jia <wentao.jia@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Saeed Mahameed says:
====================
mlx5-updates-2023-03-13
1) Trivial cleanup patches
2) By Sandipan Patra: Implement thermal zone to report NIC temperature
3) Adham Faris, Improves devlink health diagnostics for netdev objects
4) From Maor, Enable TC offload for egress and engress MACVLAN over bond
5) From Gal, add devlink hairpin queues parameters to replace debugfs
as was discussed in [1]:
[1] https://lore.kernel.org/all/20230111194608.7f15b9a1@kernel.org/
====================
Link: https://lore.kernel.org/all/20230314054234.267365-1-saeed@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support offloading of TC rules that mirror/redirect egress traffic to a
MACVLAN device, which is attached to bond device which master mlx5 devices.
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-16-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support offloading of TC rules that filter ingress traffic from a MACVLAN
device, which is attached to bond device.
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-15-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for next patch which will add new check
if device block can be setup, extract all existing checks
to function to make it more readable and maintainable.
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-14-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Print the number of hairpin queues and size as part of the hairpin table
dump.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-13-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We refer to a TC NIC rule that involves forwarding as "hairpin".
Hairpin queues are mlx5 hardware specific implementation for hardware
forwarding of such packets.
Per the discussion in [1], move the hairpin queues control (number and
size) from debugfs to devlink.
Expose two devlink params:
- hairpin_num_queues: control the number of hairpin queues
- hairpin_queue_size: control the size (in packets) of the hairpin queues
[1] https://lore.kernel.org/all/20230111194608.7f15b9a1@kernel.org/
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-12-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Downstream patches require devlink params to access the PTYS register,
move the needed functions from mlx5e to the core layer.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-11-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently RQ health diagnostics doesn't inform the user whether an RQ
is an XSK RQ or not.
Address this, by adding XSK state flag to RQ SW state enum in core/en.h.
XSK will be '1' if current RQ is an XSK RQ, and it will be '0' if it's
not.
In this example below, it can be seen that XSK field value is '1' since
xdpsock program have been attached to channel 0 before issuing the
devlink query command:
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
Output:
=======================================================================
Common config:
RQ:
type: 2 stride size: 4096 size: 16 ts_format: FRC
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4236 HW state: 1 WQE counter: 15 posted WQEs: 15 cc: 15
SW State:
enabled: 1 recovering: 0 am: 1 no_csum_complete: 1 csum_full: 0 mini_cqe_hw_stridx: 1 shampo: 0 mini_cqe_enhanced: 0 xsk: 1
CQ:
cqn: 1085 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 5 size: 2048
ICOSQ:
sqn: 4229 HW state: 1 cc: 158 pc: 158 WQE size: 2048
CQ:
cqn: 1080 cc: 1 size: 2048
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-10-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add SQ SW state textual representation to devlink health diagnostics
for tx reporter.
SQ SW state can be retrieved by issuing the devlink command below:
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter tx
Output
=======================================================================
Common Config:
SQ:
stride size: 64 size: 1024 ts_format: FRC
CQ:
stride size: 64 size: 1024
SQs:
channel ix: 0 tc: 0 txq ix: 0 sqn: 4170 HW state: 1 stopped: false cc: 0 pc: 0
SW State:
enabled: 1 mpwqe: 1 recovering: 0 ipsec: 0 am: 1 vlan_need_l2_inline: 1 pending_xsk_tx: 0 pending_tls_rx_resync: 0 xdp_multibuf: 0
CQ:
cqn: 1031 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-9-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
One of the parameters that is retrieved/printed as a response to
devlink health diagnostics for rx reporter is the RQ SW state.
It's printed as a bitmap decimal number. Printing it as bitmap is
problematic and non informative.
In addition User can't count on SW state without accessing the kernel
sources (mlx5e rq state enum in en.h).
This patch prints RQ SW state in a textual representation, as a key:
value pairs, where disabled rq states will appear as '0' and enabled
ones will appear as '1'.
See below the generated output for rx health diagnostics devlink
command:
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
Before:
=======================================================================
Common config:
RQ:
type: 2 stride size: 2048 size: 8 ts_format: FRC
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4172 HW state: 1 SW state: 37 WQE counter: 7 posted WQEs: 7 cc: 7
CQ:
cqn: 1033 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048
ICOSQ:
sqn: 4169 HW state: 1 cc: 74 pc: 74 WQE size: 128
CQ:
cqn: 1030 cc: 1 size: 128
channel ix: 1 ...
.
.
After:
=======================================================================
Common config:
RQ:
type: 2 stride size: 2048 size: 8 ts_format: FRC
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4172 HW state: 1 WQE counter: 7 posted WQEs: 7 cc: 7
SW State:
enabled: 1 recovering: 0 am: 1 no_csum_complete: 0 csum_full: 0 mini_cqe_hw_stridx: 1 shampo: 0 mini_cqe_enhanced: 0
CQ:
cqn: 1033 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048
ICOSQ:
sqn: 4169 HW state: 1 cc: 74 pc: 74 WQE size: 128
CQ:
cqn: 1030 cc: 1 size: 128
channel: ix: 1 ...
.
.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-8-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Dynamic interrupt moderation RQ and SQ feature represented by
MLX5E_RQ_STATE_AM and MLX5E_SQ_STATE_AM enums respectively, is not
consistent with the feature naming in the driver, and with the formal
feature and library names.
Hence, change MLX5E_RQ_STATE_AM and MLX5E_SQ_STATE_AM enum type names in
core/en.h to MLX5E_RQ_STATE_DIM and MLX5E_SQ_STATE_DIM respectively.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-7-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previous check was comparing against the fifo mask. The mask is size of the
fifo (power of two) minus one, so a less than or equal comparator should be
used for checking if the fifo has room for the SKB.
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-6-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Implement thermal zone support for mlx5 based HW. The NIC
uses temperature sensor provided by ASIC to report current temperature
to thermal core.
Signed-off-by: Sandipan Patra <spatra@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add comment to mlx5_devlink_params_register() functions so it is clear
that only driver init params should be registered here.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If driver teardown is called while PCI is turned off, there is a race
between health recovery and teardown. If health recovery already started
it will wait 60 sec trying to see if PCI gets back and it can recover,
but actually there is no need to wait anymore once teardown was called.
Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break
waiting for PCI up.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When shutdown or remove callbacks are called the driver sets the flag
MLX5_BREAK_FW_WAIT, to stop waiting for FW as teardown was called. There
is no need to clear the bit as once shutdown or remove were called as
there is no way back, the driver is going down. Furthermore, if not
cleared the flag can be used also in other loops where we may wait while
teardown was already called.
Use test_bit() instead of test_and_clear_bit() as there is no need to
clear the flag.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230314054234.267365-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the blamed commit, phy_ethtool_get_wol() and phy_ethtool_set_wol()
acquire phydev->lock, but the mscc phy driver implementations,
vsc85xx_wol_get() and vsc85xx_wol_set(), acquire the same lock as well,
resulting in a deadlock.
$ ip link set swp3 down
============================================
WARNING: possible recursive locking detected
mscc_felix 0000:00:00.5 swp3: Link is Down
--------------------------------------------
ip/375 is trying to acquire lock:
ffff3d7e82e987a8 (&dev->lock){+.+.}-{4:4}, at: vsc85xx_wol_get+0x2c/0xf4
but task is already holding lock:
ffff3d7e82e987a8 (&dev->lock){+.+.}-{4:4}, at: phy_ethtool_get_wol+0x3c/0x6c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&dev->lock);
lock(&dev->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by ip/375:
#0: ffffd43b2a955788 (rtnl_mutex){+.+.}-{4:4}, at: rtnetlink_rcv_msg+0x144/0x58c
#1: ffff3d7e82e987a8 (&dev->lock){+.+.}-{4:4}, at: phy_ethtool_get_wol+0x3c/0x6c
Call trace:
__mutex_lock+0x98/0x454
mutex_lock_nested+0x2c/0x38
vsc85xx_wol_get+0x2c/0xf4
phy_ethtool_get_wol+0x50/0x6c
phy_suspend+0x84/0xcc
phy_state_machine+0x1b8/0x27c
phy_stop+0x70/0x154
phylink_stop+0x34/0xc0
dsa_port_disable_rt+0x2c/0xa4
dsa_slave_close+0x38/0xec
__dev_close_many+0xc8/0x16c
__dev_change_flags+0xdc/0x218
dev_change_flags+0x24/0x6c
do_setlink+0x234/0xea4
__rtnl_newlink+0x46c/0x878
rtnl_newlink+0x50/0x7c
rtnetlink_rcv_msg+0x16c/0x58c
Removing the mutex_lock(&phydev->lock) calls from the driver restores
the functionality.
Fixes: 2f987d486610 ("net: phy: Add locks to ethtool functions")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230314153025.2372970-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'temp' was used before commit c0c99d0cd107 ("net: phy: micrel: remove
the use of .ack_interrupt()") refactored the code. Now, we can simplify
it a little.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230314124928.44948-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 899a3cbbf77a ("net: phy: remove states PHY_STARTING and
PHY_PENDING") missed to update a comment in phy_probe. Remove
superfluous "Description:" prefix while we are here.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/20230314124856.44878-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|