summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-19Merge tag 'trace-v6.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "While working on the ring buffer, I found one more bug with the timestamp code, and the fix for this removed the need for the final 64-bit cmpxchg! The ring buffer events hold a "delta" from the previous event. If it is determined that the delta can not be calculated, it falls back to adding an absolute timestamp value. The way to know if the delta can be used is via two stored timestamps in the per-cpu buffer meta data: before_stamp and write_stamp The before_stamp is written by every event before it tries to allocate its space on the ring buffer. The write_stamp is written after it allocates its space and knows that nothing came in after it read the previous before_stamp and write_stamp and the two matched. A previous fix dd9394257078 ("ring-buffer: Do not try to put back write_stamp") removed putting back the write_stamp to match the before_stamp so that the next event could use the delta, but races were found where the two would match, but not be for of the previous event. It was determined to allow the event reservation to not have a valid write_stamp when it is finished, and this fixed a lot of races. The last use of the 64-bit timestamp cmpxchg depended on the write_stamp being valid after an interruption. But this is no longer the case, as if an event is interrupted by a softirq that writes an event, and that event gets interrupted by a hardirq or NMI and that writes an event, then the softirq could finish its reservation without a valid write_stamp. In the slow path of the event reservation, a delta can still be used if the write_stamp is valid. Instead of using a cmpxchg against the write stamp, the before_stamp needs to be read again to validate the write_stamp. The cmpxchg is not needed. This updates the slowpath to validate the write_stamp by comparing it to the before_stamp and removes all rb_time_cmpxchg() as there are no more users of that function. The removal of the 32-bit updates of rb_time_t will be done in the next merge window" * tag 'trace-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Fix slowpath of interrupted event
2023-12-19drm/amd/display: dereference variable before checking for zeroJosip Pavic
[Why] Driver incorrectly checks if pointer variable OutBpp is null instead of if the value being pointed to is zero. [How] Dereference OutBpp before checking for a value of zero. Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Josip Pavic <josip.pavic@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19drm/amd/display: get dprefclk ss info from integration info tableCharlene Liu
[why & how] we have two SSC_En: we get ssc_info from dce_info for MPLL_SSC_EN. we used to call VBIOS cmdtbl's smu_info's SS persentage for DPRECLK SS info, is used for DP AUDIO and VBIOS' smu_info table was from systemIntegrationInfoTable. since dcn35 VBIOS removed smu_info, driver need to use integrationInfotable directly. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Charlene Liu <charlene.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19drm/amd/display: Add case for dcn35 to support usb4 dmub hpd eventWayne Lin
[Why & how] Refactor dc_is_dmub_outbox_supported() a bit and add case for dcn35 to register dmub outbox notification irq to handle usb4 relevant hpd event. Reviewed-by: Roman Li <roman.li@amd.com> Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19drm/amd/display: disable FPO and SubVP for older DMUB versions on DCN32xHamza Mahfooz
There have recently been changes that break backwards compatibility, that were introduced into DMUB firmware (for DCN32x) concerning FPO and SubVP. So, since those are just power optimization features, we can just disable them unless the user is using a new enough version of DMUB firmware. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2870 Fixes: ed6e2782e974 ("drm/amd/display: For cursor P-State allow for SubVP") Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Closes: https://lore.kernel.org/r/CABXGCsNRb0QbF2pKLJMDhVOKxyGD6-E+8p-4QO6FOWa6zp22_A@mail.gmail.com/ Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19drm/amdkfd: svm range always mapped flag not working on APUPhilip Yang
On gfx943 APU there is no VRAM and page migration, queue CWSR area, svm range with always mapped flag, is not mapped to GPU correctly. This works fine if retry fault on CWSR area can be recovered, but could cause deadlock if there is another retry fault recover waiting for CWSR to finish. Fix this by mapping svm range with always mapped flag to GPU with ACCESS attribute if XNACK ON. There is side effect, because all GPUs have ACCESS attribute by default on new svm range with XNACK on, the CWSR area will be mapped to all GPUs after this change. This side effect will be fixed with Thunk change to set CWSR svm range with ACCESS_IN_PLACE attribute on the GPU that user queue is created. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19Merge tag 'arc-6.7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - build error for hugetlb, sparse and smatch fixes - removal of VIPT aliasing cache code * tag 'arc-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: add hugetlb definitions ARC: fix smatch warning ARC: fix spare error ARC: mm: retire support for aliasing VIPT D$ ARC: entry: move ARCompact specific bits out of entry.h ARC: entry: SAVE_ABI_CALLEE_REG: ISA/ABI specific helper
2023-12-19drm/amd/display: Revert " drm/amd/display: Use channel_width = 2 for vram ↵Alvin Lee
table 3.0" [Description] Revert commit fec05adc40c2 ("drm/amd/display: Use channel_width = 2 for vram table 3.0") Because the issue is being fixed from VBIOS side. Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-19i2c: rk3x: fix potential spinlock recursion on pollJensen Huang
Possible deadlock scenario (on reboot): rk3x_i2c_xfer_common(polling) -> rk3x_i2c_wait_xfer_poll() -> rk3x_i2c_irq(0, i2c); --> spin_lock(&i2c->lock); ... <rk3x i2c interrupt> -> rk3x_i2c_irq(0, i2c); --> spin_lock(&i2c->lock); (deadlock here) Store the IRQ number and disable/enable it around the polling transfer. This patch has been tested on NanoPC-T4. Signed-off-by: Jensen Huang <jensenhuang@friendlyarm.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-12-19i2c: qcom-geni: fix missing clk_disable_unprepare() and geni_se_resources_off()Yang Yingliang
Add missing clk_disable_unprepare() and geni_se_resources_off() in the error path in geni_i2c_probe(). Fixes: 14d02fbadb5d ("i2c: qcom-geni: add desc struct to prepare support for I2C Master Hub variant") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-12-19cifs: do not let cifs_chan_update_iface deallocate channelsShyam Prasad N
cifs_chan_update_iface is meant to check and update the server interface used for a channel when the existing server interface is no longer available. So far, this handler had the code to remove an interface entry even if a new candidate interface is not available. Allowing this leads to several corner cases to handle. This change makes the logic much simpler by not deallocating the current channel interface entry if a new interface is not found to replace it with. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19cifs: fix a pending undercount of srv_countShyam Prasad N
The following commit reverted the changes to ref count the server struct while scheduling a reconnect work: 823342524868 Revert "cifs: reconnect work should have reference on server struct" However, a following change also introduced scheduling of reconnect work, and assumed ref counting. This change fixes that as well. Fixes umount problems like: [73496.157838] CPU: 5 PID: 1321389 Comm: umount Tainted: G W OE 6.7.0-060700rc6-generic #202312172332 [73496.157841] Hardware name: LENOVO 20MAS08500/20MAS08500, BIOS N2CET67W (1.50 ) 12/15/2022 [73496.157843] RIP: 0010:cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.157906] Code: 5d 31 c0 31 d2 31 f6 31 ff c3 cc cc cc cc e8 4a 6e 14 e6 e9 f6 fe ff ff be 03 00 00 00 48 89 d7 e8 78 26 b3 e5 e9 e4 fe ff ff <0f> 0b e9 b1 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 [73496.157908] RSP: 0018:ffffc90003bcbcb8 EFLAGS: 00010286 [73496.157911] RAX: 00000000ffffffff RBX: ffff8885830fa800 RCX: 0000000000000000 [73496.157913] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [73496.157915] RBP: ffffc90003bcbcc8 R08: 0000000000000000 R09: 0000000000000000 [73496.157917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [73496.157918] R13: ffff8887d56ba800 R14: 00000000ffffffff R15: ffff8885830fa800 [73496.157920] FS: 00007f1ff0e33800(0000) GS:ffff88887ba80000(0000) knlGS:0000000000000000 [73496.157922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [73496.157924] CR2: 0000115f002e2010 CR3: 00000003d1e24005 CR4: 00000000003706f0 [73496.157926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [73496.157928] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [73496.157929] Call Trace: [73496.157931] <TASK> [73496.157933] ? show_regs+0x6d/0x80 [73496.157936] ? __warn+0x89/0x160 [73496.157939] ? cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.157976] ? report_bug+0x17e/0x1b0 [73496.157980] ? handle_bug+0x51/0xa0 [73496.157983] ? exc_invalid_op+0x18/0x80 [73496.157985] ? asm_exc_invalid_op+0x1b/0x20 [73496.157989] ? cifs_put_tcp_session+0x17d/0x190 [cifs] [73496.158023] ? cifs_put_tcp_session+0x1e/0x190 [cifs] [73496.158057] __cifs_put_smb_ses+0x2b5/0x540 [cifs] [73496.158090] ? tconInfoFree+0xc2/0x120 [cifs] [73496.158130] cifs_put_tcon.part.0+0x108/0x2b0 [cifs] [73496.158173] cifs_put_tlink+0x49/0x90 [cifs] [73496.158220] cifs_umount+0x56/0xb0 [cifs] [73496.158258] cifs_kill_sb+0x52/0x60 [cifs] [73496.158306] deactivate_locked_super+0x32/0xc0 [73496.158309] deactivate_super+0x46/0x60 [73496.158311] cleanup_mnt+0xc3/0x170 [73496.158314] __cleanup_mnt+0x12/0x20 [73496.158330] task_work_run+0x5e/0xa0 [73496.158333] exit_to_user_mode_loop+0x105/0x130 [73496.158336] exit_to_user_mode_prepare+0xa5/0xb0 [73496.158338] syscall_exit_to_user_mode+0x29/0x60 [73496.158341] do_syscall_64+0x6c/0xf0 [73496.158344] ? syscall_exit_to_user_mode+0x37/0x60 [73496.158346] ? do_syscall_64+0x6c/0xf0 [73496.158349] ? exit_to_user_mode_prepare+0x30/0xb0 [73496.158353] ? syscall_exit_to_user_mode+0x37/0x60 [73496.158355] ? do_syscall_64+0x6c/0xf0 Reported-by: Robert Morris <rtm@csail.mit.edu> Fixes: 705fc522fe9d ("cifs: handle when server starts supporting multichannel") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19Revert "nvme-fc: fix race between error recovery and creating association"Keith Busch
The commit was identified to might sleep in invalid context and is blocking regression testing. This reverts commit ee6fdc5055e916b1dd497f11260d4901c4c1e55e. Link: https://lore.kernel.org/linux-nvme/hkhl56n665uvc6t5d6h3wtx7utkcorw4xlwi7d2t2bnonavhe6@xaan6pu43ap6/ Link: https://lists.infradead.org/pipermail/linux-nvme/2023-December/043756.html Reported-by: Daniel Wagner <dwagner@suse.de> Reported-by: Maurizio Lombardi <mlombard@redhat.com> Cc: Michael Liang <mliang@purestorage.com> Tested-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-12-19s390: update defconfigsHeiko Carstens
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-19fs: cifs: Fix atime update checkZizhi Wo
Commit 9b9c5bea0b96 ("cifs: do not return atime less than mtime") indicates that in cifs, if atime is less than mtime, some apps will break. Therefore, it introduce a function to compare this two variables in two places where atime is updated. If atime is less than mtime, update it to mtime. However, the patch was handled incorrectly, resulting in atime and mtime being exactly equal. A previous commit 69738cfdfa70 ("fs: cifs: Fix atime update check vs mtime") fixed one place and forgot to fix another. Fix it. Fixes: 9b9c5bea0b96 ("cifs: do not return atime less than mtime") Cc: stable@vger.kernel.org Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19smb: client: fix potential OOB in smb2_dump_detail()Paulo Alcantara
Validate SMB message with ->check_message() before calling ->calc_smb_size(). This fixes CVE-2023-6610. Reported-by: j51569436@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218219 Cc; stable@vger.kernel.org Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-19ASoC: fsl_sai: Fix channel swap issue on i.MX8MPShengjiu Wang
When flag mclk_with_tere and mclk_direction_output enabled, The SAI transmitter or receiver will be enabled in very early stage, that if FSL_SAI_xMR is set by previous case, for example previous case is one channel, current case is two channels, then current case started with wrong xMR in the beginning, then channel swap happen. The patch is to clear xMR in hw_free() to avoid such channel swap issue. Fixes: 3e4a82612998 ("ASoC: fsl_sai: MCLK bind with TX/RX enable bit") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Link: https://msgid.link/r/1702953057-4499-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-19ASoC: hdmi-codec: fix missing report for jack initial statusJerome Brunet
This fixes a problem introduced while fixing ELD reporting with no jack set. Most driver using the hdmi-codec will call the 'plugged_cb' callback directly when registered to report the initial state of the HDMI connector. With the commit mentionned, this occurs before jack is ready and the initial report is lost for platforms actually providing a jack for HDMI. Fix this by storing the hdmi connector status regardless of jack being set or not and report the last status when jack gets set. With this, the initial state is reported correctly even if it is disconnected. This was not done initially and is also a fix. Fixes: 15be353d55f9 ("ASoC: hdmi-codec: register hpd callback on component probe") Reported-by: Zhengqiao Xia <xiazhengqiao@huaqin.corp-partner.google.com> Closes: https://lore.kernel.org/alsa-devel/CADYyEwTNyY+fR9SgfDa-g6iiDwkU3MUdPVCYexs2_3wbcM8_vg@mail.gmail.com/ Cc: Hsin-Yi Wang <hsinyi@google.com> Tested-by: Zhengqiao Xia <xiazhengqiao@huaqin.corp-partner.google.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://msgid.link/r/20231218145655.134929-1-jbrunet@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-19Merge branch ↵Paolo Abeni
'check-vlan-filter-feature-in-vlan_vids_add_by_dev-and-vlan_vids_del_by_dev' Liu Jian says: ==================== check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() v2->v3: Filter using vlan_hw_filter_capable(). Add one basic test. ==================== Link: https://lore.kernel.org/r/20231216075219.2379123-1-liujian56@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19selftests: add vlan hw filter testsLiu Jian
Add one basic vlan hw filter test. Signed-off-by: Liu Jian <liujian56@huawei.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19net: check vlan filter feature in vlan_vids_add_by_dev() and ↵Liu Jian
vlan_vids_del_by_dev() I got the below warning trace: WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0 Call Trace: rtnl_dellink rtnetlink_rcv_msg netlink_rcv_skb netlink_unicast netlink_sendmsg __sock_sendmsg ____sys_sendmsg ___sys_sendmsg __sys_sendmsg do_syscall_64 entry_SYSCALL_64_after_hwframe It can be repoduced via: ip netns add ns1 ip netns exec ns1 ip link add bond0 type bond mode 0 ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2 ip netns exec ns1 ip link set bond_slave_1 master bond0 [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0 [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0 [4] ip netns exec ns1 ip link set bond_slave_1 nomaster [5] ip netns exec ns1 ip link del veth2 ip netns del ns1 This is all caused by command [1] turning off the rx-vlan-filter function of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands [2] [3] add the same vid to slave and master respectively, causing command [4] to empty slave->vlan_info. The following command [5] triggers this problem. To fix this problem, we should add VLAN_FILTER feature checks in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect addition or deletion of vlan_vid information. Fixes: 348a1443cc43 ("vlan: introduce functions to do mass addition/deletion of vids by another device") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19net: hns3: add new maintainer for the HNS3 ethernet driverJijie Shao
Jijie Shao will be responsible for maintaining the hns3 driver's code in the future, so add Jijie to the hns3 driver's matainer list. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20231216070413.233668-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19net: mana: select PAGE_POOLYury Norov
Mana uses PAGE_POOL API. x86_64 defconfig doesn't select it: ld: vmlinux.o: in function `mana_create_page_pool.isra.0': mana_en.c:(.text+0x9ae36f): undefined reference to `page_pool_create' ld: vmlinux.o: in function `mana_get_rxfrag': mana_en.c:(.text+0x9afed1): undefined reference to `page_pool_alloc_pages' make[3]: *** [/home/yury/work/linux/scripts/Makefile.vmlinux:37: vmlinux] Error 1 make[2]: *** [/home/yury/work/linux/Makefile:1154: vmlinux] Error 2 make[1]: *** [/home/yury/work/linux/Makefile:234: __sub-make] Error 2 make[1]: Leaving directory '/home/yury/work/build-linux-x86_64' make: *** [Makefile:234: __sub-make] Error 2 So we need to select it explicitly. Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter") Link: https://lore.kernel.org/r/20231215203353.635379-1-yury.norov@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19net: ks8851: Fix TX stall caused by TX buffer overrunRonald Wahl
There is a bug in the ks8851 Ethernet driver that more data is written to the hardware TX buffer than actually available. This is caused by wrong accounting of the free TX buffer space. The driver maintains a tx_space variable that represents the TX buffer space that is deemed to be free. The ks8851_start_xmit_spi() function adds an SKB to a queue if tx_space is large enough and reduces tx_space by the amount of buffer space it will later need in the TX buffer and then schedules a work item. If there is not enough space then the TX queue is stopped. The worker function ks8851_tx_work() dequeues all the SKBs and writes the data into the hardware TX buffer. The last packet will trigger an interrupt after it was send. Here it is assumed that all data fits into the TX buffer. In the interrupt routine (which runs asynchronously because it is a threaded interrupt) tx_space is updated with the current value from the hardware. Also the TX queue is woken up again. Now it could happen that after data was sent to the hardware and before handling the TX interrupt new data is queued in ks8851_start_xmit_spi() when the TX buffer space had still some space left. When the interrupt is actually handled tx_space is updated from the hardware but now we already have new SKBs queued that have not been written to the hardware TX buffer yet. Since tx_space has been overwritten by the value from the hardware the space is not accounted for. Now we have more data queued then buffer space available in the hardware and ks8851_tx_work() will potentially overrun the hardware TX buffer. In many cases it will still work because often the buffer is written out fast enough so that no overrun occurs but for example if the peer throttles us via flow control then an overrun may happen. This can be fixed in different ways. The most simple way would be to set tx_space to 0 before writing data to the hardware TX buffer preventing the queuing of more SKBs until the TX interrupt has been handled. I have chosen a slightly more efficient (and still rather simple) way and track the amount of data that is already queued and not yet written to the hardware. When new SKBs are to be queued the already queued amount of data is honoured when checking free TX buffer space. I tested this with a setup of two linked KS8851 running iperf3 between the two in bidirectional mode. Before the fix I got a stall after some minutes. With the fix I saw now issues anymore after hours. Fixes: 3ba81f3ece3c ("net: Micrel KS8851 SPI network driver") Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Ben Dooks <ben.dooks@codethink.co.uk> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231214181112.76052-1-rwahl@gmx.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19Revert "iio: hid-sensor-als: Add light color temperature support"Srinivas Pandruvada
This reverts commit 5f05285df691b1e82108eead7165feae238c95ef. This commit assumes that every HID descriptor for ALS sensor has presence of usage id ID HID_USAGE_SENSOR_LIGHT_COLOR_TEMPERATURE. When the above usage id is absent, driver probe fails. This breaks ALS sensor functionality on many platforms. Till we have a good solution, revert this commit. Reported-by: Thomas Weißschuh <thomas@t-8ch.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218223 Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: <stable@vger.kernel.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231217200703.719876-3-srinivas.pandruvada@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-19Revert "iio: hid-sensor-als: Add light chromaticity support"Srinivas Pandruvada
This reverts commit ee3710f39f9d0ae5137a866138d005fe1ad18132. This commit assumes that every HID descriptor for ALS sensor has presence of usage id ID HID_USAGE_SENSOR_LIGHT_CHROMATICITY_X and HID_USAGE_SENSOR_LIGHT_CHROMATICITY_Y. When the above usage ids are absent, driver probe fails. This breaks ALS sensor functionality on many platforms. Till we have a good solution, revert this commit. Reported-by: Thomas Weißschuh <thomas@t-8ch.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218223 Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: <stable@vger.kernel.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231217200703.719876-2-srinivas.pandruvada@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-18ring-buffer: Fix slowpath of interrupted eventSteven Rostedt (Google)
To synchronize the timestamps with the ring buffer reservation, there are two timestamps that are saved in the buffer meta data. 1. before_stamp 2. write_stamp When the two are equal, the write_stamp is considered valid, as in, it may be used to calculate the delta of the next event as the write_stamp is the timestamp of the previous reserved event on the buffer. This is done by the following: /*A*/ w = current position on the ring buffer before = before_stamp after = write_stamp ts = read current timestamp if (before != after) { write_stamp is not valid, force adding an absolute timestamp. } /*B*/ before_stamp = ts /*C*/ write = local_add_return(event length, position on ring buffer) if (w == write - event length) { /* Nothing interrupted between A and C */ /*E*/ write_stamp = ts; delta = ts - after /* * If nothing interrupted again, * before_stamp == write_stamp and write_stamp * can be used to calculate the delta for * events that come in after this one. */ } else { /* * The slow path! * Was interrupted between A and C. */ This is the place that there's a bug. We currently have: after = write_stamp ts = read current timestamp /*F*/ if (write == current position on the ring buffer && after < ts && cmpxchg(write_stamp, after, ts)) { delta = ts - after; } else { delta = 0; } The assumption is that if the current position on the ring buffer hasn't moved between C and F, then it also was not interrupted, and that the last event written has a timestamp that matches the write_stamp. That is the write_stamp is valid. But this may not be the case: If a task context event was interrupted by softirq between B and C. And the softirq wrote an event that got interrupted by a hard irq between C and E. and the hard irq wrote an event (does not need to be interrupted) We have: /*B*/ before_stamp = ts of normal context ---> interrupted by softirq /*B*/ before_stamp = ts of softirq context ---> interrupted by hardirq /*B*/ before_stamp = ts of hard irq context /*E*/ write_stamp = ts of hard irq context /* matches and write_stamp valid */ <---- /*E*/ write_stamp = ts of softirq context /* No longer matches before_stamp, write_stamp is not valid! */ <--- w != write - length, go to slow path // Right now the order of events in the ring buffer is: // // |-- softirq event --|-- hard irq event --|-- normal context event --| // after = write_stamp (this is the ts of softirq) ts = read current timestamp if (write == current position on the ring buffer [true] && after < ts [true] && cmpxchg(write_stamp, after, ts) [true]) { delta = ts - after [Wrong!] The delta is to be between the hard irq event and the normal context event, but the above logic made the delta between the softirq event and the normal context event, where the hard irq event is between the two. This will shift all the remaining event timestamps on the sub-buffer incorrectly. The write_stamp is only valid if it matches the before_stamp. The cmpxchg does nothing to help this. Instead, the following logic can be done to fix this: before = before_stamp ts = read current timestamp before_stamp = ts after = write_stamp if (write == current position on the ring buffer && after == before && after < ts) { delta = ts - after } else { delta = 0; } The above will only use the write_stamp if it still matches before_stamp and was tested to not have changed since C. As a bonus, with this logic we do not need any 64-bit cmpxchg() at all! This means the 32-bit rb_time_t workaround can finally be removed. But that's for a later time. Link: https://lore.kernel.org/linux-trace-kernel/20231218175229.58ec3daf@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231218230712.3a76b081@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Fixes: dd93942570789 ("ring-buffer: Do not try to put back write_stamp") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-12-18scsi: ufs: core: Let the sq_lock protect sq_tail_slot accessCan Guo
When accessing sq_tail_slot without protection from sq_lock, a race condition can cause multiple SQEs to be copied to duplicate SQE slots. This can lead to multiple stability issues. Fix this by moving the *dest initialization in ufshcd_send_command() back under protection from the sq_lock. Fixes: 3c85f087faec ("scsi: ufs: mcq: Use pointer arithmetic in ufshcd_send_command()") Signed-off-by: Can Guo <quic_cang@quicinc.com> Link: https://lore.kernel.org/r/1702913550-20631-1-git-send-email-quic_cang@quicinc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-18scsi: ufs: qcom: Return ufs_qcom_clk_scale_*() errors in ↵ChanWoo Lee
ufs_qcom_clk_scale_notify() In commit 031312dbc695 ("scsi: ufs: ufs-qcom: Remove unnecessary goto statements") the error handling was accidentally changed, resulting in the error of ufs_qcom_clk_scale_*() calls not being returned. This is the case I checked: ufs_qcom_clk_scale_notify -> 'ufs_qcom_clk_scale_up_/down_pre_change' error -> return 0; Make sure those errors are properly returned. Fixes: 031312dbc695 ("scsi: ufs: ufs-qcom: Remove unnecessary goto statements") Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20231215003812.29650-1-cw9316.lee@samsung.com Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-18scsi: core: Always send batch on reset or error handling commandAlexander Atanasov
In commit 8930a6c20791 ("scsi: core: add support for request batching") the block layer bd->last flag was mapped to SCMD_LAST and used as an indicator to send the batch for the drivers that implement this feature. However, the error handling code was not updated accordingly. scsi_send_eh_cmnd() is used to send error handling commands and request sense. The problem is that request sense comes as a single command that gets into the batch queue and times out. As a result the device goes offline after several failed resets. This was observed on virtio_scsi during a device resize operation. [ 496.316946] sd 0:0:4:0: [sdd] tag#117 scsi_eh_0: requesting sense [ 506.786356] sd 0:0:4:0: [sdd] tag#117 scsi_send_eh_cmnd timeleft: 0 [ 506.787981] sd 0:0:4:0: [sdd] tag#117 abort To fix this always set SCMD_LAST flag in scsi_send_eh_cmnd() and scsi_reset_ioctl(). Fixes: 8930a6c20791 ("scsi: core: add support for request batching") Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com> Link: https://lore.kernel.org/r/20231215121008.2881653-1-alexander.atanasov@virtuozzo.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-18scsi: bnx2fc: Fix skb double free in bnx2fc_rcv()Wei Yongjun
skb_share_check() already drops the reference to the skb when returning NULL. Using kfree_skb() in the error handling path leads to an skb double free. Fix this by removing the variable tmp_skb, and return directly when skb_share_check() returns NULL. Fixes: 01a4cc4d0cd6 ("bnx2fc: do not add shared skbs to the fcoe_rx_list") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20221114110626.526643-1-weiyongjun@huaweicloud.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-18Merge tag 'hid-for-linus-2023121901' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fix for division by zero in Nintendo driver when generic joycon is attached, reported and fixed by SteamOS folks (Guilherme G. Piccoli) - GCC-7 build fix (which is a good cleanup anyway) for Nintendo driver (Ryan McClelland) * tag 'hid-for-linus-2023121901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: nintendo: Prevent divide-by-zero on code HID: nintendo: fix initializer element is not constant error
2023-12-18SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806Chuck Lever
Guillaume says: > I believe commit 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from > node-local memory") in Linux 6.5+ is incorrect. It passes > unconditionally rq_pool->sp_id as the NUMA node. > > While the comment in the svc_pool declaration in sunrpc/svc.h says > that sp_id is also the NUMA node id, it might not be the case if > the svc is created using svc_create_pooled(). svc_created_pooled() > can use the per-cpu pool mode therefore in this case sp_id would > be the cpu id. Fix this by reverting now. At a later point this minor optimization, and the deceptive labeling of the sp_id field, can be revisited. Reported-by: Guillaume Morin <guillaume@morinfr.org> Closes: https://lore.kernel.org/linux-nfs/ZYC9rsno8qYggVt9@bender.morinfr.org/T/#u Fixes: 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from node-local memory") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-18HID: nintendo: Prevent divide-by-zero on codeGuilherme G. Piccoli
It was reported [0] that adding a generic joycon to the system caused a kernel crash on Steam Deck, with the below panic spew: divide error: 0000 [#1] PREEMPT SMP NOPTI [...] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0119 10/24/2023 RIP: 0010:nintendo_hid_event+0x340/0xcc1 [hid_nintendo] [...] Call Trace: [...] ? exc_divide_error+0x38/0x50 ? nintendo_hid_event+0x340/0xcc1 [hid_nintendo] ? asm_exc_divide_error+0x1a/0x20 ? nintendo_hid_event+0x307/0xcc1 [hid_nintendo] hid_input_report+0x143/0x160 hidp_session_run+0x1ce/0x700 [hidp] Since it's a divide-by-0 error, by tracking the code for potential denominator issues, we've spotted 2 places in which this could happen; so let's guard against the possibility and log in the kernel if the condition happens. This is specially useful since some data that fills some denominators are read from the joycon HW in some cases, increasing the potential for flaws. [0] https://github.com/ValveSoftware/SteamOS/issues/1070 Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Tested-by: Sam Lantinga <slouken@libsdl.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-18Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two medium sized fixes, both in drivers. The UFS one adds parsing of clock info structures, which is required by some host drivers and the aacraid one reverts the IRQ affinity mapping patch which has been causing regressions noted in kernel bugzilla 217599" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Store min and max clk freq from OPP table Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity"
2023-12-18Merge tag 'spi-fix-v6.7-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few bigger things here, the main one being that there were changes to the atmel driver in this cycle which made it possible to kill transfers being used for filesystem I/O which turned out to be very disruptive, the series of patches here undoes that and hardens things up further. There's also a few smaller driver specific changes, the main one being to revert a change that duplicted delays" * tag 'spi-fix-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: atmel: Fix clock issue when using devices with different polarities spi: spi-imx: correctly configure burst length when using dma spi: cadence: revert "Add SPI transfer delays" spi: atmel: Prevent spi transfers from being killed spi: atmel: Drop unused defines spi: atmel: Do not cancel a transfer upon any signal
2023-12-18MAINTAINERS: remove stale info for DEVICE-MAPPERMike Snitzer
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DMMike Snitzer
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata()Mikulas Patocka
__bio_for_each_segment assumes that the first struct bio_vec argument doesn't change - it calls "bio_advance_iter_single((bio), &(iter), (bvl).bv_len)" to advance the iterator. Unfortunately, the dm-integrity code changes the bio_vec with "bv.bv_len -= pos". When this code path is taken, the iterator would be out of sync and dm-integrity would report errors. This happens if the machine is out of memory and "kmalloc" fails. Fix this bug by making a copy of "bv" and changing the copy instead. Fixes: 7eada909bfd7 ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18dm-raid: delay flushing event_work() after reconfig_mutex is releasedYu Kuai
After commit db5e653d7c9f ("md: delay choosing sync action to md_start_sync()"), md_start_sync() will hold 'reconfig_mutex', however, in order to make sure event_work is done, __md_stop() will flush workqueue with reconfig_mutex grabbed, hence if sync_work is still pending, deadlock will be triggered. Fortunately, former pacthes to fix stopping sync_thread already make sure all sync_work is done already, hence such deadlock is not possible anymore. However, in order not to cause confusions for people by this implicit dependency, delay flushing event_work to dm-raid where 'reconfig_mutex' is not held, and add some comments to emphasize that the workqueue can't be flushed with 'reconfig_mutex'. Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()") Depends-on: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-12-18ALSA: hda/realtek: Add quirks for ASUS Zenbook 2023 ModelsStefan Binding
These models use 2xCS35L41amps with HDA using SPI and I2C. Models use internal and external boost. All models require DSD support to be added inside cs35l41_hda_property.c Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-8-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ALSA: hda: cs35l41: Support additional ASUS Zenbook 2023 ModelsStefan Binding
Add new model entries into configuration table. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-7-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ALSA: hda/realtek: Add quirks for ASUS Zenbook 2022 ModelsStefan Binding
These models use 2xCS35L41amps with HDA using SPI and I2C. Models use internal and external boost. All models require DSD support to be added inside cs35l41_hda_property.c Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-6-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ALSA: hda: cs35l41: Support additional ASUS Zenbook 2022 ModelsStefan Binding
Add new model entries into configuration table. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-5-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ALSA: hda/realtek: Add quirks for ASUS ROG 2023 modelsStefan Binding
These models use 2xCS35L41amps with HDA using SPI and I2C. All models use Internal Boost. Some models also use Realtek Speakers in conjunction with CS35L41. All models require DSD support to be added inside cs35l41_hda_property.c Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-4-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ALSA: hda: cs35l41: Support additional ASUS ROG 2023 modelsStefan Binding
Add new model entries into configuration table. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-3-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ALSA: hda: cs35l41: Add config table to support many laptops without _DSDStefan Binding
This make use of the CS35L41 HDA Property framework, which supports laptops which do not have the _DSD properties in their ACPI. Add configuration table to be able to use a generic function which allows laptops to be supported just by adding an entry into the table. Use configuration table function for existing system 103C89C6. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20231218151221.388745-2-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-18ice: Fix PF with enabled XDP going no-carrier after resetLarysa Zaremba
Commit 6624e780a577fc596788 ("ice: split ice_vsi_setup into smaller functions") has refactored a bunch of code involved in PFR. In this process, TC queue number adjustment for XDP was lost. Bring it back. Lack of such adjustment causes interface to go into no-carrier after a reset, if XDP program is attached, with the following message: ice 0000:b1:00.0: Failed to set LAN Tx queue context, error: -22 ice 0000:b1:00.0 ens801f0np0: Failed to open VSI 0x0006 on switch 0x0001 ice 0000:b1:00.0: enable VSI failed, err -22, VSI index 0, type ICE_VSI_PF ice 0000:b1:00.0: PF VSI rebuild failed: -22 ice 0000:b1:00.0: Rebuild failed, unload and reload driver Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-18ice: alter feature support check for SRIOV and LAGDave Ertman
Previously, the ice driver had support for using a handler for bonding netdev events to ensure that conflicting features were not allowed to be activated at the same time. While this was still in place, additional support was added to specifically support SRIOV and LAG together. These both utilized the netdev event handler, but the SRIOV and LAG feature was behind a capabilities feature check to make sure the current NVM has support. The exclusion part of the event handler should be removed since there are users who have custom made solutions that depend on the non-exclusion of features. Wrap the creation/registration and cleanup of the event handler and associated structs in the probe flow with a feature check so that the only systems that support the full implementation of LAG features will initialize support. This will leave other systems unhindered with functionality as it existed before any LAG code was added. Fixes: bb52f42acef6 ("ice: Add driver support for firmware changes for LAG") Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-18ice: stop trashing VF VSI aggregator node ID informationJacob Keller
When creating new VSIs, they are assigned into an aggregator node in the scheduler tree. Information about which aggregator node a VSI is assigned into is maintained by the vsi->agg_node structure. In ice_vsi_decfg(), this information is being destroyed, by overwriting the valid flag and the agg_id field to zero. For VF VSIs, this breaks the aggregator node configuration replay, which depends on this information. This results in VFs being inserted into the default aggregator node. The resulting configuration will have unexpected Tx bandwidth sharing behavior. This was broken by commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions"), which added the block to reset the agg_node data. The vsi->agg_node structure is not managed by the scheduler code, but is instead a wrapper around an aggregator node ID that is tracked at the VSI layer. Its been around for a long time, and its primary purpose was for handling VFs. The SR-IOV VF reset flow does not make use of the standard VSI rebuild/replay logic, and uses vsi->agg_node as part of its handling to rebuild the aggregator node configuration. The logic for aggregator nodes stretches back to early ice driver code from commit b126bd6bcd67 ("ice: create scheduler aggregator node config and move VSIs") The logic in ice_vsi_decfg() which trashes the ice_agg_node data is clearly wrong. It destroys information that is necessary for handling VF reset,. It is also not the correct way to actually remove a VSI from an aggregator node. For that, we need to implement logic in the scheduler code. Further, non-VF VSIs properly replay their aggregator configuration using existing scheduler replay logic. To fix the VF replay logic, remove this broken aggregator node cleanup logic. This is the simplest way to immediately fix this. This ensures that VFs will have proper aggregate configuration after a reset. This is especially important since VFs often perform resets as part of their reconfiguration flows. Without fixing this, VFs will be placed in the default aggregator node and Tx bandwidth will not be shared in the expected and configured manner. Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>