summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-19Merge tag 'acpi-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "These update ACPI documentation and kerneldoc comments. Specifics: - Add markup to generate links from footnotes in the ACPI enumeration document (Chris Packham) - Update the handle_eject_request() kerneldoc comment to document the arguments of the function and improve kerneldoc comments for ACPI suspend and hibernation functions (Yang Li)" * tag 'acpi-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PM: Improve kerneldoc comments for suspend and hibernation functions ACPI: docs: enumeration: Make footnotes links ACPI: Document handle_eject_request() arguments
2024-03-19Merge tag 'thermal-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more thermal control updates from Rafael Wysocki: "These update thermal drivers for ARM platforms by adding new hardware support (r8a779h0, H616 THS), addressing issues (Mediatek LVTS, Mediatek MT7896, thermal-of) and cleaning up code. Specifics: - Fix memory leak in the error path at probe time in the Mediatek LVTS driver (Christophe Jaillet) - Fix control buffer enablement regression on Meditek MT7896 (Frank Wunderlich) - Drop spaces before TABs in different places: thermal-of, ST drivers and Makefile (Geert Uytterhoeven) - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary among several SoC versions (Fabio Estevam) - Add support for the H616 THS controller on Sun8i platforms (Martin Botka) - Don't fail probe due to zone registration failure because there is no trip points defined in the DT (Mark Brown) - Support variable TMU array size for new platforms (Peng Fan) - Adjust the DT binding for thermal-of and make the polling time not required and assume it is zero when not found in the DT (Konrad Dybcio) - Add r8a779h0 support in both the DT and the rcar_gen3 driver (Geert Uytterhoeven)" * tag 'thermal-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/drivers/rcar_gen3: Add support for R-Car V4M dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support thermal/of: Assume polling-delay(-passive) 0 when absent dt-bindings: thermal-zones: Don't require polling-delay(-passive) thermal/drivers/qoriq: Fix getting tmu range thermal/drivers/sun8i: Don't fail probe due to zone registration failure thermal/drivers/sun8i: Add support for H616 THS controller thermal/drivers/sun8i: Add SRAM register access code thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors thermal/drivers/sun8i: Explain unknown H6 register value dt-bindings: thermal: sun8i: Add H616 THS controller soc: sunxi: sram: export register 0 for THS on H616 dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems thermal: Drop spaces before TABs thermal/drivers/mediatek: Fix control buffer enablement on MT7896 thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
2024-03-19Merge tag 'ata-6.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: "A single fix for ASMedia HBAs. These HBAs do not indicate that they support SATA Port Multipliers CAP.SPM (Supports Port Multiplier) is not set. Likewise, they do not allow you to probe the devices behind an attached PMP, as defined according to the SATA-IO PMP specification. Instead, they have decided to implement their own version of PMP, and because of this, plugging in a PMP actually works, even if the HBA claims that it does not support PMP. Revert a recent quirk for these HBAs, as that breaks ASMedia's own implementation of PMP. Unfortunately, this will once again give some users of these HBAs significantly increased boot time. However, a longer boot time for some, is the lesser evil compared to some other users not being able to detect their drives at all" * tag 'ata-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ahci: asm1064: asm1166: don't limit reported ports
2024-03-19Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - Per vq sizes in vdpa - Info query for block devices support in vdpa - DMA sync callbacks in vduse - Fixes, cleanups * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits) virtio_net: rename free_old_xmit_skbs to free_old_xmit virtio_net: unify the code for recycling the xmit ptr virtio-net: add cond_resched() to the command waiting loop virtio-net: convert rx mode setting to use workqueue virtio: packed: fix unmap leak for indirect desc table vDPA: report virtio-blk flush info to user space vDPA: report virtio-block read-only info to user space vDPA: report virtio-block write zeroes configuration to user space vDPA: report virtio-block discarding configuration to user space vDPA: report virtio-block topology info to user space vDPA: report virtio-block MQ info to user space vDPA: report virtio-block max segments in a request to user space vDPA: report virtio-block block-size to user space vDPA: report virtio-block max segment size to user space vDPA: report virtio-block capacity to user space virtio: make virtio_bus const vdpa: make vdpa_bus const vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min vDPA/ifcvf: get_max_vq_size to return max size virtio_vdpa: create vqs with the actual size ...
2024-03-19dm-integrity: fix a memory leak when rechecking the dataMikulas Patocka
Memory for the "checksums" pointer will leak if the data is rechecked after checksum failure (because the associated kfree won't happen due to 'goto skip_io'). Fix this by freeing the checksums memory before recheck, and just use the "checksum_onstack" memory for storing checksum during recheck. Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-03-19Merge tag 'for-linus-6.9-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - Xen event channel handling fix for a regression with a rare kernel config and some added hardening - better support of running Xen dom0 in PVH mode - a cleanup for the xen grant-dma-iommu driver * tag 'for-linus-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/events: increment refcnt only if event channel is refcounted xen/evtchn: avoid WARN() when unbinding an event channel x86/xen: attempt to inflate the memory balloon on PVH xen/grant-dma-iommu: Convert to platform remove callback returning void
2024-03-19net: phy: fix phy_read_poll_timeout argument type in genphy_loopbackNikita Kiryushin
read_poll_timeout inside phy_read_poll_timeout can set val negative in some cases (for example, __mdiobus_read inside phy_read can return -EOPNOTSUPP). Supposedly, commit 4ec732951702 ("net: phylib: fix phy_read*_poll_timeout()") should fix problems with wrong-signed vals, but I do not see how as val is sent to phy_read as is and __val = phy_read (not val) is checked for sign. Change val type for signed to allow better error handling as done in other phy_read_poll_timeout callers. This will not fix any error handling by itself, but allows, for example, to modify cond with appropriate sign check or check resulting val separately. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 014068dcb5b1 ("net: phy: genphy_loopback: add link speed configuration") Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20240315175052.8049-1-kiryushin@ancud.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19Merge branches 'misc' and 'fixes' into for-linusRussell King (Oracle)
2024-03-19ALSA: hda/realtek: Add quirk for HP Spectre x360 14 eu0000Anthony I Gilea
Cirrus amps support for this laptop was added in patch: 33e5e648e631 ("ALSA: hda: cs35l41: Support additional HP Envy Models") This patch adds fixes for wrong pincfgs, wrong DAC selection and mute/micmute LEDs. Signed-off-by: Anthony I Gilea <i@cpp.in> Message-ID: <e2a7aaed-e9d7-4d36-8abf-b71dfd32a0ff@cpp.in> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-03-19net/sched: Add module alias for sch_fq_pieMichal Koutný
The commit 2c15a5aee2f3 ("net/sched: Load modules via their alias") starts loading modules via aliases and not canonical names. The new aliases were added in commit 241a94abcf46 ("net/sched: Add module aliases for cls_,sch_,act_ modules") via a Coccinele script. sch_fq_pie.c is missing module.h header and thus Coccinele did not patch it. Add the include and module alias manually, so that autoloading works for sch_fq_pie too. (Note: commit message in commit 241a94abcf46 ("net/sched: Add module aliases for cls_,sch_,act_ modules") was mangled due to '#' misinterpretation. The predicate haskernel is: | @ haskernel @ | @@ | | #include <linux/module.h> | .) Fixes: 241a94abcf46 ("net/sched: Add module aliases for cls_,sch_,act_ modules") Signed-off-by: Michal Koutný <mkoutny@suse.com> Link: https://lore.kernel.org/r/20240315160210.8379-1-mkoutny@suse.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19can: kvaser_pciefd: Add additional Xilinx interruptsMartin Jocić
Since Xilinx-based adapters now support up to eight CAN channels, the TX interrupt mask array must have eight elements. Signed-off-by: Martin Jocic <martin.jocic@kvaser.com> Link: https://lore.kernel.org/all/2ab3c0585c3baba272ede0487182a423a420134b.camel@kvaser.com Fixes: 9b221ba452aa ("can: kvaser_pciefd: Add support for Kvaser PCIe 8xCAN") [mkl: replace Link by Fixes tag] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-03-19ceph: set correct cap mask for getattr request for readXiubo Li
In case of hitting the file EOF, ceph_read_iter() needs to retrieve the file size from MDS, and Fr caps aren't neccessary. [ idryomov: fold into existing retry_op == READ_INLINE branch ] Reported-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-19ceph: stop copying to iter at EOF on sync readsXiubo Li
If EOF is encountered, ceph_sync_read() return value is adjusted down according to i_size, but the "to" iter is advanced by the actual number of bytes read. Then, when retrying, the remainder of the range may be skipped incorrectly. Ensure that the "to" iter is advanced only until EOF. [ idryomov: changelog ] Fixes: c3d8e0b5de48 ("ceph: return the real size read when it hits EOF") Reported-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-19nouveau/gsp: don't check devinit disable on GSP.Dave Airlie
GSP should be handling this and I can see no evidence in opengpu driver that this register should be touched. Fixed acceleration on 2080 Ti GPUs. Fixes: 15740541e8f0 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240314014521.2695233-1-airlied@gmail.com
2024-03-19ipv4: raw: Fix sending packets from raw sockets via IPsec tunnelsTobias Brunner
Since the referenced commit, the xfrm_inner_extract_output() function uses the protocol field to determine the address family. So not setting it for IPv4 raw sockets meant that such packets couldn't be tunneled via IPsec anymore. IPv6 raw sockets are not affected as they already set the protocol since 9c9c9ad5fae7 ("ipv6: set skb->protocol on tcp, raw and ip6_append_data genereated skbs"). Fixes: f4796398f21b ("xfrm: Remove inner/outer modes from output path") Signed-off-by: Tobias Brunner <tobias@strongswan.org> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/c5d9a947-eb19-4164-ac99-468ea814ce20@strongswan.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19hsr: Handle failures in module initFelix Maurer
A failure during registration of the netdev notifier was not handled at all. A failure during netlink initialization did not unregister the netdev notifier. Handle failures of netdev notifier registration and netlink initialization. Both functions should only return negative values on failure and thereby lead to the hsr module not being loaded. Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Felix Maurer <fmaurer@redhat.com> Reviewed-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/3ce097c15e3f7ace98fc7fd9bcbf299f092e63d1.1710504184.git.fmaurer@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19Merge branches 'pm-em', 'pm-powercap' and 'pm-sleep'Rafael J. Wysocki
Merge additional updates related to the Energy Model, power capping and system-wide power management for 6.9-rc1: - Modify the Energy Model code to bail out and complain if the unit of power is not uW to prevent errors due to unit mismatches (Lukasz Luba). - Make the intel_rapl platform driver use a remove callback returning void (Uwe Kleine-König). - Fix typo in the suspend and interrupts document (Saravana Kannan). * pm-em: PM: EM: Force device drivers to provide power in uW * pm-powercap: powercap: intel_rapl: Convert to platform remove callback returning void * pm-sleep: Documentation: power: Fix typo in suspend and interrupts doc
2024-03-19fbmon: prevent division by zero in fb_videomode_from_videomode()Roman Smirnov
The expression htotal * vtotal can have a zero value on overflow. It is necessary to prevent division by zero like in fb_var_to_videomode(). Found by Linux Verification Center (linuxtesting.org) with Svace. Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Helge Deller <deller@gmx.de>
2024-03-19Merge branch 'acpi-docs'Rafael J. Wysocki
Merge an ACPI documentation update for 6.9-rc1 which adds markup to generate links from footnotes in the enumeration document. * acpi-docs: ACPI: docs: enumeration: Make footnotes links
2024-03-19usb: usb-acpi: Fix oops due to freeing uninitialized pld pointerMathias Nyman
If reading the ACPI _PLD port location object fails, or the port doesn't have a _PLD ACPI object then the *pld pointer will remain uninitialized and oops when freed. The patch that caused this is currently in next, on its way to v6.9. So no need to add this to stable or current 6.8 kernel. Reported-by: Klara Modin <klarasmodin@gmail.com> Closes: https://lore.kernel.org/linux-usb/7e92369a-3197-4883-9988-3c93452704f5@gmail.com/ Tested-by: Klara Modin <klarasmodin@gmail.com> Fixes: f3ac348e6e04 ("usb: usb-acpi: Set port connect type of not connectable ports correctly") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240308113425.1144689-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-19exfat: remove duplicate update parent dirYuezhang Mo
For renaming, the directory only needs to be updated once if it is in the same directory. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: do not sync parent dir if just update timestampYuezhang Mo
When sync or dir_sync is enabled, there is no need to sync the parent directory's inode if only for updating its timestamp. 1. If an unexpected power failure occurs, the timestamp of the parent directory is not updated to the storage, which has no impact on the user. 2. The number of writes will be greatly reduced, which can not only improve performance, but also prolong device life. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: remove unused functionsYuezhang Mo
exfat_count_ext_entries() is no longer called, remove it. exfat_update_dir_chksum() is no longer called, remove it and rename exfat_update_dir_chksum_with_entry_set() to it. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_find_empty_entry() to use dentry cacheYuezhang Mo
Before this conversion, each dentry traversed needs to be read from the storage device or page cache. There are at least 16 dentries in a sector. This will result in frequent page cache searches. After this conversion, if all directory entries in a sector are used, the sector only needs to be read once. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_init_ext_entry() to use dentry cacheYuezhang Mo
Before this conversion, in exfat_init_ext_entry(), to init the dentries in a dentry set, the sync times is equals the dentry number if 'dirsync' or 'sync' is enabled. That affects not only performance but also device life. After this conversion, only needs to be synchronized once if 'dirsync' or 'sync' is enabled. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: move free cluster out of exfat_init_ext_entry()Yuezhang Mo
exfat_init_ext_entry() is an init function, it's a bit strange to free cluster in it. And the argument 'inode' will be removed from exfat_init_ext_entry(). So this commit changes to free the cluster in exfat_remove_entries(). Code refinement, no functional changes. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_remove_entries() to use dentry cacheYuezhang Mo
Before this conversion, in exfat_remove_entries(), to mark the dentries in a dentry set as deleted, the sync times is equals the dentry numbers if 'dirsync' or 'sync' is enabled. That affects not only performance but also device life. After this conversion, only needs to be synchronized once if 'dirsync' or 'sync' is enabled. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_add_entry() to use dentry cacheYuezhang Mo
After this conversion, if "dirsync" or "sync" is enabled, the number of synchronized dentries in exfat_add_entry() will change from 2 to 1. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: add exfat_get_empty_dentry_set() helperYuezhang Mo
This helper is used to lookup empty dentry set. If there are no enough empty dentries at the input location, this helper will return the number of dentries that need to be skipped for the next lookup. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: add __exfat_get_dentry_set() helperYuezhang Mo
Since exfat_get_dentry_set() invokes the validate functions of exfat_validate_entry(), it only supports getting a directory entry set of an existing file, doesn't support getting an empty entry set. To remove the limitation, add this helper. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19rds: introduce acquire/release ordering in acquire/release_in_xmit()Yewon Choi
acquire/release_in_xmit() work as bit lock in rds_send_xmit(), so they are expected to ensure acquire/release memory ordering semantics. However, test_and_set_bit/clear_bit() don't imply such semantics, on top of this, following smp_mb__after_atomic() does not guarantee release ordering (memory barrier actually should be placed before clear_bit()). Instead, we use clear_bit_unlock/test_and_set_bit_lock() here. Fixes: 0f4b1c7e89e6 ("rds: fix rds_send_xmit() serialization") Fixes: 1f9ecd7eacfd ("RDS: Pass rds_conn_path to rds_send_xmit()") Signed-off-by: Yewon Choi <woni9911@gmail.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://lore.kernel.org/r/ZfQUxnNTO9AJmzwc@libra05 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19ahci: asm1064: asm1166: don't limit reported portsConrad Kostecki
Previously, patches have been added to limit the reported count of SATA ports for asm1064 and asm1166 SATA controllers, as those controllers do report more ports than physically having. While it is allowed to report more ports than physically having in CAP.NP, it is not allowed to report more ports than physically having in the PI (Ports Implemented) register, which is what these HBAs do. (This is a AHCI spec violation.) Unfortunately, it seems that the PMP implementation in these ASMedia HBAs is also violating the AHCI and SATA-IO PMP specification. What these HBAs do is that they do not report that they support PMP (CAP.SPM (Supports Port Multiplier) is not set). Instead, they have decided to add extra "virtual" ports in the PI register that is used if a port multiplier is connected to any of the physical ports of the HBA. Enumerating the devices behind the PMP as specified in the AHCI and SATA-IO specifications, by using PMP READ and PMP WRITE commands to the physical ports of the HBA is not possible, you have to use the "virtual" ports. This is of course bad, because this gives us no way to detect the device and vendor ID of the PMP actually connected to the HBA, which means that we can not apply the proper PMP quirks for the PMP that is connected to the HBA. Limiting the port map will thus stop these controllers from working with SATA Port Multipliers. This patch reverts both patches for asm1064 and asm1166, so old behavior is restored and SATA PMP will work again, but it will also reintroduce the (minutes long) extra boot time for the ASMedia controllers that do not have a PMP connected (either on the PCIe card itself, or an external PMP). However, a longer boot time for some, is the lesser evil compared to some other users not being able to detect their drives at all. Fixes: 0077a504e1a4 ("ahci: asm1166: correct count of reported ports") Fixes: 9815e3961754 ("ahci: asm1064: correct count of reported ports") Cc: stable@vger.kernel.org Reported-by: Matt <cryptearth@googlemail.com> Signed-off-by: Conrad Kostecki <conikost@gentoo.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> [cassel: rewrote commit message] Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-03-19tools: ynl: add header guards for nlctrlJakub Kicinski
I "extracted" YNL C into a GitHub repo to make it easier to use in other projects: https://github.com/linux-netdev/ynl-c GitHub actions use Ubuntu by default, and the kernel headers there are missing f329a0ebeaba ("genetlink: correct uAPI defines"). Add the direct include workaround for nlctrl. Fixes: 768e044a5fd4 ("doc/netlink/specs: Add spec for nlctrl netlink family") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240315002108.523232-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19Merge branch 'wireguard-fixes-for-6-9-rc1'Paolo Abeni
Jason A. Donenfeld says: ==================== wireguard fixes for 6.9-rc1 This series has four WireGuard fixes: 1) Annotate a data race that KCSAN found by using READ_ONCE/WRITE_ONCE, which has been causing syzkaller noise. 2) Use the generic netdev tstats allocation and stats getters instead of doing this within the driver. 3) Explicitly check a flag variable instead of an empty list in the netlink code, to prevent a UaF situation when paging through GET results during a remove-all SET operation. 4) Set a flag in the RISC-V CI config so the selftests continue to boot. ==================== Link: https://lore.kernel.org/r/20240314224911.6653-1-Jason@zx2c4.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: selftests: set RISCV_ISA_FALLBACK on riscv{32,64}Jason A. Donenfeld
This option is needed to continue booting with QEMU. Recent changes that made this optional meant that it gets unset in the test harness, and so WireGuard CI has been broken. Fix this by simply setting this option. Cc: stable@vger.kernel.org Fixes: 496ea826d1e1 ("RISC-V: provide Kconfig & commandline options to control parsing "riscv,isa"") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: netlink: access device through ctx instead of peerJason A. Donenfeld
The previous commit fixed a bug that led to a NULL peer->device being dereferenced. It's actually easier and faster performance-wise to instead get the device from ctx->wg. This semantically makes more sense too, since ctx->wg->peer_allowedips.seq is compared with ctx->allowedips_seq, basing them both in ctx. This also acts as a defence in depth provision against freed peers. Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: netlink: check for dangling peer via is_dead instead of empty listJason A. Donenfeld
If all peers are removed via wg_peer_remove_all(), rather than setting peer_list to empty, the peer is added to a temporary list with a head on the stack of wg_peer_remove_all(). If a netlink dump is resumed and the cursored peer is one that has been removed via wg_peer_remove_all(), it will iterate from that peer and then attempt to dump freed peers. Fix this by instead checking peer->is_dead, which was explictly created for this purpose. Also move up the device_update_lock lockdep assertion, since reading is_dead relies on that. It can be reproduced by a small script like: echo "Setting config..." ip link add dev wg0 type wireguard wg setconf wg0 /big-config ( while true; do echo "Showing config..." wg showconf wg0 > /dev/null done ) & sleep 4 wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n") Resulting in: BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20 Read of size 8 at addr ffff88811956ec70 by task wg/59 CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 print_address_description.constprop.0+0x2c/0x380 print_report+0xab/0x250 kasan_report+0xba/0xf0 __lock_acquire+0x182a/0x1b20 lock_acquire+0x191/0x4b0 down_read+0x80/0x440 get_peer+0x140/0xcb0 wg_get_device_dump+0x471/0x1130 Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Reported-by: Lillian Berry <lillian@star-ark.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: device: remove generic .ndo_get_stats64Breno Leitao
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64. Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: device: leverage core stats allocatorBreno Leitao
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver. With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now. Remove the allocation in this driver and leverage the network core allocation instead. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19wireguard: receive: annotate data-race around receiving_counter.counterNikita Zhandarovich
Syzkaller with KCSAN identified a data-race issue when accessing keypair->receiving_counter.counter. Use READ_ONCE() and WRITE_ONCE() annotations to mark the data race as intentional. BUG: KCSAN: data-race in wg_packet_decrypt_worker / wg_packet_rx_poll write to 0xffff888107765888 of 8 bytes by interrupt on cpu 0: counter_validate drivers/net/wireguard/receive.c:321 [inline] wg_packet_rx_poll+0x3ac/0xf00 drivers/net/wireguard/receive.c:461 __napi_poll+0x60/0x3b0 net/core/dev.c:6536 napi_poll net/core/dev.c:6605 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6738 __do_softirq+0xc4/0x279 kernel/softirq.c:553 do_softirq+0x5e/0x90 kernel/softirq.c:454 __local_bh_enable_ip+0x64/0x70 kernel/softirq.c:381 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline] _raw_spin_unlock_bh+0x36/0x40 kernel/locking/spinlock.c:210 spin_unlock_bh include/linux/spinlock.h:396 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline] wg_packet_decrypt_worker+0x6c5/0x700 drivers/net/wireguard/receive.c:499 process_one_work kernel/workqueue.c:2633 [inline] ... read to 0xffff888107765888 of 8 bytes by task 3196 on cpu 1: decrypt_packet drivers/net/wireguard/receive.c:252 [inline] wg_packet_decrypt_worker+0x220/0x700 drivers/net/wireguard/receive.c:501 process_one_work kernel/workqueue.c:2633 [inline] process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2706 worker_thread+0x525/0x730 kernel/workqueue.c:2787 ... Fixes: a9e90d9931f3 ("wireguard: noise: separate receive counter from send counter") Reported-by: syzbot+d1de830e4ecdaac83d89@syzkaller.appspotmail.com Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19net: move dev->state into net_device_read_txrx groupEric Dumazet
dev->state can be read in rx and tx fast paths. netif_running() which needs dev->state is called from - enqueue_to_backlog() [RX path] - __dev_direct_xmit() [TX path] Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240314200845.3050179-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-19timers: Fix removed self-IPI on global timer's enqueue in nohz_fullFrederic Weisbecker
While running in nohz_full mode, a task may enqueue a timer while the tick is stopped. However the only places where the timer wheel, alongside the timer migration machinery's decision, may reprogram the next event accordingly with that new timer's expiry are the idle loop or any IRQ tail. However neither the idle task nor an interrupt may run on the CPU if it resumes busy work in userspace for a long while in full dynticks mode. To solve this, the timer enqueue path raises a self-IPI that will re-evaluate the timer wheel on its IRQ tail. This asynchronous solution avoids potential locking inversion. This is supposed to happen both for local and global timers but commit: b2cf7507e186 ("timers: Always queue timers on the local CPU") broke the global timers case with removing the ->is_idle field handling for the global base. As a result, global timers enqueue may go unnoticed in nohz_full. Fix this with restoring the idle tracking of the global timer's base, allowing self-IPIs again on enqueue time. Fixes: b2cf7507e186 ("timers: Always queue timers on the local CPU") Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240318230729.15497-3-frederic@kernel.org
2024-03-19timers/migration: Fix endless timer requeue after idle interruptsFrederic Weisbecker
When a CPU is an idle migrator, but another CPU wakes up before it, becomes an active migrator and handles the queue, the initial idle migrator may end up endlessly reprogramming its clockevent, chasing ghost timers forever such as in the following scenario: [GRP0:0] migrator = 0 active = 0 nextevt = T1 / \ 0 1 active idle (T1) 0) CPU 1 is idle and has a timer queued (T1), CPU 0 is active and is the active migrator. [GRP0:0] migrator = NONE active = NONE nextevt = T1 / \ 0 1 idle idle (T1) wakeup = T1 1) CPU 0 is now idle and is therefore the idle migrator. It has programmed its next timer interrupt to handle T1. [GRP0:0] migrator = 1 active = 1 nextevt = KTIME_MAX / \ 0 1 idle active wakeup = T1 2) CPU 1 has woken up, it is now active and it has just handled its own timer T1. 3) CPU 0 gets a timer interrupt to handle T1 but tmigr_handle_remote() realize it is not the migrator anymore. So it early returns without observing that T1 has been expired already and therefore without updating its ->wakeup value. 4) CPU 0 goes into tmigr_cpu_new_timer() which also early returns because it doesn't queue a timer of its own. So ->wakeup is left unchanged and the next timer is programmed to fire now. 5) goto 3) forever This results in timer interrupt storms in idle and also in nohz_full (as observed in rcutorture's TREE07 scenario). Fix this with forcing a re-evaluation of tmc->wakeup while trying remote timer handling when the CPU isn't the migrator anymmore. The check is inherently racy but in the worst case the CPU just races setting the KTIME_MAX value that a remote expiry also tries to set. Fixes: 7ee988770326 ("timers: Implement the hierarchical pull model") Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240318230729.15497-2-frederic@kernel.org
2024-03-19LoongArch/crypto: Clean up useless assignment operationsYuli Wang
The LoongArch CRC32 hw acceleration is based on arch/mips/crypto/ crc32-mips.c. While the MIPS code supports both MIPS32 and MIPS64, but LoongArch32 lacks the CRC instruction. As a result, the line "len -= sizeof(u32)" is unnecessary. Removing it can make context code style more unified and improve code readability. Cc: stable@vger.kernel.org Reviewed-by: WANG Xuerui <git@xen0n.name> Suggested-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-19LoongArch: Define the __io_aw() hook as mmiowb()Huacai Chen
Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of mmiowb()") remove all mmiowb() in drivers, but it says: "NOTE: mmiowb() has only ever guaranteed ordering in conjunction with spin_unlock(). However, pairing each mmiowb() removal in this patch with the corresponding call to spin_unlock() is not at all trivial, so there is a small chance that this change may regress any drivers incorrectly relying on mmiowb() to order MMIO writes between CPUs using lock-free synchronisation." The mmio in radeon_ring_commit() is protected by a mutex rather than a spinlock, but in the mutex fastpath it behaves similar to spinlock. We can add mmiowb() calls in the radeon driver but the maintainer says he doesn't like such a workaround, and radeon is not the only example of mutex protected mmio. So we should extend the mmiowb tracking system from spinlock to mutex, and maybe other locking primitives. This is not easy and error prone, so we solve it in the architectural code, by simply defining the __io_aw() hook as mmiowb(). And we no longer need to override queued_spin_unlock() so use the generic definition. Without this, we get such an error when run 'glxgears' on weak ordering architectures such as LoongArch: radeon 0000:04:00.0: ring 0 stalled for more than 10324msec radeon 0000:04:00.0: ring 3 stalled for more than 10240msec radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3) radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35). [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) Link: https://lore.kernel.org/dri-devel/29df7e26-d7a8-4f67-b988-44353c4270ac@amd.com/T/#t Link: https://lore.kernel.org/linux-arch/20240301130532.3953167-1-chenhuacai@loongson.cn/T/#t Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-19LoongArch: Remove superfluous flush_dcache_page() definitionHuacai Chen
LoongArch doesn't have cache aliases, so flush_dcache_page() is a no-op. There is a generic implementation for this case in include/asm-generic/ cacheflush.h. So remove the superfluous flush_dcache_page() definition, which also silences such build warnings: In file included from crypto/scompress.c:12: include/crypto/scatterwalk.h: In function 'scatterwalk_pagedone': include/crypto/scatterwalk.h:76:30: warning: variable 'page' set but not used [-Wunused-but-set-variable] 76 | struct page *page; | ^~~~ crypto/scompress.c: In function 'scomp_acomp_comp_decomp': >> crypto/scompress.c:174:38: warning: unused variable 'dst_page' [-Wunused-variable] 174 | struct page *dst_page = sg_page(req->dst); | Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202403091614.NeUw5zcv-lkp@intel.com/ Suggested-by: Barry Song <baohua@kernel.org> Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-19LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.hMax Kellermann
These two functions are implemented in pgtable.c, and they are needed only by the virt_to_page() macro in page.h. Having the prototypes in pgtable.h causes a circular dependency between page.h and pgtable.h, because the virt_to_page() macro in page.h needs pgtable.h for these two functions, while pgtable.h needs various definitions from page.h (e.g. pte_t and pgt_t). Let's avoid this circular dependency by moving the function prototypes to page.h. Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-19LoongArch: Change __my_cpu_offset definition to avoid mis-optimizationHuacai Chen
From GCC commit 3f13154553f8546a ("df-scan: remove ad-hoc handling of global regs in asms"), global registers will no longer be forced to add to the def-use chain. Then current_thread_info(), current_stack_pointer and __my_cpu_offset may be lifted out of the loop because they are no longer treated as "volatile variables". This optimization is still correct for the current_thread_info() and current_stack_pointer usages because they are associated to a thread. However it is wrong for __my_cpu_offset because it is associated to a CPU rather than a thread: if the thread migrates to a different CPU in the loop, __my_cpu_offset should be changed. Change __my_cpu_offset definition to treat it as a "volatile variable", in order to avoid such a mis-optimization. Cc: stable@vger.kernel.org Reported-by: Xiaotian Wu <wuxiaotian@loongson.cn> Reported-by: Miao Wang <shankerwangmiao@gmail.com> Signed-off-by: Xing Li <lixing@loongson.cn> Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn> Signed-off-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-19LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in KconfigHuacai Chen
This allocates the VM flag needed to support the userfaultfd minor fault functionality. See commit 7677f7fd8be7665 ("userfaultfd: add minor fault registration mode") for more information. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-19LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in KconfigHuacai Chen
LoongArch has implemented the current_stack_pointer macro, so select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig. This will let it be used in non-arch places (like HARDENED_USERCOPY). Reviewed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>