linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-11-26	Merge tag 'kvm-x86-gmem-6.19' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVM guest_memfd changes for 6.19: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way. - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references. - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors. - Misc cleanups.
2025-11-25	tools: ynl-gen: add regeneration comment	Asbjørn Sloth Tønnesen
	Add a comment on regeneration to the generated files. The comment is placed after the YNL-GEN line[1], as to not interfere with ynl-regen.sh's detection logic. [1] and after the optional YNL-ARG line. Link: https://lore.kernel.org/r/aR5m174O7pklKrMR@zx2c4.com/ Suggested-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251120174429.390574-3-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	wifi: nl80211: vendor-cmd: intel: fix a blank kernel-doc line warning	Randy Dunlap
	Delete an empty line prevent a kernel-doc warning: Warning: ../include/uapi/linux/nl80211-vnd-intel.h:86 bad line: Fixes: 3d2a2544eae9 ("nl80211: vendor-cmd: add Intel vendor commands for iwlmei usage") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251125022834.3171742-1-rdunlap@infradead.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-25	fs: Add uoff_t	Matthew Wilcox (Oracle)
	In a recent commit, I inadvertently changed a comparison from being an unsigned comparison (on 64-bit systems) to being a signed comparison (which it had always been on 32-bit systems). This led to a sporadic fstests failure. To make sure this comparison is always unsigned, introduce a new type, uoff_t which is the unsigned version of loff_t. Generally file sizes are restricted to being a signed integer, but in these two places it is convenient to pass -1 to indicate "up to the end of the file". Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20251123220518.1447261-1-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-24	btrfs: implement shutdown ioctl	Qu Wenruo
	The shutdown ioctl should follow the XFS one, which use magic number 'X', and ioctl number 125, with a uint32 as flags. For now btrfs don't distinguish DEFAULT and LOGFLUSH flags (just like f2fs), both will freeze the fs first (implies committing the current transaction), setting the SHUTDOWN flag and finally thaw the fs. For NOLOGFLUSH flag, the freeze/thaw part is skipped thus the current transaction is aborted. The new shutdown ioctl is hidden behind experimental features for more testing. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <asj@kernel.org> Tested-by: Anand Jain <asj@kernel.org> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-11-24	staging: gpib: Destage gpib	Dave Penkler
	Move the gpib drivers out of staging and into the "real" part of the kernel. This entails: - Remove the gpib Kconfig menu and Makefile build rule from staging. - Remove gpib/uapi from the header file search path in subdir-ccflags of the gpib Makefile - move the gpib/uapi files to include/uapi/linux - Move the gpib tree out of staging to drivers. - Remove the word "Linux" from the gpib Kconfig file. - Add the gpib Kconfig menu and Makefile build rule to drivers Signed-off-by: Dave Penkler <dpenkler@gmail.com> Link: https://patch.msgid.link/20251117144021.23569-5-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24	perf: Add perf_event_attr::config4	James Clark
	Arm FEAT_SPE_FDS adds the ability to filter on the data source of a packet using another 64-bits of event filtering control. As the existing perf_event_attr::configN fields are all used up for SPE PMU, an additional field is needed. Add a new 'config4' field. Reviewed-by: Leo Yan <leo.yan@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2025-11-22	Merge tag 'input-for-v6.18-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - INPUT_PROP_HAPTIC_TOUCHPAD definition added early in 6.18 cycle has been renamed to INPUT_PROP_PRESSUREPAD to better reflect the kind of devices it is supposed to be set for - a new ID for a touchscreen found in Ayaneo Flip DS in Goodix driver - Goodix driver no longer tries to set reset pin as "input" as it causes issues when there is no pull up resistor installed on the board - fixes for cros_ec_keyb, imx_sc_key, and pegasus-notetaker drivers to deal with potential out-of-bounds access and memory corruption issues * tag 'input-for-v6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: rename INPUT_PROP_HAPTIC_TOUCHPAD to INPUT_PROP_PRESSUREPAD Input: cros_ec_keyb - fix an invalid memory access Input: imx_sc_key - fix memory corruption on unload Input: pegasus-notetaker - fix potential out-of-bounds access Input: goodix - remove setting of RST pin to input Input: goodix - add support for ACPI ID GDIX1003
2025-11-21	uapi: cdc.h: cleanly provide for more interfaces and countries	Oliver Neukum
	The spec requires at least one interface respectively country. It allows multiple ones. This needs to be clearly said in the UAPI. This is subject to sanity checking in cdc_parse_cdc_header(), thus we can trust the length. Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20251111134641.4118827-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21	KVM: s390: Add capability that forwards operation exceptions	Janosch Frank
	Setting KVM_CAP_S390_USER_OPEREXEC will forward all operation exceptions to user space. This also includes the 0x0000 instructions managed by KVM_CAP_S390_USER_INSTR0. It's helpful if user space wants to emulate instructions which do not (yet) have an opcode. While we're at it refine the documentation for KVM_CAP_S390_USER_INSTR0. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2025-11-20	vfio/pci: Add dma-buf export support for MMIO regions	Leon Romanovsky
	Add support for exporting PCI device MMIO regions through dma-buf, enabling safe sharing of non-struct page memory with controlled lifetime management. This allows RDMA and other subsystems to import dma-buf FDs and build them into memory regions for PCI P2P operations. The implementation provides a revocable attachment mechanism using dma-buf move operations. MMIO regions are normally pinned as BARs don't change physical addresses, but access is revoked when the VFIO device is closed or a PCI reset is issued. This ensures kernel self-defense against potentially hostile userspace. Currently VFIO can take MMIO regions from the device's BAR and map them into a PFNMAP VMA with special PTEs. This mapping type ensures the memory cannot be used with things like pin_user_pages(), hmm, and so on. In practice only the user process CPU and KVM can safely make use of these VMA. When VFIO shuts down these VMAs are cleaned by unmap_mapping_range() to prevent any UAF of the MMIO beyond driver unbind. However, VFIO type 1 has an insecure behavior where it uses follow_pfnmap_*() to fish a MMIO PFN out of a VMA and program it back into the IOMMU. This has a long history of enabling P2P DMA inside VMs, but has serious lifetime problems by allowing a UAF of the MMIO after the VFIO driver has been unbound. Introduce DMABUF as a new safe way to export a FD based handle for the MMIO regions. This can be consumed by existing DMABUF importers like RDMA or DRM without opening an UAF. A following series will add an importer to iommufd to obsolete the type 1 code and allow safe UAF-free MMIO P2P in VM cases. DMABUF has a built in synchronous invalidation mechanism called move_notify. VFIO keeps track of all drivers importing its MMIO and can invoke a synchronous invalidation callback to tell the importing drivers to DMA unmap and forget about the MMIO pfns. This process is being called revoke. This synchronous invalidation fully prevents any lifecycle problems. VFIO will do this before unbinding its driver ensuring there is no UAF of the MMIO beyond the driver lifecycle. Further, VFIO has additional behavior to block access to the MMIO during things like Function Level Reset. This is because some poor platforms may experience a MCE type crash when touching MMIO of a PCI device that is undergoing a reset. Today this is done by using unmap_mapping_range() on the VMAs. Extend that into the DMABUF world and temporarily revoke the MMIO from the DMABUF importers during FLR as well. This will more robustly prevent an errant P2P from possibly upsetting the platform. A DMABUF FD is a preferred handle for MMIO compared to using something like a pgmap because: - VFIO is supported, including its P2P feature, on archs that don't support pgmap - PCI devices have all sorts of BAR sizes, including ones smaller than a section so a pgmap cannot always be created - It is undesirable to waste a lot of memory for struct pages, especially for a case like a GPU with ~100GB of BAR size - We want a synchronous revoke semantic to support FLR with light hardware requirements Use the P2P subsystem to help generate the DMA mapping. This is a significant upgrade over the abuse of dma_map_resource() that has historically been used by DMABUF exporters. Experience with an OOT version of this patch shows that real systems do need this. This approach deals with all the P2P scenarios: - Non-zero PCI bus_offset - ACS flags routing traffic to the IOMMU - ACS flags that bypass the IOMMU - though vfio noiommu is required to hit this. There will be further work to formalize the revoke semantic in DMABUF. For now this acts like a move_notify dynamic exporter where importer fault handling will get a failure when they attempt to map. This means that only fully restartable fault capable importers can import the VFIO DMABUFs. A future revoke semantic should open this up to more HW as the HW only needs to invalidate, not handle restartable faults. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Ankit Agrawal <ankita@nvidia.com> Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-10-d7f71607f371@nvidia.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-20	devlink: support default values for param-get and param-set	Daniel Zahka
	Support querying and resetting to default param values. Introduce two new devlink netlink attrs: DEVLINK_ATTR_PARAM_VALUE_DEFAULT and DEVLINK_ATTR_PARAM_RESET_DEFAULT. The former is used to contain an optional parameter value inside of the param_value nested attribute. The latter is used in param-set requests from userspace to indicate that the driver should reset the param to its default value. To implement this, two new functions are added to the devlink driver api: devlink_param::get_default() and devlink_param::reset_default(). These callbacks allow drivers to implement default param actions for runtime and permanent cmodes. For driverinit params, the core latches the last value set by a driver via devl_param_driverinit_value_set(), and uses that as the default value for a param. Because default parameter values are optional, it would be impossible to discern whether or not a param of type bool has default value of false or not provided if the default value is encoded using a netlink flag type. For this reason, when a DEVLINK_PARAM_TYPE_BOOL has an associated default value, the default value is encoded using a u8 type. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251119025038.651131-4-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	net: ethtool: Add support for 1600Gbps speed	Yael Chemla
	Add support for 1600Gbps link modes based on 200Gbps per lane [1]. This includes the adopted IEEE 802.3dj copper and optical PMDs that use 200G/lane signaling [2]. Add the following PMD types: - KR8 (backplane) - CR8 (copper cable) - DR8 (SMF 500m) - DR8-2 (SMF 2km) These modes are defined in the 802.3dj specifications. References: [1] https://www.ieee802.org/3/dj/public/23_03/opsasnick_3dj_01a_2303.pdf [2] https://www.ieee802.org/3/dj/projdoc/objectives_P802d3dj_240314.pdf Signed-off-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/1763585297-1243980-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-21	Merge tag 'v6.18-rc6' into drm-next	Dave Airlie
	Linux 6.18-rc6 Backmerge in order to merge msm next Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-11-20	Merge tag 'platform-drivers-x86-v6.18-4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "This one has lots of new HW entries which adds to the size in diffstat but the individual changes are simple. Fixes - acer-wmi: Ignore backlight event - alienware-wmi-wmax: Fix quirk match table order & drop redundant entries - amd/pmc: - Add Xbox Ally to spurious 8042 quirk list - Quirk list Lenovo Legion Go 2 NVMe resume - msi-wmi-platform: - Correct GUID to uppercase - GUID is uncleverly copy-pasted from an example so add a DMI whitelist - intel/speed_select_if: PCIBIOS_* return code conversion - intel-uncore-freq & ISST: Fix kernel doc warnings New HW support - alienware-wmi-wmax: - Alienware 16 Aurora support - Alienware M support - Alienware X support - Dell G support - amd/pmc: - ROG Xbox Ally (non-X) support - huaway-wmi: HONOR MagicBoox X16/X14 PrintScreen & YOYO keys - hp-wmi: - Omen 16-wf1xxx fan support - Omen MAX 16-ah0xx fan + thermal profile support - Victus 16-r0 and 16-s0 fan + thermal profile support - intel/hid: Intel Nova Lake support - intel-uncore-freq: - Intel Panther Lake support - Intel Wildcat Lake support - Intel Nova Lake support" * tag 'platform-drivers-x86-v6.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (21 commits) platform/x86: intel-uncore-freq: fix all header kernel-doc warnings platform/x86: acer-wmi: Ignore backlight event platform/x86/intel/speed_select_if: Convert PCIBIOS_* return codes to errnos platform/x86/intel/hid: Add Nova Lake support platform/x86: alienware-wmi-wmax: Add AWCC support to Alienware 16 Aurora platform/x86: hp-wmi: Add Omen MAX 16-ah0xx fan support and thermal profile platform/x86: msi-wmi-platform: Fix typo in WMI GUID platform/x86: msi-wmi-platform: Only load on MSI devices platform/x86/amd: pmc: Add Lenovo Legion Go 2 to pmc quirk list platform/x86/amd/pmc: Add spurious_8042 to Xbox Ally platform/x86/amd/pmc: Add support for Van Gogh SoC platform/x86: alienware-wmi-wmax: Add support for the whole "G" family platform/x86: alienware-wmi-wmax: Add support for the whole "X" family platform/x86: alienware-wmi-wmax: Add support for the whole "M" family platform/x86: alienware-wmi-wmax: Drop redundant DMI entries platform/x86: alienware-wmi-wmax: Fix "Alienware m16 R1 AMD" quirk order platform/x86: ISST: isst_if.h: fix all kernel-doc warnings platform/x86: intel-uncore-freq: Add additional client processors platform/x86: hp-wmi: Add Omen 16-wf1xxx fan support platform/x86: huawei-wmi: add keys for HONOR models ...
2025-11-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.18-rc7). No conflicts, adjacent changes: tools/testing/selftests/net/af_unix/Makefile e1bb28bf13f4 ("selftest: af_unix: Add test for SO_PEEK_OFF.") 45a1cd8346ca ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19	Merge tag 'soc-fixes-6.18-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "These are mainly devicetree fixes for the arm platforms from Rockchips NXP, ASpeed and Broadcom, addressing issues with accidental overclocking, pinctrl, network and dtc warnings. There are additional fixes for regressions with the i.MX reset and memory controller drivers as well as the Tegra memory controller driver. Minor updates to the MAINTAINERS file, tee documentation and defconfigs bring those up to date with recent changes elsewhere" * tag 'soc-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits) MAINTAINERS: sync omap devicetree maintainers with omap platform MAINTAINERS: Update Krzysztof Kozlowski's email arm64: dts: rockchip: fix PCIe 3.3V regulator voltage on orangepi-5 arm64: dts: rockchip: disable HS400 on RK3588 Tiger arm64: dts: rockchip: drop reset from rk3576 i2c9 node tee: <uapi/linux/tee.h: fix all kernel-doc issues arm64: dts: rockchip: Fix USB power enable pin for BTT CB2 and Pi2 arm64: dts: broadcom: bcm2712: rpi-5: Add ethernet0 alias arm64: dts: broadcom: Assign clock rates in eth node for RPi5 reset: imx8mp-audiomix: Fix bad mask values ARM: dts: BCM53573: Fix address of Luxul XAP-1440's Ethernet PHY arm64: defconfig: Fix V3D deferred probe timeout arm64: dts: rockchip: Fix vccio4-supply on rk3566-pinetab2 arm64: dts: rockchip: include rk3399-base instead of rk3399 in rk3399-op1 arm64: dts: imx8mp-kontron: Fix USB OTG role switching arm64: dts: imx95: Fix MSI mapping for PCIe endpoint nodes arm64: dts: imx8-ss-img: Avoid gpio0_mipi_csi GPIOs being deferred arm: imx_v6_v7_defconfig: enable ext4 directly memory: tegra210: Fix incorrect client ids arm64: dts: rockchip: Fix indentation on rk3399 haikou demo dtso ...
2025-11-17	Input: rename INPUT_PROP_HAPTIC_TOUCHPAD to INPUT_PROP_PRESSUREPAD	Peter Hutterer
	And expand it to encompass all pressure pads. Definition: "pressure pad" as used here as includes all touchpads that use physical pressure to convert to click, without physical hinges. Also called haptic touchpads in general parlance, Synaptics calls them ForcePads. Most (all?) pressure pads are currently advertised as INPUT_PROP_BUTTONPAD. The suggestion to identify them as pressure pads by defining the resolution on ABS_MT_PRESSURE has been in the docs since commit 20ccc8dd38a3 ("Documentation: input: define ABS_PRESSURE/ABS_MT_PRESSURE resolution as grams") but few devices provide this information. In userspace it's thus impossible to determine whether a device is a true pressure pad (pressure equals pressure) or a normal clickpad with (pressure equals finger size). Commit 7075ae4ac9db ("Input: add INPUT_PROP_HAPTIC_TOUCHPAD") introduces INPUT_PROP_HAPTIC_TOUCHPAD but restricted it to those touchpads that have support for userspace-controlled effects. Let's expand and rename that definition to include all pressure pad touchpads since those that do support FF effects can be identified by the presence of the FF_HAPTIC bit. This means: - clickpad: INPUT_PROP_BUTTONPAD - pressurepad: INPUT_PROP_BUTTONPAD + INPUT_PROP_PRESSUREPAD - pressurepad with configurable haptics: INPUT_PROP_BUTTONPAD + INPUT_PROP_PRESSUREPAD + FF_HAPTIC Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Benjamin Tissoires <bentiss@kernel.org> Link: https://patch.msgid.link/20251106114534.GA405512@tassie Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-11-17	Merge tag 'vfs-6.18-rc7.fixes' of ↵	Linus Torvalds
	gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix unitialized variable in statmount_string() - Fix hostfs mounting when passing host root during boot - Fix dynamic lookup to fail on cell lookup failure - Fix missing file type when reading bfs inodes from disk - Enforce checking of sb_min_blocksize() calls and update all callers accordingly - Restore write access before closing files opened by open_exec() in binfmt_misc - Always freeze efivarfs during suspend/hibernate cycles - Fix statmount()'s and listmount()'s grab_requested_mnt_ns() helper to actually allow mount namespace file descriptor in addition to mount namespace ids - Fix tmpfs remount when noswap is specified - Switch Landlock to iput_not_last() to remove false-positives from might_sleep() annotations in iput() - Remove dead node_to_mnt_ns() code - Ensure that per-queue kobjects are successfully created * tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: landlock: fix splats from iput() after it started calling might_sleep() fs: add iput_not_last() shmem: fix tmpfs reconfiguration (remount) when noswap is set fs/namespace: correctly handle errors returned by grab_requested_mnt_ns power: always freeze efivarfs binfmt_misc: restore write access before closing files opened by open_exec() block: add __must_check attribute to sb_min_blocksize() virtio-fs: fix incorrect check for fsvq->kobj xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super isofs: check the return value of sb_min_blocksize() in isofs_fill_super exfat: check return value of sb_min_blocksize in exfat_read_boot_sector vfat: fix missing sb_min_blocksize() return value checks mnt: Remove dead code which might prevent from building bfs: Reconstruct file type when loading from disk afs: Fix dynamic lookup to fail on cell lookup failure hostfs: Fix only passing host root in boot stage with new mount fs: Fix uninitialized 'offp' in statmount_string()
2025-11-16	ASoC: Intel: avs: Honor NHLT override when setting up a path	Cezary Rojewski
	In case topology provides NHLT configuration, use it instead of relying on the table in ACPI tree. Only gateway-related modules e.g.: Copier care about the process. For those the order of fetching for hardware configuration becomes: 1) check if NHLT override is set, 2) check if NHLT descriptor override is set, 3) use NHLT from ACPI directly Such approach ensures no conflicts exist between 1) and 2) and that 1) always takes precedence. Co-developed-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://patch.msgid.link/20251115180627.3589520-3-cezary.rojewski@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-16	ASoC: Intel: avs: Allow the topology to carry NHLT data	Cezary Rojewski
	Typically the hardware configuration for I2S and DMIC devices resides in the Non-HDAudio Link Table (NHLT) that is part of the ACPI tree. As the NHLTs existing in the field are not always perfect, workaround mechanisms are provided to patch them. Currently the avs-driver is utilizing the ->blob_fmt override (see topology.h and struct avs_tplg_modcfg_ext) when there is a valid entry within a NHLT to configure the hardware for specific format but its descriptor (header) is invalid. A separate case is when there is no correct hardware configuration at all within the NHLT available in the system. Patching the header won't help and forcing ad-hoc BIOS updates for dated system is not feasible. Allowing the topology to carry the data is the solution of choice as replacing a userspace file that is part of /lib/firmware/intel/ is less invasive than BIOS update and solves the problem. Co-developed-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://patch.msgid.link/20251115180627.3589520-2-cezary.rojewski@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-15	mshv: Extend create partition ioctl to support cpu features	Muminul Islam
	The existing mshv create partition ioctl does not provide a way to specify which cpu features are enabled in the guest. Instead, it attempts to enable all features and those that are not supported are silently disabled by the hypervisor. This was done to reduce unnecessary complexity and is sufficient for many cases. However, new scenarios require fine-grained control over these features. Define a new mshv_create_partition_v2 structure which supports passing the disabled processor and xsave feature bits through to the create partition hypercall directly. Introduce a new flag MSHV_PT_BIT_CPU_AND_XSAVE_FEATURES which enables the new structure. If unset, the original mshv_create_partition struct is used, with the old behavior of enabling all features. Co-developed-by: Jinank Jain <jinankjain@microsoft.com> Signed-off-by: Jinank Jain <jinankjain@microsoft.com> Signed-off-by: Muminul Islam <muislam@microsoft.com> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15	mshv: Allow mappings that overlap in uaddr	Magnus Kulke
	Currently the MSHV driver rejects mappings that would overlap in userspace. Some VMMs require the same memory to be mapped to different parts of the guest's address space, and so working around this restriction is difficult. The hypervisor itself doesn't prohibit mappings that overlap in uaddr, (really in SPA; system physical addresses), so supporting this in the driver doesn't require any extra work: only the checks need to be removed. Since no userspace code until now has been able to overlap regions in userspace, relaxing this constraint can't break any existing code. Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.18-rc5+	Alexei Starovoitov
	Cross-merge BPF and other fixes after downstream PR. Minor conflict in kernel/bpf/helpers.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-14	Merge tag 'tee-fix-for-v6.18' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes TEE kernel-doc fixes for v6.18 * tag 'tee-fix-for-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee: tee: <uapi/linux/tee.h: fix all kernel-doc issues Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-11-14	Merge tag 'io_uring-6.18-20251113' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - Use the actual segments in a request when for bvec based buffers - Fix an odd case where the iovec might get leaked for a read/write request, if it was newly allocated, overflowed the alloc cache, and hit an early error - Minor tweak to the query API added in this release, returning the number of available entries * tag 'io_uring-6.18-20251113' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs io_uring/query: return number of available queries io_uring/rw: ensure allocated iovec gets cleared for early failure
2025-11-14	media: uapi: Add parameters structs to mali-c55-config.h	Daniel Scally
	Add structures describing the ISP parameters to mali-c55-config.h Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Acked-by: Nayden Kanchev <nayden.kanchev@arm.com> Co-developed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: mali-c55: Add image formats for Mali-C55 parameters buffer	Jacopo Mondi
	Add a new V4L2 meta format code for the Mali-C55 parameters. Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Nayden Kanchev <nayden.kanchev@arm.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Add 3a stats buffer for mali-c55	Daniel Scally
	Describe the format of the 3A statistics buffers in the userspace API header for the mali-c55 ISP. Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Nayden Kanchev <nayden.kanchev@arm.com> Co-developed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: Add MALI_C55_3A_STATS meta format	Daniel Scally
	Add a new meta format for the Mali-C55 ISP's 3A Statistics along with a new descriptor entry. Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Nayden Kanchev <nayden.kanchev@arm.com> Co-developed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Add controls for Mali-C55 ISP	Daniel Scally
	Add definitions and documentation for the custom control that will be needed by the Mali-C55 ISP driver. This will be a read only bitmask of the driver's capabilities, informing userspace of which blocks are fitted and which are absent. Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Add 20-bit bayer formats	Daniel Scally
	The Mali-C55 requires input data be in 20-bit format, MSB aligned. Add some new media bus format macros to represent that input format. Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Co-developed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Add MEDIA_BUS_FMT_RGB202020_1X60 format code	Daniel Scally
	The Mali-C55 ISP by ARM requires 20-bits per colour channel input on the bus. Add a new media bus format code to represent it. Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Nayden Kanchev <nayden.kanchev@arm.com> Co-developed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Convert Amlogic C3 to V4L2 extensible params	Jacopo Mondi
	With the introduction of common types for extensible parameters format, convert the c3-isp-config.h header to use the new types. Factor-out the documentation that is now part of the common header and only keep the driver-specific on in place. The conversion to use common types doesn't impact userspace as the new types are either identical to the ones already existing in the C3 ISP uAPI or are 1-to-1 type convertible. Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com> Reviewed-by: Keke Li <keke.li@amlogic.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Convert RkISP1 to V4L2 extensible params	Jacopo Mondi
	With the introduction of common types for extensible parameters format, convert the rkisp1-config.h header to use the new types. Factor out the documentation that is now part of the common header and only keep the driver-specific on in place. The conversion to use common types doesn't impact userspace as the new types are either identical to the ones already existing in the RkISP1 uAPI or are 1-to-1 type convertible. Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Michael Riesch <michael.riesch@collabora.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-14	media: uapi: Introduce V4L2 generic ISP types	Jacopo Mondi
	Introduce v4l2-isp.h in the Linux kernel uAPI. The header includes types for generic ISP configuration parameters and will be extended in the future with support for generic ISP statistics formats. Generic ISP parameters support is provided by introducing two new types that represent an extensible and versioned buffer of ISP configuration parameters. The v4l2_params_buffer represents the container for the ISP configuration data block. The generic type is defined with a 0-sized data member that the ISP driver implementations shall properly size according to their capabilities. The v4l2_params_block_header structure represents the header to be prepend to each ISP configuration block. Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com> Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Michael Riesch <michael.riesch@collabora.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-11-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.18-rc6). No conflicts, adjacent changes in: drivers/net/phy/micrel.c 96a9178a29a6 ("net: phy: micrel: lan8814 fix reset of the QSGMII interface") 61b7ade9ba8c ("net: phy: micrel: Add support for non PTP SKUs for lan8814") and a trivial one in tools/testing/selftests/drivers/net/Makefile. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-13	io_uring/zcrx: share an ifq between rings	David Wei
	Add a way to share an ifq from a src ring that is real (i.e. bound to a HW RX queue) with other rings. This is done by passing a new flag IORING_ZCRX_IFQ_REG_IMPORT in the registration struct io_uring_zcrx_ifq_reg, alongside the fd of an exported zcrx ifq. Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13	io_uring/zcrx: export zcrx via a file	Pavel Begunkov
	Add an option to wrap a zcrx instance into a file and expose it to the user space. Currently, users can't do anything meaningful with the file, but it'll be used in a next patch to import it into another io_uring instance. It's implemented as a new op called ZCRX_CTRL_EXPORT for the IORING_REGISTER_ZCRX_CTRL registration opcode. Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13	io_uring/zcrx: add sync refill queue flushing	Pavel Begunkov
	Add an zcrx interface via IORING_REGISTER_ZCRX_CTRL that forces the kernel to flush / consume entries from the refill queue. Just as with the IORING_REGISTER_ZCRX_REFILL attempt, the motivation is to address cases where the refill queue becomes full, and the user can't return buffers and needs to stash them. It's still a slow path, and the user should size refill queue appropriately, but it should be helpful for handling temporary traffic spikes and other unpredictable conditions. The interface is simpler comparing to ZCRX_REFILL as it doesn't need temporary refill entry arrays and gives natural batching, whereas ZCRX_REFILL requires even more user logic to be somewhat efficient. Also, add a structure for the operation. It's not currently used but can serve for future improvements like limiting the number of buffers to process, etc. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13	io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRL	Pavel Begunkov
	It'll be annoying and take enough of boilerplate code to implement new zcrx features as separate io_uring register opcode. Introduce IORING_REGISTER_ZCRX_CTRL that will multiplex such calls to zcrx. Note, there are no real users of the opcode in this patch. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13	io_uring/query: introduce rings info query	Pavel Begunkov
	Same problem as with zcrx in the previous patch, the user needs to know SQ/CQ header sizes to allocated memory before setup to use it for user provided rings, i.e. IORING_SETUP_NO_MMAP, however that information is only returned after registration, hence the user is guessing kernel implementation details. Return the header size and alignment, which is split with the same motivation, to allow the user to know the real structure size without alignment in case there will be more flexible placement schemes in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13	io_uring/query: introduce zcrx query	Pavel Begunkov
	Add a new query type IO_URING_QUERY_ZCRX returning the user some basic information about the interface, which includes allowed flags for areas and registration and supported IORING_REGISTER_ZCRX_CTRL subcodes. There is also a chicken-egg problem with user provided refill queue memory, where offsets and size information is returned after registration, but to properly allocate memory you need to know it beforehand, which is why the userspace currently has to guess the RQ headers size and severely overestimates it. Return the size information. It's split into "size" and "alignment" fields because for default placement modes the user is interested in the aligned size, however if it gets support for more flexible placement, it'll need to only know the actual header size. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13	Merge branch 'io_uring-6.18' into for-6.19/io_uring	Jens Axboe
	Merge 6.18-rc io_uring fixes, as certain coming changes depend on some of these. * io_uring-6.18: io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs io_uring/query: return number of available queries io_uring/rw: ensure allocated iovec gets cleared for early failure io_uring: fix regbuf vector size truncation io_uring: fix types for region size calulation io_uring/zcrx: remove sync refill uapi io_uring: fix buffer auto-commit for multishot uring_cmd io_uring: correct __must_hold annotation in io_install_fixed_file io_uring zcrx: add MAINTAINERS entry io_uring: Fix code indentation error io_uring/sqpoll: be smarter on when to update the stime usage io_uring/sqpoll: switch away from getrusage() for CPU accounting io_uring: fix incorrect unlikely() usage in io_waitid_prep() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-12	fs/namespace: correctly handle errors returned by grab_requested_mnt_ns	Andrei Vagin
	grab_requested_mnt_ns was changed to return error codes on failure, but its callers were not updated to check for error pointers, still checking only for a NULL return value. This commit updates the callers to use IS_ERR() or IS_ERR_OR_NULL() and PTR_ERR() to correctly check for and propagate errors. This also makes sure that the logic actually works and mount namespace file descriptors can be used to refere to mounts. Christian Brauner <brauner@kernel.org> says: Rework the patch to be more ergonomic and in line with our overall error handling patterns. Fixes: 7b9d14af8777 ("fs: allow mount namespace fd") Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrei Vagin <avagin@google.com> Link: https://patch.msgid.link/20251111062815.2546189-1-avagin@google.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12	KVM: arm64: VM exit to userspace to handle SEA	Jiaqi Yan
	When APEI fails to handle a stage-2 synchronous external abort (SEA), today KVM injects an asynchronous SError to the VCPU then resumes it, which usually results in unpleasant guest kernel panic. One major situation of guest SEA is when vCPU consumes recoverable uncorrected memory error (UER). Although SError and guest kernel panic effectively stops the propagation of corrupted memory, guest may re-use the corrupted memory if auto-rebooted; in worse case, guest boot may run into poisoned memory. So there is room to recover from an UER in a more graceful manner. Alternatively KVM can redirect the synchronous SEA event to VMM to - Reduce blast radius if possible. VMM can inject a SEA to VCPU via KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison consumption or fault is not from guest kernel, blast radius can be limited to the triggering thread in guest userspace, so VM can keep running. - Allow VMM to protect from future memory poison consumption by unmapping the page from stage-2, or to interrupt guest of the poisoned page so guest kernel can unmap it from stage-1 page table. - Allow VMM to track SEA events that VM customers care about, to restart VM when certain number of distinct poison events have happened, to provide observability to customers in log management UI. Introduce an userspace-visible feature to enable VMM handle SEA: - KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior when host APEI fails to claim a SEA, userspace can opt in this new capability to let KVM exit to userspace during SEA if it is not owned by host. - KVM_EXIT_ARM_SEA. A new exit reason is introduced for this. KVM fills kvm_run.arm_sea with as much as possible information about the SEA, enabling VMM to emulate SEA to guest by itself. - Sanitized ESR_EL2. The general rule is to keep only the bits useful for userspace and relevant to guest memory. - Flags indicating if faulting guest physical address is valid. - Faulting guest physical and virtual addresses if valid. Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> Co-developed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://msgid.link/20251013185903.1372553-2-jiaqiyan@google.com Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-12	vfs: expose delegation support to userland	Jeff Layton
	Now that support for recallable directory delegations is available, expose this functionality to userland with new F_SETDELEG and F_GETDELEG commands for fcntl(). Note that this also allows userland to request a FL_DELEG type lease on files too. Userland applications that do will get signalled when there are metadata changes in addition to just data changes (which is a limitation of FL_LEASE leases). These commands accept a new "struct delegation" argument that contains a flags field for future expansion. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-17-52f3feebb2f2@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11	devlink: Introduce switchdev_inactive eswitch mode	Saeed Mahameed
	Adds DEVLINK_ESWITCH_MODE_SWITCHDEV_INACTIVE attribute to UAPI and documentation. Before having traffic flow through an eswitch, a user may want to have the ability to block traffic towards the FDB until FDB is fully programmed and the user is ready to send traffic to it. For example: when two eswitches are present for vports in a multi-PF setup, one eswitch may take over the traffic from the other when the user chooses. Before this take over, a user may want to first program the inactive eswitch and then once ready redirect traffic to this new eswitch. switchdev modes transition semantics: legacy->switchdev_inactive: Create switchdev mode normally, traffic not allowed to flow yet. switchdev_inactive->switchdev: Enable traffic to flow. switchdev->switchdev_inactive: Block traffic on the FDB, FDB and representros state and content is preserved. When eswitch is configured to this mode, traffic is ignored/dropped on this eswitch FDB, while current configuration is kept, e.g FDB rules and netdev representros are kept available, FDB programming is allowed. Example: # start inactive switchdev devlink dev eswitch set pci/0000:08:00.1 mode switchdev_inactive # setup TC rules, representors etc .. # activate devlink dev eswitch set pci/0000:08:00.1 mode switchdev Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20251108070404.1551708-2-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11	Merge branch 'kbuild-6.19.fms.extension'	Christian Brauner
	Bring in the shared branch with the kbuild tree to enable '-fms-extensions' for 6.19. Further namespace cleanup work requires this extension. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11	md: allow configuring logical block size	Li Nan
	Previously, raid array used the maximum logical block size (LBS) of all member disks. Adding a larger LBS disk at runtime could unexpectedly increase RAID's LBS, risking corruption of existing partitions. This can be reproduced by: ``` # LBS of sd[de] is 512 bytes, sdf is 4096 bytes. mdadm -CRq /dev/md0 -l1 -n3 /dev/sd[de] missing --assume-clean # LBS is 512 cat /sys/block/md0/queue/logical_block_size # create partition md0p1 parted -s /dev/md0 mklabel gpt mkpart primary 1MiB 100% lsblk \| grep md0p1 # LBS becomes 4096 after adding sdf mdadm --add -q /dev/md0 /dev/sdf cat /sys/block/md0/queue/logical_block_size # partition lost partprobe /dev/md0 lsblk \| grep md0p1 ``` Simply restricting larger-LBS disks is inflexible. In some scenarios, only disks with 512 bytes LBS are available currently, but later, disks with 4KB LBS may be added to the array. Making LBS configurable is the best way to solve this scenario. After this patch, the raid will: - store LBS in disk metadata - add a read-write sysfs 'mdX/logical_block_size' Future mdadm should support setting LBS via metadata field during RAID creation and the new sysfs. Though the kernel allows runtime LBS changes, users should avoid modifying it after creating partitions or filesystems to prevent compatibility issues. Only 1.x metadata supports configurable LBS. 0.90 metadata inits all fields to default values at auto-detect. Supporting 0.90 would require more extensive changes and no such use case has been observed. Note that many RAID paths rely on PAGE_SIZE alignment, including for metadata I/O. A larger LBS than PAGE_SIZE will result in metadata read/write failures. So this config should be prevented. Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-6-linan666@huaweicloud.com Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>