git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
11 days	iommu/vt-d: Assign devtlb cache tag on ATS enablement	Lu Baolu
	Commit <4f1492efb495> ("iommu/vt-d: Revert ATS timing change to fix boot failure") placed the enabling of ATS in the probe_finalize callback. This occurs after the default domain attachment, which is when the ATS cache tag is assigned. Consequently, the device TLB cache tag is missed when the domain is attached, leading to the device TLB not being invalidated in the iommu_unmap paths. Fix this by assigning the CACHE_TAG_DEVTLB cache tag when ATS is enabled. Fixes: 4f1492efb495 ("iommu/vt-d: Revert ATS timing change to fix boot failure") Cc: stable@vger.kernel.org Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20250625050135.3129955-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20250628100351.3198955-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27	iommu/rockchip: prevent iommus dead loop when two masters share one IOMMU	Simon Xue
	When two masters share an IOMMU, calling ops->of_xlate during the second master's driver init may overwrite iommu->domain set by the first. This causes the check if (iommu->domain == domain) in rk_iommu_attach_device() to fail, resulting in the same iommu->node being added twice to &rk_domain->iommus, which can lead to an infinite loop in subsequent &rk_domain->iommus operations. Cc: <stable@vger.kernel.org> Fixes: 25c2325575cc ("iommu/rockchip: Add missing set_platform_dma_ops callback") Signed-off-by: Simon Xue <xxm@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250623020018.584802-1-xxm@rock-chips.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-13	iommu/tegra: Fix incorrect size calculation	Jason Gunthorpe
	This driver uses a mixture of ways to get the size of a PTE, tegra_smmu_set_pde() did it as sizeof(pd) which became wrong when pd switched to a struct tegra_pd. Switch pd back to a u32 in tegra_smmu_set_pde() so the sizeof(*pd) returns 4. Fixes: 50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/62e7f7fe-6200-4e4f-ad42-d58ad272baa6@tecnico.ulisboa.pt/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-08	treewide, timers: Rename from_timer() to timer_container_of()	Ingo Molnar
	Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-04	Merge tag 'pci-v6.16-changes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Print the actual delay time in pci_bridge_wait_for_secondary_bus() instead of assuming it was 1000ms (Wilfred Mallawa) - Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices', which broke resume from system sleep on AMD platforms and has been fixed by other commits (Lukas Wunner) Resource management: - Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated and unnecessary (Philipp Stanner) - Remove pcim_iounmap_regions() and pcim_request_region_exclusive() and related flags since all uses have been removed (Philipp Stanner) - Rework devres 'request' functions so they are no longer 'hybrid', i.e., their behavior no longer depends on whether pcim_enable_device or pci_enable_device() was used, and remove related code (Philipp Stanner) - Warn (not BUG()) about failure to assign optional resources (Ilpo Järvinen) Error handling: - Log the DPC Error Source ID only when it's actually valid (when ERR_FATAL or ERR_NONFATAL was received from a downstream device) and decode into bus/device/function (Bjorn Helgaas) - Determine AER log level once and save it so all related messages use the same level (Karolina Stolarek) - Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable Errors (Karolina Stolarek) - Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs controls on interval and burst count, to avoid flooding logs and RCU stall warnings (Jon Pan-Doh) Power management: - Increment PM usage counter when probing reset methods so we don't try to read config space of a powered-off device (Alex Williamson) - Set all devices to D0 during enumeration to ensure ACPI opregion is connected via _REG (Mario Limonciello) Power control: - Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match the filename paths. Retain old deprecated symbols for compatibility, except for the pwrctrl slot driver (PCI_PWRCTRL_SLOT) (Johan Hovold) - When unregistering pwrctrl, cancel outstanding rescan work before cleaning up data structures to avoid use-after-free issues (Brian Norris) Bandwidth control: - Simplify link bandwidth controller by replacing the count of Link Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag (Ilpo Järvinen) - Update the Link Speed after retraining, since the Link Speed may have changed (Ilpo Järvinen) PCIe native device hotplug: - Ignore Presence Detect Changed caused by DPC. pciehp already ignores Link Down/Up events caused by DPC, but on slots using in-band presence detect, DPC causes a spurious Presence Detect Changed event (Lukas Wunner) - Ignore Link Down/Up caused by Secondary Bus Reset. On hotplug ports using in-band presence detect, the reset causes a Presence Detect Changed event, which mistakenly caused teardown and re-enumeration of the device. Drivers may need to annotate code that resets their device (Lukas Wunner) Virtualization: - Add an ACS quirk for Loongson Root Ports that don't advertise ACS but don't allow peer-to-peer transactions between Root Ports; the quirk allows each Root Port to be in a separate IOMMU group (Huacai Chen) Endpoint framework: - For fixed-size BARs, retain both the actual size and the possibly larger size allocated to accommodate iATU alignment requirements (Jerome Brunet) - Simplify ctrl/SPAD space allocation and avoid allocating more space than needed (Jerome Brunet) - Correct MSI-X PBA offset calculations for DesignWare and Cadence endpoint controllers (Niklas Cassel) - Align the return value (number of interrupts) encoding for pci_epc_get_msi()/pci_epc_ops::get_msi() and pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel) - Align the nr_irqs parameter encoding for pci_epc_set_msi()/pci_epc_ops::set_msi() and pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel) Common host controller library: - Convert pci-host-common to a library so platforms that don't need native host controller drivers don't need to include these helper functions (Manivannan Sadhasivam) Apple PCIe controller driver: - Extract ECAM bridge creation helper from pci_host_common_probe() to separate driver-specific things like MSI from PCI things (Marc Zyngier) - Dynamically allocate RID-to_SID bitmap to prepare for SoCs with varying capabilities (Marc Zyngier) - Skip ports disabled in DT when setting up ports (Janne Grunau) - Add t6020 compatible string (Alyssa Rosenzweig) - Add T602x PCIe support (Hector Martin) - Directly set/clear INTx mask bits because T602x dropped the accessors that could do this without locking (Marc Zyngier) - Move port PHY registers to their own reg items to accommodate T602x, which moves them around; retain default offsets for existing DTs that lack phy%d entries with the reg offsets (Hector Martin) - Stop polling for core refclk, which doesn't work on T602x and the bootloader has already done anyway (Hector Martin) - Use gpiod_set_value_cansleep() when asserting PERST# in probe because we're allowed to sleep there (Hector Martin) Cadence PCIe controller driver: - Drop a runtime PM 'put' to resolve a runtime atomic count underflow (Hans Zhang) - Make the cadence core buildable as a module (Kishon Vijay Abraham I) - Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by loadable drivers when they are removed (Siddharth Vadapalli) Freescale i.MX6 PCIe controller driver: - Apply link training workaround only on IMX6Q, IMX6SX, IMX6SP (Richard Zhu) - Remove redundant dw_pcie_wait_for_link() from imx_pcie_start_link(); since the DWC core does this, imx6 only needs it when retraining for a faster link speed (Richard Zhu) - Toggle i.MX95 core reset to align with PHY powerup (Richard Zhu) - Set SYS_AUX_PWR_DET to work around i.MX95 ERR051624 erratum: in some cases, the controller can't exit 'L23 Ready' through Beacon or PERST# deassertion (Richard Zhu) - Clear GEN3_ZRXDC_NONCOMPL to work around i.MX95 ERR051586 erratum: controller can't meet 2.5 GT/s ZRX-DC timing when operating at 8 GT/s, causing timeouts in L1 (Richard Zhu) - Wait for i.MX95 PLL lock before enabling controller (Richard Zhu) - Save/restore i.MX95 LUT for suspend/resume (Richard Zhu) Mobiveil PCIe controller driver: - Return bool (not int) for link-up check in mobiveil_pab_ops.link_up() and layerscape-gen4, mobiveil (Hans Zhang) NVIDIA Tegra194 PCIe controller driver: - Create debugfs directory for 'aspm_state_cnt' only when CONFIG_PCIEASPM is enabled, since there are no other entries (Hans Zhang) Qualcomm PCIe controller driver: - Add OF support for parsing DT 'eq-presets-<N>gts' property for lane equalization presets (Krishna Chaitanya Chundru) - Read Maximum Link Width from the Link Capabilities register if DT lacks 'num-lanes' property (Krishna Chaitanya Chundru) - Add Physical Layer 64 GT/s Capability ID and register offsets for 8, 32, and 64 GT/s lane equalization registers (Krishna Chaitanya Chundru) - Add generic dwc support for configuring lane equalization presets (Krishna Chaitanya Chundru) - Add DT and driver support for PCIe on IPQ5018 SoC (Nitheesh Sekar) Renesas R-Car PCIe controller driver: - Describe endpoint BAR 4 as being fixed size (Jerome Brunet) - Document how to obtain R-Car V4H (r8a779g0) controller firmware (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Reorder rockchip_pci_core_rsts because reset_control_bulk_deassert() deasserts in reverse order, to fix a link training regression (Jensen Huang) - Mark RK3399 as being capable of raising INTx interrupts (Niklas Cassel) Rockchip DesignWare PCIe controller driver: - Check only PCIE_LINKUP, not LTSSM status, to determine whether the link is up (Shawn Lin) - Increase N_FTS (used in L0s->L0 transitions) and enable ASPM L0s for Root Complex and Endpoint modes (Shawn Lin) - Hide the broken ATS Capability in rockchip_pcie_ep_init() instead of rockchip_pcie_ep_pre_init() so it stays hidden after PERST# resets non-sticky registers (Shawn Lin) - Call phy_power_off() before phy_exit() in rockchip_pcie_phy_deinit() (Diederik de Haas) Synopsys DesignWare PCIe controller driver: - Set PORT_LOGIC_LINK_WIDTH to one lane to make initial link training more robust; this will not affect the intended link width if all lanes are functional (Wenbin Yao) - Return bool (not int) for link-up check in dw_pcie_ops.link_up() and armada8k, dra7xx, dw-rockchip, exynos, histb, keembay, keystone, kirin, meson, qcom, qcom-ep, rcar_gen4, spear13xx, tegra194, uniphier, visconti (Hans Zhang) - Add debugfs support for exposing DWC device-specific PTM context (Manivannan Sadhasivam) TI J721E PCIe driver: - Make j721e buildable as a loadable and removable module (Siddharth Vadapalli) - Fix j721e host/endpoint dependencies that result in link failures in some configs (Arnd Bergmann) Device tree bindings: - Add qcom DT binding for 'global' interrupt (PCIe controller and link-specific events) for ipq8074, ipq8074-gen3, ipq6018, sa8775p, sc7280, sc8180x sdm845, sm8150, sm8250, sm8350 (Manivannan Sadhasivam) - Add qcom DT binding for 8 MSI SPI interrupts for msm8998, ipq8074, ipq8074-gen3, ipq6018 (Manivannan Sadhasivam) - Add dw rockchip DT binding for rk3576 and rk3562 (Kever Yang) - Correct indentation and style of examples in brcm,stb-pcie, cdns,cdns-pcie-ep, intel,keembay-pcie-ep, intel,keembay-pcie, microchip,pcie-host, rcar-pci-ep, rcar-pci-host, xilinx-versal-cpm (Krzysztof Kozlowski) - Convert Marvell EBU (dove, kirkwood, armada-370, armada-xp) and armada8k from text to schema DT bindings (Rob Herring) - Remove obsolete .txt DT bindings for content that has been moved to schemas (Rob Herring) - Add qcom DT binding for MHI registers in IPQ5332, IPQ6018, IPQ8074 and IPQ9574 (Varadarajan Narayanan) - Convert v3,v360epc-pci from text to DT schema binding (Rob Herring) - Change microchip,pcie-host DT binding to be 'dma-noncoherent' since PolarFire may be configured that way (Conor Dooley) Miscellaneous: - Drop 'pci' suffix from intel_mid_pci.c filename to match similar files (Andy Shevchenko) - All platforms with PCI have an MMU, so add PCI Kconfig dependency on MMU to simplify build testing and avoid inadvertent build regressions (Arnd Bergmann) - Update Krzysztof Wilczyński's email address in MAINTAINERS (Krzysztof Wilczyński) - Update Manivannan Sadhasivam's email address in MAINTAINERS (Manivannan Sadhasivam)" * tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (147 commits) MAINTAINERS: Update Manivannan Sadhasivam email address PCI: j721e: Fix host/endpoint dependencies PCI: j721e: Add support to build as a loadable module PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup PCI: cadence: Add support to build pcie-cadence library as a kernel module MAINTAINERS: Update Krzysztof Wilczyński email address PCI: Remove unnecessary linesplit in __pci_setup_bridge() PCI: WARN (not BUG()) when we fail to assign optional resources PCI: Remove unused pci_printk() PCI: qcom: Replace PERST# sleep time with proper macro PCI: dw-rockchip: Replace PERST# sleep time with proper macro PCI: host-common: Convert to library for host controller drivers PCI/ERR: Remove misleading TODO regarding kernel panic PCI: cadence: Remove duplicate message code definitions PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding PCI: cadence-ep: Correct PBA offset in .set_msix() callback ...
2025-05-31	Revert "iommu: make inclusion of arm/arm-smmu-v3 directory conditional"	Linus Torvalds
	This reverts commit e436576b0231542f6f233279f0972989232575a8. That commit is very broken, and seems to have missed the fact that CONFIG_ARM_SMMU_V3 is not just a yes-or-no thing, but also can be modular. So it caused build errors on arm64 allmodconfig setups: ERROR: modpost: "arm_smmu_make_cdtable_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s2_domain_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s1_cd" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ... (and six more symbols just the same). Link: https://lore.kernel.org/all/CAHk-=wh4qRwm7AQ8sBmQj7qECzgAhj4r73RtCDfmHo5SdcN0Jw@mail.gmail.com/ Cc: Joerg Roedel <joro@8bytes.org> Cc: Rolf Eike Beer <eb@emlix.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-30	Merge tag 'iommu-updates-v6.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core: - Introduction of iommu-pages infrastructure to consolitate page-table allocation code among hardware drivers. This is ground-work for more generalization in the future - Remove IOMMU_DEV_FEAT_SVA and IOMMU_DEV_FEAT_IOPF feature flags - Convert virtio-iommu to domain_alloc_paging() - KConfig cleanups - Some small fixes for possible overflows and race conditions Intel VT-d driver: - Restore WO permissions on second-level paging entries - Use ida to manage domain id - Miscellaneous cleanups AMD-Vi: - Make sure notifiers finish running before module unload - Add support for HTRangeIgnore feature - Allow matching ACPI HID devices without matching UIDs ARM-SMMU: - SMMUv2: - Recognise the compatible string for SAR2130P MDSS in the Qualcomm driver, as this device requires an identity domain - Fix Adreno stall handling so that GPU debugging is more robust and doesn't e.g. result in deadlock - SMMUv3: - Fix ->attach_dev() error reporting for unrecognised domains - IO-pgtable: - Allow clients (notably, drivers that process requests from userspace) to silence warnings when mapping an already-mapped IOVA S390: - Add support for additional table regions Mediatek: - Add support for MT6893 MM IOMMU And some smaller fixes and improvements in various other drivers" * tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (75 commits) iommu/vt-d: Restore context entry setup order for aliased devices iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mm iommu/arm-smmu-qcom: Make set_stall work when the device is on iommu/arm-smmu: Move handing of RESUME to the context fault handler iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500 iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() iommu: Clear the freelist after iommu_put_pages_list() iommu/vt-d: Change dmar_ats_supported() to return boolean iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit() iommu/vt-d: Replace spin_lock with mutex to protect domain ida iommu/vt-d: Use ida to manage domain id iommu/vt-d: Restore WO permissions on second-level paging entries iommu/amd: Allow matching ACPI HID devices without matching UIDs iommu: make inclusion of arm/arm-smmu-v3 directory conditional iommu: make inclusion of riscv directory conditional iommu: make inclusion of amd directory conditional iommu: make inclusion of intel directory conditional iommu: remove duplicate selection of DMAR_TABLE iommu/fsl_pamu: remove trailing space after \n iommu/arm-smmu-qcom: Add SAR2130P MDSS compatible ...
2025-05-27	Merge tag 'dma-mapping-6.16-2025-05-26' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: "New two step DMA mapping API, which is is a first step to a long path to provide alternatives to scatterlist and to remove hacks, abuses and design mistakes related to scatterlists. This new approach optimizes some calls to DMA-IOMMU layer and cache maintenance by batching them, reduces memory usage as it is no need to store mapped DMA addresses to unmap them, and reduces some function call overhead. It is a combination effort of many people, lead and developed by Christoph Hellwig and Leon Romanovsky" * tag 'dma-mapping-6.16-2025-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: docs: core-api: document the IOVA-based API dma-mapping: add a dma_need_unmap helper dma-mapping: Implement link/unlink ranges API iommu/dma: Factor out a iommu_dma_map_swiotlb helper dma-mapping: Provide an interface to allow allocate IOVA iommu: add kernel-doc for iommu_unmap_fast iommu: generalize the batched sync after map interface dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h PCI/P2PDMA: Refactor the p2pdma mapping helpers
2025-05-23	Merge branches 'fixes', 'apple/dart', 'arm/smmu/updates', ↵	Joerg Roedel
	'arm/smmu/bindings', 'fsl/pamu', 'mediatek', 'renesas/ipmmu', 's390', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
2025-05-23	iommu/vt-d: Restore context entry setup order for aliased devices	Lu Baolu
	Commit 2031c469f816 ("iommu/vt-d: Add support for static identity domain") changed the context entry setup during domain attachment from a set-and-check policy to a clear-and-reset approach. This inadvertently introduced a regression affecting PCI aliased devices behind PCIe-to-PCI bridges. Specifically, keyboard and touchpad stopped working on several Apple Macbooks with below messages: kernel: platform pxa2xx-spi.3: Adding to iommu group 20 kernel: input: Apple SPI Keyboard as /devices/pci0000:00/0000:00:1e.3/pxa2xx-spi.3/spi_master/spi2/spi-APP000D:00/input/input0 kernel: DMAR: DRHD: handling fault status reg 3 kernel: DMAR: [DMA Read NO_PASID] Request device [00:1e.3] fault addr 0xffffa000 [fault reason 0x06] PTE Read access is not set kernel: DMAR: DRHD: handling fault status reg 3 kernel: DMAR: [DMA Read NO_PASID] Request device [00:1e.3] fault addr 0xffffa000 [fault reason 0x06] PTE Read access is not set kernel: applespi spi-APP000D:00: Error writing to device: 01 0e 00 00 kernel: DMAR: DRHD: handling fault status reg 3 kernel: DMAR: [DMA Read NO_PASID] Request device [00:1e.3] fault addr 0xffffa000 [fault reason 0x06] PTE Read access is not set kernel: DMAR: DRHD: handling fault status reg 3 kernel: applespi spi-APP000D:00: Error writing to device: 01 0e 00 00 Fix this by restoring the previous context setup order. Fixes: 2031c469f816 ("iommu/vt-d: Add support for static identity domain") Closes: https://lore.kernel.org/all/4dada48a-c5dd-4c30-9c85-5b03b0aa01f0@bfh.ch/ Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20250514060523.2862195-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20250520075849.755012-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-22	iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mm	AngeloGioacchino Del Regno
	Fix the "mediatek.mt6893-iommu-mm" compatible string typo, as the dot was actually meant to be a comma: "mediatek,mt6893-iommu-mm". Fixes: f6a1e89ab6e3 ("iommu/mediatek: Add support for Dimensity 1200 MT6893 MM IOMMU") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250521151548.185910-1-angelogioacchino.delregno@collabora.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-22	iommu: Skip PASID validation for devices without PASID capability	Tushar Dave
	Generally PASID support requires ACS settings that usually create single device groups, but there are some niche cases where we can get multi-device groups and still have working PASID support. The primary issue is that PCI switches are not required to treat PASID tagged TLPs specially so appropriate ACS settings are required to route all TLPs to the host bridge if PASID is going to work properly. pci_enable_pasid() does check that each device that will use PASID has the proper ACS settings to achieve this routing. However, no-PASID devices can be combined with PASID capable devices within the same topology using non-uniform ACS settings. In this case the no-PASID devices may not have strict route to host ACS flags and end up being grouped with the PASID devices. This configuration fails to allow use of the PASID within the iommu core code which wrongly checks if the no-PASID device supports PASID. Fix this by ignoring no-PASID devices during the PASID validation. They will never issue a PASID TLP anyhow so they can be ignored. Fixes: c404f55c26fc ("iommu: Validate the PASID in iommu_attach_device_pasid()") Cc: stable@vger.kernel.org Signed-off-by: Tushar Dave <tdave@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20250520011937.3230557-1-tdave@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-21	iommu/arm-smmu-qcom: Make set_stall work when the device is on	Connor Abbott
	Up until now we have only called the set_stall callback during initialization when the device is off. But we will soon start calling it to temporarily disable stall-on-fault when the device is on, so handle that by checking if the device is on and writing SCTLR. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-3-fce6ee218787@gmail.com [will: Fix "mixed declarations and code" warning from sparse] Signed-off-by: Will Deacon <will@kernel.org>
2025-05-21	iommu/arm-smmu: Move handing of RESUME to the context fault handler	Connor Abbott
	The upper layer fault handler is now expected to handle everything required to retry the transaction or dump state related to it, since we enable threaded IRQs. This means that we can take charge of writing RESUME, making sure that we always write it after writing FSR as recommended by the specification. The iommu handler should write -EAGAIN if a transaction needs to be retried. This avoids tricky cross-tree changes in drm/msm, since it never wants to retry the transaction and it already returns 0 from its fault handler. Therefore it will continue to correctly terminate the transaction without any changes required. devcoredumps from drm/msm will temporarily be broken until it is fixed to collect devcoredumps inside its fault handler, but fixing that first would actually be worse because MMU-500 ignores writes to RESUME unless all fields of FSR (except SS of course) are clear and raises an interrupt when only SS is asserted. Right now, things happen to work most of the time if we collect a devcoredump, because RESUME is written asynchronously in the fault worker after the fault handler clears FSR and finishes, although there will be some spurious faults, but if this is changed before this commit fixes the FSR/RESUME write order then SS will never be cleared, the interrupt will never be cleared, and the whole system will hang every time a fault happens. It will therefore help bisectability if this commit goes first. I've changed the TBU path to also accept -EAGAIN and do the same thing, while keeping the old -EBUSY behavior. Although the old path was broken because you'd get a storm of interrupts due to returning IRQ_NONE that would eventually result in the interrupt being disabled, and I think it was dead code anyway, so it should eventually be deleted. Note that drm/msm never uses TBU so this is untested. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-2-fce6ee218787@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-21	iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500	Connor Abbott
	The recommended flow for stall-on-fault in SMMUv2 is the following: 1. Resolve the fault. 2. Write to FSR to clear the fault bits. 3. Write RESUME to retry or fail the transaction. MMU500 is designed with this sequence in mind. For example, experimentally we have seen on MMU500 that writing RESUME does not clear FSR.SS unless the original fault is cleared in FSR, so 2 must come before 3. FSR.SS is allowed to signal a fault (and does on MMU500) so that if we try to do 2 -> 1 -> 3 (while exiting from the fault handler after 2) we can get duplicate faults without hacks to disable interrupts. However, resolving the fault typically requires lengthy operations that can stall, like bringing in pages from disk. The only current user, drm/msm, dumps GPU state before failing the transaction which indeed can stall. Therefore, from now on we will require implementations that want to use stall-on-fault to also enable threaded IRQs. Do that with the Adreno MMU implementations. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-1-fce6ee218787@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-20	iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()	Rob Clark
	In situations where mapping/unmapping sequence can be controlled by userspace, attempting to map over a region that has not yet been unmapped is an error. But not something that should spam dmesg. Now that there is a quirk, we can also drop the selftest_running flag, and use the quirk instead for selftests. Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20250519175348.11924-6-robdclark@gmail.com [will: Rename quirk to IO_PGTABLE_QUIRK_NO_WARN per Robin's suggestion] Signed-off-by: Will Deacon <will@kernel.org>
2025-05-16	iommu: Clear the freelist after iommu_put_pages_list()	Jason Gunthorpe
	The commit below reworked how iommu_put_pages_list() worked to not do list_del() on every entry. This was done expecting all the callers to already re-init the list so doing a per-item deletion is not efficient. It was missed that fq_ring_free_locked() re-uses its list after calling iommu_put_pages_list() and so the leftover list reaches free'd struct pages and will crash or WARN/BUG/etc. Reinit the list to empty in fq_ring_free_locked() after calling iommu_put_pages_list(). Audit to see if any other callers of iommu_put_pages_list() need the list to be empty: - iommu_dma_free_fq_single() and iommu_dma_free_fq_percpu() immediately frees the memory - iommu_v1_map_pages(), v1_free_pgtable(), domain_exit(), riscv_iommu_map_pages() uses a stack variable which goes out of scope - intel_iommu_tlb_sync() uses a gather in a iotlb_sync() callback, the caller re-inits the gather Fixes: 13f43d7cf3e0 ("iommu/pages: Formalize the freelist API") Reported-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com> Closes: https://lore.kernel.org/r/SJ1PR11MB61292CE72D7BE06B8810021CB997A@SJ1PR11MB6129.namprd11.prod.outlook.com Tested-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-7d4dfa6140f7+11f04-iommu_freelist_init_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/vt-d: Change dmar_ats_supported() to return boolean	Wei Wang
	According to "Function return values and names" in coding-style.rst, the dmar_ats_supported() function should return a boolean instead of an integer. Also, rename "ret" to "supported" to be more straightforward. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20250509140021.4029303-3-wei.w.wang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit()	Wei Wang
	The function dmar_find_matched_satc_unit() contains a duplicate call to pci_physfn(). This call is unnecessary as pci_physfn() has already been invoked by the caller. Removing the redundant call simplifies the code and improves efficiency a bit. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Link: https://lore.kernel.org/r/20250509140021.4029303-2-wei.w.wang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/vt-d: Replace spin_lock with mutex to protect domain ida	Lu Baolu
	The domain ID allocator is currently protected by a spin_lock. However, ida_alloc_range can potentially block if it needs to allocate memory to grow its internal structures. Replace the spin_lock with a mutex which allows sleep on block. Thus, the memory allocation flags can be updated from GFP_ATOMIC to GFP_KERNEL to allow blocking memory allocations if necessary. Introduce a new mutex, did_lock, specifically for protecting the domain ida. The existing spinlock will remain for protecting other intel_iommu fields. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250430021135.2370244-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/vt-d: Use ida to manage domain id	Lu Baolu
	Switch the intel iommu driver to use the ida mechanism for managing domain IDs, replacing the previous fixed-size bitmap. The previous approach allocated a bitmap large enough to cover the maximum number of domain IDs supported by the hardware, regardless of the actual number of domains in use. This led to unnecessary memory consumption, especially on systems supporting a large number of iommu units but only utilizing a small number of domain IDs. The ida allocator dynamically manages the allocation and freeing of integer IDs, only consuming memory for the IDs that are currently in use. This significantly optimizes memory usage compared to the fixed-size bitmap. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250430021135.2370244-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/vt-d: Restore WO permissions on second-level paging entries	Jason Gunthorpe
	VT-D HW can do WO permissions on the second-stage but not the first-stage page table formats. The commit eea53c581688 ("iommu/vt-d: Remove WO permissions on second-level paging entries") wanted to make this uniform for VT-D by disabling the support for WO permissions in the second-stage. This isn't consistent with how other drivers are working. Instead if the underlying HW can support WO, it should. For instance AMD already supports WO on its second stage (v1) format and not its first (v2). If WO support needs to be discoverable it should be done through an iommu_domain capability flag. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v1-c26553717e90+65f-iommu_vtd_ss_wo_jgg@nvidia.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/amd: Allow matching ACPI HID devices without matching UIDs	Mario Limonciello
	A BIOS upgrade has changed the IVRS DTE UID for a device that no longer matches the UID in the SSDT. In this case there is only one ACPI device on the system with that _HID but the _UID mismatch. IVRS: ``` Subtable Type : F0 [Device Entry: ACPI HID Named Device] Device ID : 0060 Data Setting (decoded below) : 40 INITPass : 0 EIntPass : 0 NMIPass : 0 Reserved : 0 System MGMT : 0 LINT0 Pass : 1 LINT1 Pass : 0 ACPI HID : "MSFT0201" ACPI CID : 0000000000000000 UID Format : 02 UID Length : 09 UID : "\_SB.MHSP" ``` SSDT: ``` Device (MHSP) { Name (_ADR, Zero) // _ADR: Address Name (_HID, "MSFT0201") // _HID: Hardware ID Name (_UID, One) // _UID: Unique ID ``` To handle this case; while enumerating ACPI devices in get_acpihid_device_id() count the number of matching ACPI devices with a matching _HID. If there is exactly one _HID match then accept it even if the UID doesn't match. Other operating systems allow this, but the current IVRS spec leaves some ambiguity whether to allow or disallow it. This should be clarified in future revisions of the spec. Output 'Firmware Bug' for this case to encourage it to be solved in the BIOS. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20250512173129.1274275-1-superm1@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu: make inclusion of arm/arm-smmu-v3 directory conditional	Rolf Eike Beer
	Nothing in there is active if CONFIG_ARM_SMMU_V3 is not enabled, so the whole directory can depend on that switch as well. Fixes: e86d1aa8b60f ("iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2434059.NG923GbCHz@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu: make inclusion of riscv directory conditional	Rolf Eike Beer
	Nothing in there is active if CONFIG_RISCV_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: 5c0ebbd3c6c6 ("iommu/riscv: Add RISC-V IOMMU platform device driver") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2235451.Icojqenx9y@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu: make inclusion of amd directory conditional	Rolf Eike Beer
	Nothing in there is active if CONFIG_AMD_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: cbe94c6e1a7d ("iommu/amd: Move Kconfig and Makefile bits down into amd directory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1894970.atdPhlSkOF@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu: make inclusion of intel directory conditional	Rolf Eike Beer
	Nothing in there is active if CONFIG_INTEL_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: ab65ba57e3ac ("iommu/vt-d: Move Kconfig and Makefile bits down into intel directory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/3818749.MHq7AAxBmi@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu: remove duplicate selection of DMAR_TABLE	Rolf Eike Beer
	This is already done in intel/Kconfig. Fixes: 70bad345e622 ("iommu: Fix compilation without CONFIG_IOMMU_INTEL") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2232605.Mh6RI2rZIc@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16	iommu/fsl_pamu: remove trailing space after \n	Colin Ian King
	There is an extraenous space after \n in a pr_debug message. Remove it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250430151853.923614-1-colin.i.king@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-15	Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"	Lukas Wunner
	Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") changed IRQ handling on PCI driver probing. It inadvertently broke resume from system sleep on AMD platforms: https://lore.kernel.org/r/20150926164651.GA3640@pd.tnic/ This was fixed by two independent commits: * 8affb487d4a4 ("x86/PCI: Don't alloc pcibios-irq when MSI is enabled") * cbbc00be2ce3 ("iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices") The breaking change and one of these two fixes were subsequently reverted: * fe25d078874f ("Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"") * 6c777e8799a9 ("Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"") This rendered the second fix unnecessary, so revert it as well. It used the match_driver flag in struct pci_dev, which is internal to the PCI core and not supposed to be touched by arbitrary drivers. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://patch.msgid.link/9a3ddff5cc49512044f963ba0904347bd404094d.1745572340.git.lukas@wunner.de
2025-05-06	iommu/arm-smmu-qcom: Add SAR2130P MDSS compatible	Dmitry Baryshkov
	Add the SAR2130P compatible to clients compatible list, the device require identity domain. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250418-sar2130p-display-v5-9-442c905cb3a4@oss.qualcomm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-06	iommu/arm-smmu-v3: Fix incorrect return in arm_smmu_attach_dev	Qinxin Xia
	After commit 48e7b8e284e5 ("iommu/arm-smmu-v3: Remove arm_smmu_domain_finalise() during attach"), an error code is not returned on the attach path when the smmu does not match with the domain. This causes problems with VFIO because vfio_iommu_type1_attach_group() relies on this check to determine domain compatability. Re-instate the -EINVAL return value when the SMMU doesn't match on the device attach path. Fixes: 48e7b8e284e5 ("iommu/arm-smmu-v3: Remove arm_smmu_domain_finalise() during attach") Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com> Link: https://lore.kernel.org/r/20250422112951.2027969-1-xiaqinxin@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-06	dma-mapping: Implement link/unlink ranges API	Leon Romanovsky
	Introduce new DMA APIs to perform DMA linkage of buffers in layers higher than DMA. In proposed API, the callers will perform the following steps. In map path: if (dma_can_use_iova(...)) dma_iova_alloc() for (page in range) dma_iova_link_next(...) dma_iova_sync(...) else /* Fallback to legacy map pages */ for (all pages) dma_map_page(...) In unmap path: if (dma_can_use_iova(...)) dma_iova_destroy() else for (all pages) dma_unmap_page(...) Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06	iommu/dma: Factor out a iommu_dma_map_swiotlb helper	Christoph Hellwig
	Split the iommu logic from iommu_dma_map_page into a separate helper. This not only keeps the code neatly separated, but will also allow for reuse in another caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06	dma-mapping: Provide an interface to allow allocate IOVA	Leon Romanovsky
	The existing .map_pages() callback provides both allocating of IOVA and linking DMA pages. That combination works great for most of the callers who use it in control paths, but is less effective in fast paths where there may be multiple calls to map_page(). These advanced callers already manage their data in some sort of database and can perform IOVA allocation in advance, leaving range linkage operation to be in fast path. Provide an interface to allocate/deallocate IOVA and next patch link/unlink DMA ranges to that specific IOVA. In the new API a DMA mapping transaction is identified by a struct dma_iova_state, which holds some recomputed information for the transaction which does not change for each page being mapped, so add a check if IOVA can be used for the specific transaction. The API is exported from dma-iommu as it is the only implementation supported, the namespace is clearly different from iommu_* functions which are not allowed to be used. This code layout allows us to save function call per API call used in datapath as well as a lot of boilerplate code. Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06	iommu: add kernel-doc for iommu_unmap_fast	Leon Romanovsky
	Add kernel-doc section for iommu_unmap_fast to document existing limitation of underlying functions which can't split individual ranges. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06	iommu: generalize the batched sync after map interface	Christoph Hellwig
	For the upcoming IOVA-based DMA API we want to batch the ops->iotlb_sync_map() call after mapping multiple IOVAs from dma-iommu without having a scatterlist. Improve the API. Add a wrapper for the map_sync as iommu_sync_map() so that callers don't need to poke into the methods directly. Formalize __iommu_map() into iommu_map_nosync() which requires the caller to call iommu_sync_map() after all maps are completed. Refactor the existing sanity checks from all the different layers into iommu_map_nosync(). Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06	dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h	Christoph Hellwig
	To support the upcoming non-scatterlist mapping helpers, we need to go back to have them called outside of the DMA API. Thus move them out of dma-map-ops.h, which is only for DMA API implementations to pci-p2pdma.h, which is for driver use. Note that the core helper is still not exported as the mapping is expected to be done only by very highlevel subsystem code at least for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06	PCI/P2PDMA: Refactor the p2pdma mapping helpers	Christoph Hellwig
	The current scheme with a single helper to determine the P2P status and map a scatterlist segment force users to always use the map_sg helper to DMA map, which we're trying to get away from because they are very cache inefficient. Refactor the code so that there is a single helper that checks the P2P state for a page, including the result that it is not a P2P page to simplify the callers, and a second one to perform the address translation for a bus mapped P2P transfer that does not depend on the scatterlist structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-02	Merge tag 'iommu-fixes-v6.15-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: "ARM-SMMU fixes: - Fix broken detection of the S2FWB feature - Ensure page-size bitmap is initialised for SVA domains - Fix handling of SMMU client devices with duplicate Stream IDs - Don't fail SMMU probe if Stream IDs are aliased across clients Intel VT-d fixes: - Add quirk for IGFX device - Revert an ATS change to fix a boot failure AMD IOMMU: - Fix potential buffer overflow Core: - Fix for iommu_copy_struct_from_user()" * tag 'iommu-fixes-v6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57) iommu/vt-d: Revert ATS timing change to fix boot failure iommu: Fix two issues in iommu_copy_struct_from_user() iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid iommu/arm-smmu-v3: Fail aliasing StreamIDs more gracefully iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids iommu/arm-smmu-v3: Fix pgsize_bit for sva domains iommu/arm-smmu-v3: Add missing S2FWB feature detection
2025-05-02	iommu/amd: Add support for HTRangeIgnore feature	Sairaj Kodilkar
	AMD IOMMU reserves the address range 0xfd00000000-0xffffffffff for the hypertransport protocol (HT) and has special meaning. Hence devices cannot use this address range for the DMA. However on some AMD platforms this HT range is shifted to the very top of the address space and new feature bit `HTRangeIgnore` is introduced. When this feature bit is on, IOMMU treats the GPA access to the legacy HT range as regular GPA access. Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20250317055020.25214-1-sarunkod@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu/amd: Ensure GA log notifier callbacks finish running before module unload	Sean Christopherson
	Synchronize RCU when unregistering KVM's GA log notifier to ensure all in-flight interrupt handlers complete before KVM-the module is unloaded. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250315031048.2374109-1-seanjc@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu: Protect against overflow in iommu_pgsize()	Jason Gunthorpe
	On a 32 bit system calling: iommu_map(0, 0x40000000) When using the AMD V1 page table type with a domain->pgsize of 0xfffff000 causes iommu_pgsize() to miscalculate a result of: size=0x40000000 count=2 count should be 1. This completely corrupts the mapping process. This is because the final test to adjust the pagesize malfunctions when the addition overflows. Use check_add_overflow() to prevent this. Fixes: b1d99dc5f983 ("iommu: Hook up '->unmap_pages' driver callback") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/0-v1-3ad28fc2e3a3+163327-iommu_overflow_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu: Handle yet another race around registration	Robin Murphy
	Next up on our list of race windows to close is another one during iommu_device_register() - it's now OK again for multiple instances to run their bus_iommu_probe() in parallel, but an iommu_probe_device() can still also race against a running bus_iommu_probe(). As Johan has managed to prove, this has now become a lot more visible on DT platforms wth driver_async_probe where a client driver is attempting to probe in parallel with its IOMMU driver - although commit b46064a18810 ("iommu: Handle race with default domain setup") resolves this from the client driver's point of view, this isn't before of_iommu_configure() has had the chance to attempt to "replay" a probe that the bus walk hasn't even tried yet, and so still cause the out-of-order group allocation behaviour that we're trying to clean up (and now warning about). The most reliable thing to do here is to explicitly keep track of the "iommu_device_register() is still running" state, so we can then special-case the ops lookup for the replay path (based on dev->iommu again) to let that think it's still waiting for the IOMMU driver to appear at all. This still leaves the longstanding theoretical case of iommu_bus_notifier() being triggered during bus_iommu_probe(), but it's not so simple to defer a notifier, and nobody's ever reported that being a visible issue, so let's quietly kick that can down the road for now... Reported-by: Johan Hovold <johan@kernel.org> Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/88d54c1b48fed8279aa47d30f3d75173685bb26a.1745516488.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu: Allow attaching static domains in iommu_attach_device_pasid()	Lu Baolu
	The idxd driver attaches the default domain to a PASID of the device to perform kernel DMA using that PASID. The domain is attached to the device's PASID through iommu_attach_device_pasid(), which checks if the domain->owner matches the iommu_ops retrieved from the device. If they do not match, it returns a failure. if (ops != domain->owner \|\| pasid == IOMMU_NO_PASID) return -EINVAL; The static identity domain implemented by the intel iommu driver doesn't specify the domain owner. Therefore, kernel DMA with PASID doesn't work for the idxd driver if the device translation mode is set to passthrough. Generally the owner field of static domains are not set because they are already part of iommu ops. Add a helper domain_iommu_ops_compatible() that checks if a domain is compatible with the device's iommu ops. This helper explicitly allows the static blocked and identity domains associated with the device's iommu_ops to be considered compatible. Fixes: 2031c469f816 ("iommu/vt-d: Add support for static identity domain") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220031 Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/linux-iommu/20250422191554.GC1213339@ziepe.ca/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20250424034123.2311362-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu/io-pgtable-arm: dynamically allocate selftest device struct	Arnd Bergmann
	In general a 'struct device' is way too large to be put on the kernel stack. Apparently something just caused it to grow a slightly larger, which pushed the arm_lpae_do_selftests() function over the warning limit in some configurations: drivers/iommu/io-pgtable-arm.c:1423:19: error: stack frame size (1032) exceeds limit (1024) in 'arm_lpae_do_selftests' [-Werror,-Wframe-larger-than] 1423 \| static int __init arm_lpae_do_selftests(void) \| ^ Change the function to use a dynamically allocated faux_device instead of the on-stack device structure. Fixes: ca25ec247aad ("iommu/io-pgtable-arm: Remove iommu_dev==NULL special case") Link: https://lore.kernel.org/all/ab75a444-22a1-47f5-b3c0-253660395b5a@arm.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250423164826.2931382-1-arnd@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu: ipmmu-vmsa: avoid Wformat-security warning	Arnd Bergmann
	iommu_device_sysfs_add() requires a constant format string, otherwise a W=1 build produces a warning: drivers/iommu/ipmmu-vmsa.c:1093:62: error: format string is not a string literal (potentially insecure) [-Werror,-Wformat-security] 1093 \| ret = iommu_device_sysfs_add(&mmu->iommu, &pdev->dev, NULL, dev_name(&pdev->dev)); \| ^~~~~~~~~~~~~~~~~~~~ drivers/iommu/ipmmu-vmsa.c:1093:62: note: treat the string as an argument to avoid this 1093 \| ret = iommu_device_sysfs_add(&mmu->iommu, &pdev->dev, NULL, dev_name(&pdev->dev)); \| ^ \| "%s", This was an old bug but I saw it now because the code was changed as part of commit d9d3cede4167 ("iommu/ipmmu-vmsa: Register in a sensible order"). Fixes: 7af9a5fdb9e0 ("iommu/ipmmu-vmsa: Use iommu_device_sysfs_add()/remove()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250423164006.2661372-1-arnd@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu: Hide ops.domain_alloc behind CONFIG_FSL_PAMU	Jason Gunthorpe
	fsl_pamu is the last user of domain_alloc(), and it is using it to create something weird that doesn't really fit into the iommu subsystem architecture. It is a not a paging domain since it doesn't have any map/unmap ops. It may be some special kind of identity domain. For now just leave it as is. Wrap it's definition in CONFIG_FSL_PAMU to discourage any new drivers from attempting to use it. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/5-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu: Do not call domain_alloc() in iommu_sva_domain_alloc()	Jason Gunthorpe
	No driver implements SVA under domain_alloc() anymore, this is dead code. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/4-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28	iommu/virtio: Move to domain_alloc_paging()	Jason Gunthorpe
	virtio has the complication that it sometimes wants to return a paging domain for IDENTITY which makes this conversion a little different than other drivers. Add a viommu_domain_alloc_paging() that combines viommu_domain_alloc() and viommu_domain_finalise() to always return a fully initialized and finalized paging domain. Use viommu_domain_alloc_identity() to implement the special non-bypass IDENTITY flow by calling viommu_domain_alloc_paging() then viommu_domain_map_identity(). Remove support for deferred finalize and the vdomain->mutex. Remove core support for domain_alloc() IDENTITY as virtio was the last driver using it. Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/3-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>