summaryrefslogtreecommitdiff
path: root/drivers/iommu/intel/irq_remapping.c
AgeCommit message (Collapse)Author
2025-04-11iommu/vt-d: Wire up irq_ack() to irq_move_irq() for posted MSIsSean Christopherson
Set the posted MSI irq_chip's irq_ack() hook to irq_move_irq() instead of a dummy/empty callback so that posted MSIs process pending changes to the IRQ's SMP affinity. Failure to honor a pending set-affinity results in userspace being unable to change the effective affinity of the IRQ, as IRQD_SETAFFINITY_PENDING is never cleared and so irq_set_affinity_locked() always defers moving the IRQ. The issue is most easily reproducible by setting /proc/irq/xx/smp_affinity multiple times in quick succession, as only the first update is likely to be handled in process context. Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Cc: Robert Lippert <rlippert@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: Wentao Yang <wentaoyang@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20250321194249.1217961-1-seanjc@google.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-03-20iommu/vt-d: Don't clobber posted vCPU IRTE when host IRQ affinity changesSean Christopherson
Don't overwrite an IRTE that is posting IRQs to a vCPU with a posted MSI entry if the host IRQ affinity happens to change. If/when the IRTE is reverted back to "host mode", it will be reconfigured as a posted MSI or remapped entry as appropriate. Drop the "mode" field, which doesn't differentiate between posted MSIs and posted vCPUs, in favor of a dedicated posted_vcpu flag. Note! The two posted_{msi,vcpu} flags are intentionally not mutually exclusive; an IRTE can transition between posted MSI and posted vCPU. Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250315025135.2365846-3-seanjc@google.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-03-20iommu/vt-d: Put IRTE back into posted MSI mode if vCPU posting is disabledSean Christopherson
Add a helper to take care of reconfiguring an IRTE to deliver IRQs to the host, i.e. not to a vCPU, and use the helper when an IRTE's vCPU affinity is nullified, i.e. when KVM puts an IRTE back into "host" mode. Because posted MSIs use an ephemeral IRTE, using modify_irte() puts the IRTE into full remapped mode, i.e. unintentionally disables posted MSIs on the IRQ. Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250315025135.2365846-2-seanjc@google.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-01-24Merge tag 'iommu-updates-v6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core changes: - PASID support for the blocked_domain ARM-SMMU Updates: - SMMUv2: - Implement per-client prefetcher configuration on Qualcomm SoCs - Support for the Adreno SMMU on Qualcomm's SDM670 SOC - SMMUv3: - Pretty-printing of event records - Drop the ->domain_alloc_paging implementation in favour of domain_alloc_paging_flags(flags==0) - IO-PGTable: - Generalisation of the page-table walker to enable external walkers (e.g. for debugging unexpected page-faults from the GPU) - Minor fix for handling concatenated PGDs at stage-2 with 16KiB pages - Misc: - Clean-up device probing and replace the crufty probe-deferral hack with a more robust implementation of arm_smmu_get_by_fwnode() - Device-tree binding updates for a bunch of Qualcomm platforms Intel VT-d Updates: - Remove domain_alloc_paging() - Remove capability audit code - Draining PRQ in sva unbind path when FPD bit set - Link cache tags of same iommu unit together AMD-Vi Updates: - Use CMPXCHG128 to update DTE - Cleanups of the domain_alloc_paging() path RiscV IOMMU: - Platform MSI support - Shutdown support Rockchip IOMMU: - Add DT bindings for Rockchip RK3576 More smaller fixes and cleanups" * tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (66 commits) iommu: Use str_enable_disable-like helpers iommu/amd: Fully decode all combinations of alloc_paging_flags iommu/amd: Move the nid to pdom_setup_pgtable() iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode iommu/amd: Remove type argument from do_iommu_domain_alloc() and related iommu/amd: Remove dev == NULL checks iommu/amd: Remove domain_alloc() iommu/amd: Remove unused amd_iommu_domain_update() iommu/riscv: Fixup compile warning iommu/arm-smmu-v3: Add missing #include of linux/string_choices.h iommu/arm-smmu-v3: Use str_read_write helper w/ logs iommu/io-pgtable-arm: Add way to debug pgtable walk iommu/io-pgtable-arm: Re-use the pgtable walk for iova_to_phys iommu/io-pgtable-arm: Make pgtable walker more generic iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500 iommu/arm-smmu: Introduce ACTLR custom prefetcher settings iommu/arm-smmu: Add support for PRR bit setup iommu/arm-smmu: Refactor qcom_smmu structure to include single pointer iommu/arm-smmu: Re-enable context caching in smmu reset operation iommu/vt-d: Link cache tags of same iommu unit together ...
2025-01-15x86/apic: Convert to IRQCHIP_MOVE_DEFERREDThomas Gleixner
Instead of marking individual interrupts as safe to be migrated in arbitrary contexts, mark the interrupt chips, which require the interrupt to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED flag. This makes more sense because this is a per interrupt chip property and not restricted to individual interrupts. That flips the logic from the historical opt-out to a opt-in model. This is simpler to handle for other architectures, which default to unrestricted affinity setting. It also allows to cleanup the redundant core logic significantly. All interrupt chips, which belong to a top-level domain sitting directly on top of the x86 vector domain are marked accordingly, unless the related setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
2025-01-07iommu/vt-d: Remove iommu cap auditLu Baolu
The capability audit code was introduced by commit <ad3d19029979> "iommu/vt-d: Audit IOMMU Capabilities and add helper functions", aiming to verify the consistency of capabilities across all IOMMUs for supported features. Nowadays, all the kAPIs of the iommu subsystem have evolved to be device oriented, in preparation for supporting heterogeneous IOMMU architectures. There is no longer a need to require capability consistence among IOMMUs for any feature. Remove the iommu cap audit code to make the driver align with the design in the iommu core. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241216071828.22962-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05iommu/vt-d: Use PCI_DEVID() macroJinjie Ruan
The macro PCI_DEVID() can be used instead of compose it manually. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240829021011.4135618-1-ruanjinjie@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-07iommu/vt-d: Cleanup apic_printk()Thomas Gleixner
Use the new apic_pr_verbose() helper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/all/20240802155440.843266805@linutronix.de
2024-07-03iommu/vt-d: Downgrade warning for pre-enabled IRLu Baolu
Emitting a warning is overkill in intel_setup_irq_remapping() since the interrupt remapping is pre-enabled. For example, there's no guarantee that kexec will explicitly disable interrupt remapping before booting a new kernel. As a result, users are seeing warning messages like below when they kexec boot a kernel, though there is nothing wrong: DMAR-IR: IRQ remapping was enabled on dmar18 but we are not in kdump mode DMAR-IR: IRQ remapping was enabled on dmar17 but we are not in kdump mode DMAR-IR: IRQ remapping was enabled on dmar16 but we are not in kdump mode ... ... Downgrade the severity of this message to avoid user confusion. CC: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/linux-iommu/5517f76a-94ad-452c-bae6-34ecc0ec4831@molgen.mpg.de/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240625043912.258036-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20240702130839.108139-5-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-21Merge tag 'pci-v6.10-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Skip E820 checks for MCFG ECAM regions for new (2016+) machines, since there's no requirement to describe them in E820 and some platforms require ECAM to work (Bjorn Helgaas) - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien Le Moal) - Remove last user and pci_enable_device_io() (Heiner Kallweit) - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen) - Skip waiting for devices that have been disconnected while suspended (Ilpo Järvinen) - Clear Secondary Status errors after enumeration since Master Aborts and Unsupported Request errors are an expected part of enumeration (Vidya Sagar) MSI: - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas) Error handling: - Mask Genesys GL975x SD host controller Replay Timer Timeout correctable errors caused by a hardware defect; the errors cause interrupts that prevent system suspend (Kai-Heng Feng) - Fix EDR-related _DSM support, which previously evaluated revision 5 but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan) ASPM: - Simplify link state definitions and mask calculation (Ilpo Järvinen) Power management: - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS apparently doesn't know how to put them back in D0 (Mario Limonciello) CXL: - Support resetting CXL devices; special handling required because CXL Ports mask Secondary Bus Reset by default (Dave Jiang) DOE: - Support DOE Discovery Version 2 (Alexey Kardashevskiy) Endpoint framework: - Set endpoint BAR to be 64-bit if the driver says that's all the device supports, in addition to doing so if the size is >2GB (Niklas Cassel) - Simplify endpoint BAR allocation and setting interfaces (Niklas Cassel) Cadence PCIe controller driver: - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof Kozlowski) Cadence PCIe endpoint driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) Freescale Layerscape PCIe controller driver: - Convert DT binding to YAML (Frank Li) MediaTek MT7621 PCIe controller driver: - Add DT binding missing 'reg' property for child Root Ports (Krzysztof Kozlowski) - Fix theoretical string truncation in PHY name (Sergio Paracuellos) NVIDIA Tegra194 PCIe controller driver: - Return success for endpoint probe instead of falling through to the failure path (Vidya Sagar) Renesas R-Car PCIe controller driver: - Add DT binding missing IOMMU properties (Geert Uytterhoeven) - Add DT binding R-Car V4H compatible for host and endpoint mode (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski) - Set the Subsystem Vendor ID, which was previously zero because it was masked incorrectly (Rick Wertenbroek) Synopsys DesignWare PCIe controller driver: - Restructure DBI register access to accommodate devices where this requires Refclk to be active (Manivannan Sadhasivam) - Remove the deinit() callback, which was only need by the pcie-rcar-gen4, and do it directly in that driver (Manivannan Sadhasivam) - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean up things like eDMA (Manivannan Sadhasivam) - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel to dw_pcie_ep_init() (Manivannan Sadhasivam) - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to reflect the actual functionality (Manivannan Sadhasivam) - Call dw_pcie_ep_init_registers() directly from all the glue drivers, not just those that require active Refclk from the host (Manivannan Sadhasivam) - Remove the "core_init_notifier" flag, which was an obscure way for glue drivers to indicate that they depend on Refclk from the host (Manivannan Sadhasivam) TI J721E PCIe driver: - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli) - Add DT binding J722S SoC support (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Add DT binding missing num-viewport, phys and phy-name properties (Jan Kiszka) Miscellaneous: - Constify and annotate with __ro_after_init (Heiner Kallweit) - Convert DT bindings to YAML (Krzysztof Kozlowski) - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming Zhou)" * tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits) PCI: Do not wait for disconnected devices when resuming x86/pci: Skip early E820 check for ECAM region PCI: Remove unused pci_enable_device_io() ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io() PCI: Update pci_find_capability() stub return types PCI: Remove PCI_IRQ_LEGACY scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios Revert "genirq/msi: Provide constants for PCI/IMS support" Revert "x86/apic/msi: Enable PCI/IMS" Revert "iommu/vt-d: Enable PCI/IMS" Revert "iommu/amd: Enable PCI/IMS" Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support" ...
2024-05-18Merge tag 'iommu-updates-v6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "Core: - IOMMU memory usage observability - This will make the memory used for IO page tables explicitly visible. - Simplify arch_setup_dma_ops() Intel VT-d: - Consolidate domain cache invalidation - Remove private data from page fault message - Allocate DMAR fault interrupts locally - Cleanup and refactoring ARM-SMMUv2: - Support for fault debugging hardware on Qualcomm implementations - Re-land support for the ->domain_alloc_paging() callback ARM-SMMUv3: - Improve handling of MSI allocation failure - Drop support for the "disable_bypass" cmdline option - Major rework of the CD creation code, following on directly from the STE rework merged last time around. - Add unit tests for the new STE/CD manipulation logic AMD-Vi: - Final part of SVA changes with generic IO page fault handling Renesas IPMMU: - Add support for R8A779H0 hardware ... and a couple smaller fixes and updates across the sub-tree" * tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (80 commits) iommu/arm-smmu-v3: Make the kunit into a module arm64: Properly clean up iommu-dma remnants iommu/amd: Enable Guest Translation after reading IOMMU feature register iommu/vt-d: Decouple igfx_off from graphic identity mapping iommu/amd: Fix compilation error iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd() iommu/arm-smmu-v3: Move the CD generation for SVA into a function iommu/arm-smmu-v3: Allocate the CD table entry in advance iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr() iommu/arm-smmu-v3: Consolidate clearing a CD table entry iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry() iommu/arm-smmu-v3: Add an ops indirection to the STE code iommu/arm-smmu-qcom: Don't build debug features as a kernel module iommu/amd: Add SVA domain support iommu: Add ops->domain_alloc_sva() iommu/amd: Initial SVA support for AMD IOMMU iommu/amd: Add support for enable/disable IOPF iommu/amd: Add IO page fault notifier handler ...
2024-05-15Revert "iommu/vt-d: Enable PCI/IMS"Bjorn Helgaas
This reverts commit 810531a1af5393f010d6508b1cb48e6650fc5e8f. IMS (Interrupt Message Store) support appeared in v6.2, but there are no users yet. Remove it for now. We can add it back when a user comes along. Link: https://lore.kernel.org/r/20240410221307.2162676-6-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2024-04-30iommu/vt-d: Enable posted mode for device MSIsJacob Pan
With posted MSI feature enabled on the CPU side, iommu interrupt remapping table entries (IRTEs) for device MSI/x can be allocated, activated, and programed in posted mode. This means that IRTEs are linked with their respective PIDs of the target CPU. Handlers for the posted MSI notification vector will de-multiplex device MSI handlers. CPU notifications are coalesced if interrupts arrive at a high frequency. Posted interrupts are only used for device MSI and not for legacy devices (IO/APIC, HPET). Introduce a new irq_chip for posted MSIs, which has a dummy irq_ack() callback as EOI is performed in the notification handler once. When posted MSI is enabled, MSI domain/chip hierarchy will look like this example: domain: IR-PCI-MSIX-0000:50:00.0-12 hwirq: 0x29 chip: IR-PCI-MSIX-0000:50:00.0 flags: 0x430 IRQCHIP_SKIP_SET_WAKE IRQCHIP_ONESHOT_SAFE parent: domain: INTEL-IR-10-13 hwirq: 0x2d0000 chip: INTEL-IR-POST flags: 0x0 parent: domain: VECTOR hwirq: 0x77 chip: APIC Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240423174114.526704-13-jacob.jun.pan@linux.intel.com
2024-04-15iommu/vt-d: add wrapper functions for page allocationsPasha Tatashin
In order to improve observability and accountability of IOMMU layer, we must account the number of pages that are allocated by functions that are calling directly into buddy allocator. This is achieved by first wrapping the allocation related functions into a separate inline functions in new file: drivers/iommu/iommu-pages.h Convert all page allocation calls under iommu/intel to use these new functions. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-2-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-21x86/apic: Drop apic::delivery_modeAndrew Cooper
This field is set to APIC_DELIVERY_MODE_FIXED in all cases, and is read exactly once. Fold the constant in uv_program_mmr() and drop the field. Searching for the origin of the stale HyperV comment reveals commit a31e58e129f7 ("x86/apic: Switch all APICs to Fixed delivery mode") which notes: As a consequence of this change, the apic::irq_delivery_mode field is now pointless, but this needs to be cleaned up in a separate patch. 6 years is long enough for this technical debt to have survived. [ bp: Fold in https://lore.kernel.org/r/20231121123034.1442059-1-andrew.cooper3@citrix.com ] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20231102-x86-apic-v1-1-bf049a2a0ed6@citrix.com
2023-08-06x86/vector: Rename send_cleanup_vector() to vector_schedule_cleanup()Thomas Gleixner
Rename send_cleanup_vector() to vector_schedule_cleanup() to prepare for replacing the vector cleanup IPI with a timer callback. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20230621171248.6805-2-xin3.li@intel.com
2023-06-05x86,intel_iommu: Replace cmpxchg_double()Peter Zijlstra
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.855976804@infradead.org
2023-04-14Merge branches 'iommu/fixes', 'arm/allwinner', 'arm/exynos', 'arm/mediatek', ↵Joerg Roedel
'arm/omap', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'ppc/pamu', 'unisoc', 'x86/vt-d', 'x86/amd', 'core' and 'platform-remove_new' into next
2023-04-13iommu/vt-d: Do not use GFP_ATOMIC when not neededChristophe JAILLET
There is no need to use GFP_ATOMIC here. GFP_KERNEL is already used for some other memory allocations just a few lines above. Commit e3a981d61d15 ("iommu/vt-d: Convert allocations to GFP_KERNEL") has changed the other memory allocation flags. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/e2a8a1019ffc8a86b4b4ed93def3623f60581274.1675542576.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-31iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()Lu Baolu
The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932 ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It is used to protect DMAR related global data from DMAR hotplug operations. Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as the DMAR global data structures are not touched there. Remove it to avoid below lockdep warning. ====================================================== WARNING: possible circular locking dependency detected 6.3.0-rc2 #468 Not tainted ------------------------------------------------------ swapper/0/1 is trying to acquire lock: ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3}, at: __irq_domain_alloc_irqs+0x3b/0xa0 but task is already holding lock: ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3}, at: intel_iommu_init+0x58e/0x880 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (dmar_global_lock){++++}-{3:3}: lock_acquire+0xd6/0x320 down_read+0x42/0x180 intel_irq_remapping_alloc+0xad/0x750 mp_irqdomain_alloc+0xb8/0x2b0 irq_domain_alloc_irqs_locked+0x12f/0x2d0 __irq_domain_alloc_irqs+0x56/0xa0 alloc_isa_irq_from_domain.isra.7+0xa0/0xe0 mp_map_pin_to_irq+0x1dc/0x330 setup_IO_APIC+0x128/0x210 apic_intr_mode_init+0x67/0x110 x86_late_time_init+0x24/0x40 start_kernel+0x41e/0x7e0 secondary_startup_64_no_verify+0xe0/0xeb -> #0 (&domain->mutex){+.+.}-{3:3}: check_prevs_add+0x160/0xef0 __lock_acquire+0x147d/0x1950 lock_acquire+0xd6/0x320 __mutex_lock+0x9c/0xfc0 __irq_domain_alloc_irqs+0x3b/0xa0 dmar_alloc_hwirq+0x9e/0x120 iommu_pmu_register+0x11d/0x200 intel_iommu_init+0x5de/0x880 pci_iommu_init+0x12/0x40 do_one_initcall+0x65/0x350 kernel_init_freeable+0x3ca/0x610 kernel_init+0x1a/0x140 ret_from_fork+0x29/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(dmar_global_lock); lock(&domain->mutex); lock(dmar_global_lock); lock(&domain->mutex); *** DEADLOCK *** Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking") Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Tested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230314051836.23817-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20230329134721.469447-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-11iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSIJason Gunthorpe
On x86 platforms when the HW can support interrupt remapping the iommu driver creates an irq_domain for the IR hardware and creates a child MSI irq_domain. When the global irq_remapping_enabled is set, the IR MSI domain is assigned to the PCI devices (by intel_irq_remap_add_device(), or amd_iommu_set_pci_msi_domain()) making those devices have the isolated MSI property. Due to how interrupt domains work, setting IRQ_DOMAIN_FLAG_ISOLATED_MSI on the parent IR domain will cause all struct devices attached to it to return true from msi_device_has_isolated_msi(). This replaces the IOMMU_CAP_INTR_REMAP flag as all places using IOMMU_CAP_INTR_REMAP also call msi_device_has_isolated_msi() Set the flag and delete the cap. Link: https://lore.kernel.org/r/7-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-17Merge tag 'x86_mm_for_6.2_v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Dave Hansen: "New Feature: - Randomize the per-cpu entry areas Cleanups: - Have CR3_ADDR_MASK use PHYSICAL_PAGE_MASK instead of open coding it - Move to "native" set_memory_rox() helper - Clean up pmd_get_atomic() and i386-PAE - Remove some unused page table size macros" * tag 'x86_mm_for_6.2_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) x86/mm: Ensure forced page table splitting x86/kasan: Populate shadow for shared chunk of the CPU entry area x86/kasan: Add helpers to align shadow addresses up and down x86/kasan: Rename local CPU_ENTRY_AREA variables to shorten names x86/mm: Populate KASAN shadow for entire per-CPU range of CPU entry area x86/mm: Recompute physical address for every page of per-CPU CEA mapping x86/mm: Rename __change_page_attr_set_clr(.checkalias) x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias() x86/mm: Untangle __change_page_attr_set_clr(.checkalias) x86/mm: Add a few comments x86/mm: Fix CR3_ADDR_MASK x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros mm: Convert __HAVE_ARCH_P..P_GET to the new style mm: Remove pointless barrier() after pmdp_get_lockless() x86/mm/pae: Get rid of set_64bit() x86_64: Remove pointless set_64bit() usage x86/mm/pae: Be consistent with pXXp_get_and_clear() x86/mm/pae: Use WRITE_ONCE() x86/mm/pae: Don't (ab)use atomic64 mm/gup: Fix the lockless PMD access ...
2022-12-15x86_64: Remove pointless set_64bit() usagePeter Zijlstra
The use of set_64bit() in X86_64 only code is pretty pointless, seeing how it's a direct assignment. Remove all this nonsense. [nathanchance: unbreak irte] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114425.168036718%40infradead.org
2022-12-05iommu/vt-d: Enable PCI/IMSThomas Gleixner
PCI/IMS works like PCI/MSI-X in the remapping. Just add the feature flag, but only when on real hardware. Virtualized IOMMUs need additional support, e.g. for PASID. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232327.081482253@linutronix.de
2022-12-05iommu/vt-d: Switch to MSI parent domainsThomas Gleixner
Remove the global PCI/MSI irqdomain implementation and provide the required MSI parent ops so the PCI/MSI code can detect the new parent and setup per device domains. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.151226317@linutronix.de
2022-12-05x86/apic/vector: Provide MSI parent domainThomas Gleixner
Enable MSI parent domain support in the x86 vector domain and fixup the checks in the iommu implementations to check whether device::msi::domain is the default MSI parent domain. That keeps the existing logic to protect e.g. devices behind VMD working. The interrupt remap PCI/MSI code still works because the underlying vector domain still provides the same functionality. None of the other x86 PCI/MSI, e.g. XEN and HyperV, implementations are affected either. They still work the same way both at the low level and the PCI/MSI implementations they provide. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.034672592@linutronix.de
2022-11-17x86/apic: Remove X86_IRQ_ALLOC_CONTIGUOUS_VECTORSThomas Gleixner
Now that the PCI/MSI core code does early checking for multi-MSI support X86_IRQ_ALLOC_CONTIGUOUS_VECTORS is not required anymore. Remove the flag and rely on MSI_FLAG_MULTI_PCI_MSI. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221111122015.865042356@linutronix.de
2022-11-17iommu/vt-d: Remove bogus check for multi MSI-XThomas Gleixner
PCI/Multi-MSI is MSI specific and not supported for MSI-X. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20221111122013.713848846@linutronix.de
2022-09-26iommu/vt-d: Avoid unnecessary global IRTE cache invalidationLu Baolu
Some VT-d hardware implementations invalidate all interrupt remapping hardware translation caches as part of SIRTP flow. The VT-d spec adds a ESIRTPS (Enhanced Set Interrupt Remap Table Pointer Support, section 11.4.2 in VT-d spec) capability bit to indicate this. The spec also states in 11.4.4 that hardware also performs global invalidation on all interrupt remapping caches as part of Interrupt Remapping Disable operation if ESIRTPS capability bit is set. This checks the ESIRTPS capability bit and skip software global cache invalidation if it's set. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220921065741.3572495-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu/vt-d: Move include/linux/intel-iommu.h under iommuLu Baolu
This header file is private to the Intel IOMMU driver. Move it to the driver folder. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220514014322.2927339-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-01-31iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping()Guoqing Jiang
After commit e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated"). For tear down scenario, fn is only freed after fail to allocate ir_domain, though it also should be freed in case dmar_enable_qi returns error. Besides free fn, irq_domain and ir_msi_domain need to be removed as well if intel_setup_irq_remapping fails to enable queued invalidation. Improve the rewinding path by add out_free_ir_domain and out_free_fwnode lables per Baolu's suggestion. Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated") Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Link: https://lore.kernel.org/r/20220119063640.16864-1-guoqing.jiang@linux.dev Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220128031002.2219155-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-05-05Merge tag 'pci-v5.13-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Release OF node when pci_scan_device() fails (Dmitry Baryshkov) - Add pci_disable_parity() (Bjorn Helgaas) - Disable Mellanox Tavor parity reporting (Heiner Kallweit) - Disable N2100 r8169 parity reporting (Heiner Kallweit) - Fix RCiEP device to RCEC association (Qiuxu Zhuo) - Convert sysfs "config", "rom", "reset", "label", "index", "acpi_index" to static attributes to help fix races in device enumeration (Krzysztof Wilczyński) - Convert sysfs "vpd" to static attribute (Heiner Kallweit, Krzysztof Wilczyński) - Use sysfs_emit() in "show" functions (Krzysztof Wilczyński) - Remove unused alloc_pci_root_info() return value (Krzysztof Wilczyński) PCI device hotplug: - Fix acpiphp reference count leak (Feilong Lin) Power management: - Fix acpi_pci_set_power_state() debug message (Rafael J. Wysocki) - Fix runtime PM imbalance (Dinghao Liu) Virtualization: - Increase delay after FLR to work around Intel DC P4510 NVMe erratum (Raphael Norwitz) MSI: - Convert rcar, tegra, xilinx to MSI domains (Marc Zyngier) - For rcar, xilinx, use controller address as MSI doorbell (Marc Zyngier) - Remove unused hv msi_controller struct (Marc Zyngier) - Remove unused PCI core msi_controller support (Marc Zyngier) - Remove struct msi_controller altogether (Marc Zyngier) - Remove unused default_teardown_msi_irqs() (Marc Zyngier) - Let host bridges declare their reliance on MSI domains (Marc Zyngier) - Make pci_host_common_probe() declare its reliance on MSI domains (Marc Zyngier) - Advertise mediatek lack of built-in MSI handling (Thomas Gleixner) - Document ways of ending up with NO_MSI (Marc Zyngier) - Refactor HT advertising of NO_MSI flag (Marc Zyngier) VPD: - Remove obsolete Broadcom NIC VPD length-limiting quirk (Heiner Kallweit) - Remove sysfs VPD size checking dead code (Heiner Kallweit) - Convert VPF sysfs file to static attribute (Heiner Kallweit) - Remove unnecessary pci_set_vpd_size() (Heiner Kallweit) - Tone down "missing VPD" message (Heiner Kallweit) Endpoint framework: - Fix NULL pointer dereference when epc_features not implemented (Shradha Todi) - Add missing destroy_workqueue() in endpoint test (Yang Yingliang) Amazon Annapurna Labs PCIe controller driver: - Fix compile testing without CONFIG_PCI_ECAM (Arnd Bergmann) - Fix "no symbols" warnings when compile testing with CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann) APM X-Gene PCIe controller driver: - Fix cfg resource mapping regression (Dejin Zheng) Broadcom iProc PCIe controller driver: - Return zero for success of iproc_msi_irq_domain_alloc() (Pali Rohár) Broadcom STB PCIe controller driver: - Add reset_control_rearm() stub for !CONFIG_RESET_CONTROLLER (Jim Quinlan) - Fix use of BCM7216 reset controller (Jim Quinlan) - Use reset/rearm for Broadcom STB pulse reset instead of deassert/assert (Jim Quinlan) - Fix brcm_pcie_probe() error return for unsupported revision (Wei Yongjun) Cavium ThunderX PCIe controller driver: - Fix compile testing (Arnd Bergmann) - Fix "no symbols" warnings when compile testing with CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann) Freescale Layerscape PCIe controller driver: - Fix ls_pcie_ep_probe() syntax error (comma for semicolon) (Krzysztof Wilczyński) - Remove layerscape-gen4 dependencies on OF and ARM64, add dependency on ARCH_LAYERSCAPE (Geert Uytterhoeven) HiSilicon HIP PCIe controller driver: - Remove obsolete HiSilicon PCIe DT description (Dongdong Liu) Intel Gateway PCIe controller driver: - Remove unused pcie_app_rd() (Jiapeng Chong) Intel VMD host bridge driver: - Program IRTE with Requester ID of VMD endpoint, not child device (Jon Derrick) - Disable VMD MSI-X remapping when possible so children can use more MSI-X vectors (Jon Derrick) MediaTek PCIe controller driver: - Configure FC and FTS for functions other than 0 (Ryder Lee) - Add YAML schema for MediaTek (Jianjun Wang) - Export pci_pio_to_address() for module use (Jianjun Wang) - Add MediaTek MT8192 PCIe controller driver (Jianjun Wang) - Add MediaTek MT8192 INTx support (Jianjun Wang) - Add MediaTek MT8192 MSI support (Jianjun Wang) - Add MediaTek MT8192 system power management support (Jianjun Wang) - Add missing MODULE_DEVICE_TABLE (Qiheng Lin) Microchip PolarFlare PCIe controller driver: - Make several symbols static (Wei Yongjun) NVIDIA Tegra PCIe controller driver: - Add MCFG quirks for Tegra194 ECAM errata (Vidya Sagar) - Make several symbols const (Rikard Falkeborn) - Fix Kconfig host/endpoint typo (Wesley Sheng) SiFive FU740 PCIe controller driver: - Add pcie_aux clock to prci driver (Greentime Hu) - Use reset-simple in prci driver for PCIe (Greentime Hu) - Add SiFive FU740 PCIe host controller driver and DT binding (Paul Walmsley, Greentime Hu) Synopsys DesignWare PCIe controller driver: - Move MSI Receiver init to dw_pcie_host_init() so it is re-initialized along with the RC in resume (Jisheng Zhang) - Move iATU detection earlier to fix regression (Hou Zhiqiang) TI J721E PCIe driver: - Add DT binding and TI j721e support for refclk to PCIe connector (Kishon Vijay Abraham I) - Add host mode and endpoint mode DT bindings for TI AM64 SoC (Kishon Vijay Abraham I) TI Keystone PCIe controller driver: - Use generic config accessors for TI AM65x (K3) to fix regression (Kishon Vijay Abraham I) Xilinx NWL PCIe controller driver: - Add support for coherent PCIe DMA traffic using CCI (Bharat Kumar Gogada) - Add optional "dma-coherent" DT property (Bharat Kumar Gogada) Miscellaneous: - Fix kernel-doc warnings (Krzysztof Wilczyński) - Remove unused MicroGate SyncLink device IDs (Jiri Slaby) - Remove redundant dev_err() for devm_ioremap_resource() failure (Chen Hui) - Remove redundant initialization (Colin Ian King) - Drop redundant dev_err() for platform_get_irq() errors (Krzysztof Wilczyński)" * tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (98 commits) riscv: dts: Add PCIe support for the SiFive FU740-C000 SoC PCI: fu740: Add SiFive FU740 PCIe host controller driver dt-bindings: PCI: Add SiFive FU740 PCIe host controller MAINTAINERS: Add maintainers for SiFive FU740 PCIe driver clk: sifive: Use reset-simple in prci driver for PCIe driver clk: sifive: Add pcie_aux clock in prci driver for PCIe driver PCI: brcmstb: Use reset/rearm instead of deassert/assert ata: ahci_brcm: Fix use of BCM7216 reset controller reset: add missing empty function reset_control_rearm() PCI: Allow VPD access for QLogic ISP2722 PCI/VPD: Add helper pci_get_func0_dev() PCI/VPD: Remove pci_vpd_find_tag() SRDT handling PCI/VPD: Remove pci_vpd_find_tag() 'offset' argument PCI/VPD: Change pci_vpd_init() return type to void PCI/VPD: Make missing VPD message less alarming PCI/VPD: Remove pci_set_vpd_size() x86/PCI: Remove unused alloc_pci_root_info() return value MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer PCI: mediatek-gen3: Add system PM support PCI: mediatek-gen3: Add MSI support ...
2021-04-15iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()'Christophe JAILLET
If 'intel_cap_audit()' fails, we should return directly, as already done in the surrounding error handling path. Fixes: ad3d19029979 ("iommu/vt-d: Audit IOMMU Capabilities and add helper functions") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/98d531caabe66012b4fffc7813fd4b9470afd517.1618124777.git.christophe.jaillet@wanadoo.fr Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-22iommu/vt-d: Use Real PCI DMA device for IRTEJon Derrick
VMD retransmits child device MSI-X with the VMD endpoint's requester-id. In order to support direct interrupt remapping of VMD child devices, ensure that the IRTE is programmed with the VMD endpoint's requester-id using pci_real_dma_dev(). Link: https://lore.kernel.org/r/20210210161315.316097-2-jonathan.derrick@intel.com Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de>
2021-02-04iommu/vt-d: Audit IOMMU Capabilities and add helper functionsKyung Min Park
Audit IOMMU Capability/Extended Capability and check if the IOMMUs have the consistent value for features. Report out or scale to the lowest supported when IOMMU features have incompatibility among IOMMUs. Report out features when below features are mismatched: - First Level 5 Level Paging Support (FL5LP) - First Level 1 GByte Page Support (FL1GP) - Read Draining (DRD) - Write Draining (DWD) - Page Selective Invalidation (PSI) - Zero Length Read (ZLR) - Caching Mode (CM) - Protected High/Low-Memory Region (PHMR/PLMR) - Required Write-Buffer Flushing (RWBF) - Advanced Fault Logging (AFL) - RID-PASID Support (RPS) - Scalable Mode Page Walk Coherency (SMPWC) - First Level Translation Support (FLTS) - Second Level Translation Support (SLTS) - No Write Flag Support (NWFS) - Second Level Accessed/Dirty Support (SLADS) - Virtual Command Support (VCS) - Scalable Mode Translation Support (SMTS) - Device TLB Invalidation Throttle (DIT) - Page Drain Support (PDS) - Process Address Space ID Support (PASID) - Extended Accessed Flag Support (EAFS) - Supervisor Request Support (SRS) - Execute Request Support (ERS) - Page Request Support (PRS) - Nested Translation Support (NEST) - Snoop Control (SC) - Pass Through (PT) - Device TLB Support (DT) - Queued Invalidation (QI) - Page walk Coherency (C) Set capability to the lowest supported when below features are mismatched: - Maximum Address Mask Value (MAMV) - Number of Fault Recording Registers (NFR) - Second Level Large Page Support (SLLPS) - Fault Recording Offset (FRO) - Maximum Guest Address Width (MGAW) - Supported Adjusted Guest Address Width (SAGAW) - Number of Domains supported (NDOMS) - Pasid Size Supported (PSS) - Maximum Handle Mask Value (MHMV) - IOTLB Register Offset (IRO) Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-05iommu/intel: Fix memleak in intel_irq_remapping_allocDinghao Liu
When irq_domain_get_irq_data() or irqd_cfg() fails at i == 0, data allocated by kzalloc() has not been freed before returning, which leads to memleak. Fixes: b106ee63abcc ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210105051837.32118-1-dinghao.liu@zju.edu.cn Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28iommu/vt-d: Simplify intel_irq_remapping_select()David Woodhouse
Now that the old get_irq_domain() method has gone, consolidate on just the map_XXX_to_iommu() functions. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-31-dwmw2@infradead.org
2020-10-28x86: Kill all traces of irq_remapping_get_irq_domain()David Woodhouse
All users are converted to use the fwspec based parent domain lookup. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-30-dwmw2@infradead.org
2020-10-28iommu/vt-d: Implement select() method on remapping irqdomainDavid Woodhouse
Preparatory for removing irq_remapping_get_irq_domain() Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-26-dwmw2@infradead.org
2020-10-28x86/ioapic: Generate RTE directly from parent irqchip's MSI messageDavid Woodhouse
The I/O-APIC generates an MSI cycle with address/data bits taken from its Redirection Table Entry in some combination which used to make sense, but now is just a bunch of bits which get passed through in some seemingly arbitrary order. Instead of making IRQ remapping drivers directly frob the I/OA-PIC RTE, let them just do their job and generate an MSI message. The bit swizzling to turn that MSI message into the I/O-APIC's RTE is the same in all cases, since it's a function of the I/O-APIC hardware. The IRQ remappers have no real need to get involved with that. The only slight caveat is that the I/OAPIC is interpreting some of those fields too, and it does want the 'vector' field to be unique to make EOI work. The AMD IOMMU happens to put its IRTE index in the bits that the I/O-APIC thinks are the vector field, and accommodates this requirement by reserving the first 32 indices for the I/O-APIC. The Intel IOMMU doesn't actually use the bits that the I/O-APIC thinks are the vector field, so it fills in the 'pin' value there instead. [ tglx: Replaced the unreadably macro maze with the cleaned up RTE/msi_msg bitfields and added commentry to explain the mapping magic ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-22-dwmw2@infradead.org
2020-10-28x86/ioapic: Cleanup IO/APIC route entry structsThomas Gleixner
Having two seperate structs for the I/O-APIC RTE entries (non-remapped and DMAR remapped) requires type casts and makes it hard to map. Combine them in IO_APIC_routing_entry by defining a union of two 64bit bitfields. Use naming which reflects which bits are shared and which bits are actually different for the operating modes. [dwmw2: Fix it up and finish the job, pulling the 32-bit w1,w2 words for register access into the same union and eliminating a few more places where bits were accessed through masks and shifts.] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-21-dwmw2@infradead.org
2020-10-28x86/io_apic: Cleanup trigger/polarity helpersThomas Gleixner
'trigger' and 'polarity' are used throughout the I/O-APIC code for handling the trigger type (edge/level) and the active low/high configuration. While there are defines for initializing these variables and struct members, they are not used consequently and the meaning of 'trigger' and 'polarity' is opaque and confusing at best. Rename them to 'is_level' and 'active_low' and make them boolean in various structs so it's entirely clear what the meaning is. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-20-dwmw2@infradead.org
2020-10-28iommu/intel: Use msi_msg shadow structsThomas Gleixner
Use the bitfields in the x86 shadow struct to compose the MSI message. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-14-dwmw2@infradead.org
2020-10-28x86/apic: Cleanup destination modeThomas Gleixner
apic::irq_dest_mode is actually a boolean, but defined as u32 and named in a way which does not explain what it means. Make it a boolean and rename it to 'dest_mode_logical' Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-9-dwmw2@infradead.org
2020-10-28x86/apic: Cleanup delivery mode definesThomas Gleixner
The enum ioapic_irq_destination_types and the enumerated constants starting with 'dest_' are gross misnomers because they describe the delivery mode. Rename then enum and the constants so they actually make sense. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-6-dwmw2@infradead.org
2020-09-16iommu/vt-d: Remove domain search for PCI/MSI[X]Thomas Gleixner
Now that the domain can be retrieved through device::msi_domain the domain search for PCI_MSI[X] is not longer required. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112334.305699301@linutronix.de
2020-09-16iommm/vt-d: Store irq domain in struct deviceThomas Gleixner
As a first step to make X86 utilize the direct MSI irq domain operations store the irq domain pointer in the device struct when a device is probed. This is done from dmar_pci_bus_add_dev() because it has to work even when DMA remapping is disabled. It only overrides the irqdomain of devices which are handled by a regular PCI/MSI irq domain which protects PCI devices behind special busses like VMD which have their own irq domain. No functional change. It just avoids the redirection through arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq domain alloc/free functions instead of having to look up the irq domain for every single MSI interupt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112333.714566121@linutronix.de
2020-09-16x86/msi: Consolidate MSI allocationThomas Gleixner
Convert the interrupt remap drivers to retrieve the pci device from the msi descriptor and use info::hwirq. This is the first step to prepare x86 for using the generic MSI domain ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de
2020-09-16x86_ioapic_Consolidate_IOAPIC_allocationThomas Gleixner
Move the IOAPIC specific fields into their own struct and reuse the common devid. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de
2020-09-16x86/msi: Consolidate HPET allocationThomas Gleixner
None of the magic HPET fields are required in any way. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de