path: root/drivers/iommu/amd
Age  Commit message  Author
2024-12-18  iommu/amd: Modify clear_dte_entry() to avoid in-place update  (Suravee Suthikulpanit)
Reuse make_clear_dte() and update_dte256(). Also, there is no need to set the TV bit for non-SNP systems when clearing the DTE for the blocked domain, and no longer a need to apply erratum 63 in clear_dte(), since that setting is already stored in struct ivhd_dte_flags and applied in set_dte_entry(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-8-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18  iommu/amd: Introduce helper function get_dte256()  (Suravee Suthikulpanit)
And use it in clone_alias() along with update_dte256(). Also use get_dte256() in dump_dte_entry(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-7-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18  iommu/amd: Modify set_dte_entry() to use 256-bit DTE helpers  (Suravee Suthikulpanit)
Also, set_dte_entry() is used to program several DTE fields (e.g. stage1 table, stage2 table, domain ID, etc.), which are difficult to keep track of in the current implementation. Therefore, separate the logic for clearing the DTE (i.e. make_clear_dte()) and move the setup of the GCR3 Table Root Pointer, GIOV, GV, GLX, and GuestPagingMode into another function, set_dte_gcr3_table(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-6-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18  iommu/amd: Introduce helper function to update 256-bit DTE  (Suravee Suthikulpanit)
The current implementation does not follow the 128-bit write requirement for updating the DTE as specified in the AMD I/O Virtualization Technology (IOMMU) Specification. Therefore, modify struct dev_table_entry to contain a union with a u128 data array, and introduce a helper function update_dte256() that updates the 256-bit DTE using two 128-bit cmpxchg operations on the modified structure, taking the DTE[V, GV] bits into account when programming the DTE to ensure the proper order of DTE programming and flushing. In addition, introduce a per-DTE spinlock, struct dev_data.dte_lock, to provide synchronization when updating the DTE and prevent cmpxchg128 failures. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-5-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
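For illustration, a minimal sketch of the update scheme described above, under simplifying assumptions: the struct layout, function name, lock handling and write ordering shown here are illustrative, not the driver's actual code, and only the simplest V/GV ordering case is shown.

#include <linux/spinlock.h>
#include <linux/types.h>

struct dev_table_entry {
        union {
                u64  data[4];           /* four 64-bit words of the DTE */
                u128 data128[2];        /* same 256 bits as two 128-bit halves */
        };
};

static void update_dte256_sketch(struct dev_table_entry *ptr,
                                 struct dev_table_entry *new,
                                 spinlock_t *dte_lock)
{
        unsigned long flags;
        u128 old;

        spin_lock_irqsave(dte_lock, flags);     /* serialize DTE writers */

        /* Write the upper half first, then the half holding the V/GV bits. */
        old = ptr->data128[1];
        while (!try_cmpxchg128(&ptr->data128[1], &old, new->data128[1]))
                ;
        old = ptr->data128[0];
        while (!try_cmpxchg128(&ptr->data128[0], &old, new->data128[0]))
                ;

        spin_unlock_irqrestore(dte_lock, flags);
}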
2024-12-18  iommu/amd: Introduce struct ivhd_dte_flags to store persistent DTE flags  (Suravee Suthikulpanit)
During early initialization, the driver parses the IVRS IVHD block to get the list of downstream devices along with their DTE flags (i.e. INITPass, EIntPass, NMIPass, SysMgt, Lint0Pass, Lint1Pass). This information is currently stored in the device DTE and needs to be preserved when clearing and configuring each DTE, which makes it difficult to manage. Introduce struct ivhd_dte_flags to store IVHD DTE settings for a device or range of devices; these are kept on the amd_ivhd_dev_flags_list during initial IVHD parsing. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-4-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
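A rough sketch of what such a per-device flag record could look like; the member names here are illustrative guesses, not the actual struct ivhd_dte_flags layout.

#include <linux/list.h>
#include <linux/types.h>

struct ivhd_dte_flags {
        struct list_head list;          /* linked on amd_ivhd_dev_flags_list */
        u16 segid;                      /* PCI segment of the covered range */
        u16 devid_first;                /* first device ID covered */
        u16 devid_last;                 /* last device ID covered */
        u32 dte_flags;                  /* INITPass/EIntPass/NMIPass/SysMgt/LintxPass bits */
};

static LIST_HEAD(amd_ivhd_dev_flags_list);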
2024-12-18  iommu/amd: Disable AMD IOMMU if CMPXCHG16B feature is not supported  (Suravee Suthikulpanit)
According to the AMD IOMMU spec, IOMMU hardware reads the entire DTE in a single 256-bit transaction. It is recommended to update the DTE using a 128-bit operation followed by an INVALIDATE_DEVTAB_ENTRY command when IV=1b or V=1b before the change. According to the AMD BIOS and Kernel Developer's Guide (BDKG) dating back to the family 10h processors [1], the first generation with an AMD IOMMU, AMD processors always report CPUID Fn0000_0001_ECX[CMPXCHG16B]=1. Therefore, it is safe to assume cmpxchg128 is available on all AMD processors with an IOMMU. In addition, the CMPXCHG16B feature has already been checked separately before enabling the GA, XT, and GAM modes. Consolidate the detection logic, and fail IOMMU initialization if the feature is not supported. [1] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/31116.pdf Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-3-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
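The consolidated check could look roughly like the following x86-specific sketch; the function name and placement in the init path are assumptions.

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/printk.h>
#include <asm/cpufeature.h>

static int __init check_dte_update_support(void)
{
        /* cmpxchg16b is required for the 128-bit DTE update path. */
        if (!boot_cpu_has(X86_FEATURE_CX16)) {
                pr_err("AMD-Vi: CMPXCHG16B not supported, disabling IOMMU\n");
                return -ENODEV;
        }
        return 0;
}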
2024-12-18  iommu/amd: Misc ACPI IVRS debug info clean up  (Suravee Suthikulpanit)
* Remove redundant AMD-Vi prefix. * Print IVHD device entry settings field using hex value. * Print root device of IVHD ACPI device entry using hex value. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-2-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10  iommu/amd: Add lockdep asserts for domain->dev_list  (Jason Gunthorpe)
Add an assertion to all the iteration points that don't obviously have the lock held already. These all take the lock higher in their call chains. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/2-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10  iommu/amd: Put list_add/del(dev_data) back under the domain->lock  (Jason Gunthorpe)
The list domain->dev_list is protected by the domain->lock spinlock. Any iteration, addition or removal must be under the lock. Move the list_del() up into the critical section; pdom_is_sva_capable() and destroy_gcr3_table() do not interact with the list element. Wrap the list_add() in the lock. It would make more sense if this were under the same critical section as adjusting the refcounts earlier, but that requires more complications. Fixes: d6b47dec3684 ("iommu/amd: Reduce domain lock scope in attach device path") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/1-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
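Together with the lockdep assertions from the entry above, the locking rule sketches out roughly as follows; the struct layouts are simplified stand-ins for the driver's real types, and the function names are illustrative.

#include <linux/list.h>
#include <linux/lockdep.h>
#include <linux/spinlock.h>

struct iommu_dev_data {                 /* simplified stand-in */
        struct list_head list;
};

struct protection_domain {              /* simplified stand-in */
        spinlock_t lock;
        struct list_head dev_list;
};

static void pdom_add_dev(struct protection_domain *pdom,
                         struct iommu_dev_data *dev_data)
{
        unsigned long flags;

        spin_lock_irqsave(&pdom->lock, flags);
        list_add(&dev_data->list, &pdom->dev_list);     /* only under the lock */
        spin_unlock_irqrestore(&pdom->lock, flags);
}

static void pdom_walk_devs(struct protection_domain *pdom)
{
        struct iommu_dev_data *dev_data;

        lockdep_assert_held(&pdom->lock);       /* caller already holds it */
        list_for_each_entry(dev_data, &pdom->dev_list, list) {
                /* per-device work while the list cannot change */
        }
}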
2024-11-22  iommu: Rename ops->domain_alloc_user() to domain_alloc_paging_flags()  (Jason Gunthorpe)
Now that the main domain allocating path is calling this function it doesn't make sense to leave it named _user. Change the name to alloc_paging_flags() to mirror the new iommu_paging_domain_alloc_flags() function. A driver should implement only one of ops->domain_alloc_paging() or ops->domain_alloc_paging_flags(). The former is a simpler interface with less boiler plate that the majority of drivers use. The latter is for drivers with a greater feature set (PASID, multiple page table support, advanced iommufd support, nesting, etc). Additional patches will be needed to achieve this. Link: https://patch.msgid.link/r/2-v1-c252ebdeb57b+329-iommu_paging_flags_jgg@nvidia.com Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-15  Merge branches 'intel/vt-d', 'amd/amd-vi' and 'iommufd/arm-smmuv3-nested' into next  (Joerg Roedel)
2024-11-08  iommu: Make set_dev_pasid op support domain replacement  (Yi Liu)
The iommu core is going to support domain replacement for pasid, it needs to make the set_dev_pasid op support replacing domain and keep the old domain config in the failure case. AMD iommu driver does not support domain replacement for pasid yet, so it would fail the set_dev_pasid op to keep the old config if the input @old is non-NULL. Till now, all the set_dev_pasid callbacks can handle the old parameter and can keep the old config when failed, so update the kdoc of set_dev_pasid op. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20241107122234.7424-14-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-08  iommu: Pass old domain to set_dev_pasid op  (Yi Liu)
To support domain replacement for a PASID, the underlying iommu driver needs to know the old domain so it can clean up the existing attachment. It is much more convenient for the iommu layer to pass down the old domain; otherwise, iommu drivers would need to track domains for PASIDs themselves, duplicating code among the drivers, or rely on group->pasid_array to get the domain, which may not always be the correct one. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20241107122234.7424-2-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
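A sketch of how a driver that does not yet support replacement might handle the new parameter; the callback name is illustrative, and the signature follows the description above.

#include <linux/device.h>
#include <linux/errno.h>
#include <linux/iommu.h>

static int example_set_dev_pasid(struct iommu_domain *domain,
                                 struct device *dev, ioasid_t pasid,
                                 struct iommu_domain *old)
{
        /*
         * No replacement support yet: refuse when @old is set so the
         * existing configuration is left untouched on failure.
         */
        if (old)
                return -EOPNOTSUPP;

        /* ... program the PASID entry for @domain ... */
        return 0;
}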
2024-10-30  iommu/amd: Improve amd_iommu_release_device()  (Vasant Hegde)
The previous patch added ops->release_domain support. The core attaches devices to the release_domain via its attach_dev() callback before calling this function, so devices are already detached from their current domain and attached to the blocked domain. This is mostly a dummy function now; just throw a warning if the device is still attached to a domain. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-13-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: Add ops->release_domain  (Vasant Hegde)
In the release path, remove the device from its existing domain and attach it to the blocked domain, so that all DMA from the device is blocked. Note that blocked_domain will soon support other ops like set_dev_pasid(), while release_domain supports only the attach_dev op; hence a separate 'release_domain' variable is added. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-12-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: Reorder attach device code  (Vasant Hegde)
Ideally the attach device path should take the dev_data lock before making changes to device data, including IOPF enablement. So far dev_data was using a spinlock and was hitting a lock ordering issue when trying to enable IOPF; hence commit 526606b0a199 ("iommu/amd: Fix Invalid wait context issue") moved IOPF enablement outside dev_data->lock. The previous patch converted the dev_data lock to a mutex, so it is now safe to call amd_iommu_iopf_add_device() with dev_data->mutex held. Hence move the PCI device capability enablement (ATS, PRI, PASID) and the IOPF enablement code back inside the lock. Also, in attach_device(), update 'dev_data->domain' at the end so that error handling becomes simple. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-11-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: Convert dev_data lock from spinlock to mutex  (Vasant Hegde)
Currently the attach device path takes dev_data->spinlock, but by design the attach device path can sleep. Also, if the device is PRI capable, it is added to the IOMMU fault handler queue, which takes a mutex; hence PRI enablement is currently done outside the dev_data lock. Convert the dev_data lock from a spinlock to a mutex so that it follows the design and PRI enablement can be done properly. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
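Roughly, the change boils down to the following pattern; the struct layout is a simplified stand-in and the function name is illustrative.

#include <linux/mutex.h>

struct iommu_dev_data {                 /* simplified stand-in */
        struct mutex mutex;             /* was: spinlock_t lock */
        /* ... other per-device state ... */
};

static void attach_dev_example(struct iommu_dev_data *dev_data)
{
        mutex_lock(&dev_data->mutex);   /* may sleep, which is now allowed */
        /* enable ATS/PRI/PASID and add the device to the IOPF queue */
        mutex_unlock(&dev_data->mutex);
}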
2024-10-30  iommu/amd: Rearrange attach device code  (Vasant Hegde)
attach_device() is just holding the lock and calling do_attach(). There is no need to have another function; just move the do_attach() code into attach_device(). Similarly, move the do_detach() code into detach_device(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: Reduce domain lock scope in attach device path  (Vasant Hegde)
Currently the attach device path takes the protection domain lock followed by the dev_data lock. Most of the operations in this function are specific to device data, except pdom_attach_iommu() where it updates the protection domain structure. Hence reduce the scope of the protection domain lock. Note that this changes the locking order; it now takes the device lock before taking the domain lock (group->mutex -> dev_data->lock -> pdom->lock). dev_data->lock is used only in the device attachment path, so changing the order is fine and will not create any issue. Finally, move the NUMA node assignment to pdom_attach_iommu(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: Do not detach devices in domain free path  (Vasant Hegde)
All devices attached to a protection domain must be freed before calling domain free. Hence do not try to free devices in domain free path. Continue to throw warning if pdom->dev_list is not empty so that any potential issues can be fixed. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: Remove unused amd_iommus variable  (Vasant Hegde)
protection_domain structure is updated to use xarray to track the IOMMUs attached to the domain. Now domain flush code is not using amd_iommus. Hence remove this unused variable. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30  iommu/amd: xarray to track protection_domain->iommu list  (Vasant Hegde)
Use xarray to track IOMMU attached to protection domain instead of static array of MAX_IOMMUS. Also add lockdep assertion. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
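A minimal sketch of the xarray usage, assuming illustrative names (iommu_array, pdom_iommu_info) and a simplified protection_domain layout.

#include <linux/lockdep.h>
#include <linux/spinlock.h>
#include <linux/xarray.h>

struct amd_iommu;                       /* opaque here */

struct pdom_iommu_info {
        struct amd_iommu *iommu;
        u32 refcnt;                     /* devices on this IOMMU in the domain */
};

struct protection_domain {              /* simplified stand-in */
        spinlock_t lock;
        struct xarray iommu_array;      /* iommu index -> struct pdom_iommu_info */
};

static int pdom_track_iommu(struct protection_domain *pdom,
                            unsigned long iommu_index,
                            struct pdom_iommu_info *info)
{
        lockdep_assert_held(&pdom->lock);
        return xa_err(xa_store(&pdom->iommu_array, iommu_index, info,
                               GFP_ATOMIC));
}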
2024-10-30  iommu/amd: Remove protection_domain.dev_cnt variable  (Vasant Hegde)
protection_domain->dev_list tracks list of attached devices to domain. We can use list_* functions on dev_list to get device count. Hence remove 'dev_cnt' variable. No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
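With the counter gone, the count can be computed on demand from the list, for example with the stock list helper below (the struct is a simplified stand-in).

#include <linux/list.h>

struct protection_domain {              /* simplified stand-in */
        struct list_head dev_list;
};

static size_t pdom_dev_count(struct protection_domain *pdom)
{
        /* Caller is expected to hold the lock protecting dev_list. */
        return list_count_nodes(&pdom->dev_list);
}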
2024-10-30  iommu/amd: Use ida interface to manage protection domain ID  (Vasant Hegde)
Replace custom domain ID allocator with IDA interface. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
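The IDA-based allocation looks roughly like this; the range bounds, GFP flags and names are illustrative assumptions.

#include <linux/gfp.h>
#include <linux/idr.h>
#include <linux/limits.h>

static DEFINE_IDA(pdom_ids);

static int pdom_id_alloc(void)
{
        /* Domain ID 0 is typically reserved; the upper bound depends on the DTE field width. */
        return ida_alloc_range(&pdom_ids, 1, U16_MAX - 1, GFP_ATOMIC);
}

static void pdom_id_free(int id)
{
        ida_free(&pdom_ids, id);
}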
2024-10-30  iommu/amd/pgtbl_v2: Take protection domain lock before invalidating TLB  (Vasant Hegde)
Commit c7fc12354be0 ("iommu/amd/pgtbl_v2: Invalidate updated page ranges only") missed to take domain lock before calling amd_iommu_domain_flush_pages(). Fix this by taking protection domain lock before calling TLB invalidation function. Fixes: c7fc12354be0 ("iommu/amd/pgtbl_v2: Invalidate updated page ranges only") Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
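The shape of the fix, as a sketch: protection_domain is a simplified stand-in, and amd_iommu_domain_flush_pages() is the flush helper named above, declared here only to make the snippet self-contained.

#include <linux/spinlock.h>
#include <linux/types.h>

struct protection_domain {              /* simplified stand-in */
        spinlock_t lock;
};

void amd_iommu_domain_flush_pages(struct protection_domain *pdom,
                                  u64 address, size_t size);

static void v2_invalidate_range(struct protection_domain *pdom,
                                u64 iova, size_t size)
{
        unsigned long flags;

        spin_lock_irqsave(&pdom->lock, flags);          /* the missing lock */
        amd_iommu_domain_flush_pages(pdom, iova, size);
        spin_unlock_irqrestore(&pdom->lock, flags);
}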
2024-10-30  Merge branch 'core' into amd/amd-vi  (Joerg Roedel)
2024-10-29  iommu/amd: Implement global identity domain  (Vasant Hegde)
Implement a global identity domain. All device groups in the identity domain will share this domain. In the attach device path, based on device capability, it will allocate a per-device domain ID and GCR3 table so that it can support SVA. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241028093810.5901-11-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Enhance amd_iommu_domain_alloc_user()  (Vasant Hegde)
Previous patch enhanced core layer to check device PASID capability and pass right flags to ops->domain_alloc_user(). Enhance amd_iommu_domain_alloc_user() to allocate domain with appropriate page table based on flags parameter. - If flags is empty then allocate domain with default page table type. This will eventually replace ops->domain_alloc(). For UNMANAGED domain, core will call this interface with flags=0. So AMD driver will continue to allocate V1 page table. - If IOMMU_HWPT_ALLOC_PASID flags is passed then allocate domain with v2 page table. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Pass page table type as param to pdom_setup_pgtable()  (Vasant Hegde)
Current code forces the v1 page table for UNMANAGED domains and the global page table type (amd_iommu_pgtable) for the rest of the paging domains. The following patch series adds support for the domain_alloc_paging() ops and also enhances domain_alloc_user() to allocate the page table based on 'flags'. Hence pass the page table type as a parameter to pdom_setup_pgtable() so that the caller can decide the right page table type. Also update dma_max_address() to take the pgtable as a parameter. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Separate page table setup from domain allocation  (Vasant Hegde)
Currently protection_domain_alloc() allocates domain and also sets up page table. Page table setup is required for PAGING domain only. Domain type like SVA doesn't need page table. Hence move page table setup code to separate function. Also SVA domain allocation path does not call pdom_setup_pgtable(). Hence remove IOMMU_DOMAIN_SVA type check. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Move V2 page table support check to early_amd_iommu_init()  (Vasant Hegde)
amd_iommu_pgtable validation has to be done before calling iommu_snp_enable(). It can be done immediately after reading IOMMU features. Hence move this check to early_amd_iommu_init(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Add helper function to check GIOSUP/GTSUP  (Vasant Hegde)
amd_iommu_gt_ppr_supported() only checks for GTSUP. To support PASID with V2 page table we need GIOSUP as well. Hence add new helper function to check GIOSUP/GTSUP. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Fix corruption when mapping large pages from 0  (Jason Gunthorpe)
If a page is mapped starting at 0 that is equal to or larger than what can fit in the current mode (number of table levels), the mapping is corrupted, since the following logic assumes the mode is correct for the page size being requested. There are two issues here. First, the check whether the address fits within the table uses the start address; it should use the last address, to ensure that the last byte of the mapping fits within the current table mode. Second, if the mapping is exactly the size of the full page table, another level has to be added to hold a single IOPTE for the large size. Since both corner cases require a 0 IOVA and don't occur until a page size of 2^48, they are unlikely to ever be hit on a real system. Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-27ab08d646a1+29-amd_0map_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29  iommu/amd: Do not try copy old DTE resume path  (Vasant Hegde)
In the suspend/resume path, there is no need to copy the old DTE (early_enable_iommus()); just reload the IOMMU hardware. This is a side effect of commit 3ac3e5ee5ed5 ("iommu/amd: Copy old trans table from old kernel"), which changed early_enable_iommus() but missed fixing enable_iommus(). The resume path continues to work because 'amd_iommu_pre_enabled' is set to false and copy_device_table() fails, so the IOMMU is simply reloaded; hence this does not need to be backported to the stable tree. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20241016084958.99727-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-15  iommu/amd: Use atomic64_inc_return() in iommu.c  (Uros Bizjak)
Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref) to use optimized implementation and ease register pressure around the primitive for targets that implement optimized variant. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241007084356.47799-1-ubizjak@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
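The substitution in question is simply the following; a generic counter stands in for the driver's actual atomic variable.

#include <linux/atomic.h>

static atomic64_t seq = ATOMIC64_INIT(0);

static s64 next_seq(void)
{
        /* Before: atomic64_add_return(1, &seq); after: */
        return atomic64_inc_return(&seq);
}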
2024-09-13  Merge branches 'fixes', 'arm/smmu', 'intel/vt-d', 'amd/amd-vi' and 'core' into next  (Joerg Roedel)
2024-09-12  iommu/amd: Test for PAGING domains before freeing a domain  (Jason Gunthorpe)
This domain free function can be called for IDENTITY and SVA domains too, and they don't have page tables. For now protect against this by checking the type. Eventually the different types should have their own free functions. Fixes: 485534bfccb2 ("iommu/amd: Remove conditions from domain free paths") Reported-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/0-v1-ad9884ee5f5b+da-amd_iopgtbl_fix_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-12  iommu/amd: Fix argument order in amd_iommu_dev_flush_pasid_all()  (Eliav Bar-ilan)
An incorrect argument order calling amd_iommu_dev_flush_pasid_pages() causes improper flushing of the IOMMU, leaving the old value of GCR3 from a previous process attached to the same PASID. The function has the signature: void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data, ioasid_t pasid, u64 address, size_t size) Correct the argument order. Cc: stable@vger.kernel.org Fixes: 474bf01ed9f0 ("iommu/amd: Add support for device based TLB invalidation") Signed-off-by: Eliav Bar-ilan <eliavb@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/0-v1-fc6bc37d8208+250b-amd_pasid_flush_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
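Given the signature quoted above, a correct call for flushing the full address range of a PASID looks like this sketch; the wrapper name and the use of SIZE_MAX for "everything" are illustrative, not the driver's actual code.

#include <linux/iommu.h>
#include <linux/limits.h>
#include <linux/types.h>

struct iommu_dev_data;                  /* driver-internal, opaque here */

void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
                                     ioasid_t pasid, u64 address, size_t size);

static void flush_pasid_all_sketch(struct iommu_dev_data *dev_data,
                                   ioasid_t pasid)
{
        /* pasid comes second, then address and size, per the declaration. */
        amd_iommu_dev_flush_pasid_pages(dev_data, pasid, 0, SIZE_MAX);
}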
2024-09-10  iommu/amd: Add kernel parameters to limit V1 page-sizes  (Joerg Roedel)
Add two new kernel command line parameters to limit the page-sizes used for v1 page-tables: nohugepages - limits page-sizes to 4KiB; v2_pgsizes_only - limits page-sizes to 4KiB/2MiB/1GiB, the same sizes used with v2 page-tables. This is needed for multiple scenarios. When assigning devices to SEV-SNP guests, the IOMMU page-sizes need to match the sizes in the RMP table, otherwise the device will not be able to access all shared memory. Also, some ATS devices do not work properly with arbitrary IO page-sizes as supported by AMD-Vi, so limiting the sizes used by the driver is a suitable workaround. All in all, these parameters are only workarounds until the IOMMU core and related APIs gain the ability to negotiate page-sizes in a better way. Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240905072240.253313-1-joro@8bytes.org
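Assuming these are accepted as options of the existing amd_iommu= command-line parameter (an assumption, not confirmed by the text above), usage would look like:

        amd_iommu=nohugepages           restrict v1 page tables to 4KiB mappings
        amd_iommu=v2_pgsizes_only       restrict v1 page tables to 4KiB/2MiB/1GiB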
2024-09-04  iommu/amd: Do not set the D bit on AMD v2 table entries  (Jason Gunthorpe)
The manual says that bit 6 is IGN for all Page-Table Base Address pointers, don't set it. Fixes: aaac38f61487 ("iommu/amd: Initial support for AMD IOMMU v2 page table") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/14-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Correct the reported page sizes from the V1 table  (Jason Gunthorpe)
The HW only has 52 bits of physical address support, the supported page sizes should not have bits set beyond this. Further the spec says that the 6th level does not support any "default page size for translation entries" meaning leafs in the 6th level are not allowed too. Rework the definition to use GENMASK to build the range of supported pages from the top of physical to 4k. Nothing ever uses such large pages, so this is a cosmetic/documentation improvement only. Reported-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/13-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
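The GENMASK-based definition sketches out as follows; the macro name is illustrative, and the bounds follow the description above (4KiB up to the 52-bit physical-address limit, with no level-6 leaf).

#include <linux/bits.h>

/*
 * Bit n set => a page of size 2^n bytes is supported. Bits 12..51 cover
 * 4KiB up to 2^51, staying under the 52-bit physical address limit and
 * below the disallowed level-6 leaf size.
 */
#define AMD_IOMMU_V1_PGSIZES_SKETCH     GENMASK_ULL(51, 12)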
2024-09-04  iommu/amd: Remove the confusing dummy iommu_flush_ops tlb ops  (Jason Gunthorpe)
The iommu driver is supposed to provide these ops to its io_pgtable implementation so that it can hook the invalidations and do the right thing. They are called by wrapper functions like io_pgtable_tlb_add_page() etc, which the AMD code never calls. Instead it directly calls the AMD IOMMU invalidation functions by casting to the struct protection_domain. Remove it all. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/12-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Fix typo of , instead of ;  (Jason Gunthorpe)
Generates the same code, but is not the expected C style. Fixes: aaac38f61487 ("iommu/amd: Initial support for AMD IOMMU v2 page table") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/11-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Remove conditions from domain free paths  (Jason Gunthorpe)
Don't use tlb as some flag to indicate if protection_domain_alloc() completed. Have protection_domain_alloc() unwind itself in the normal kernel style and require protection_domain_free() only be called on successful results of protection_domain_alloc(). Also, the amd_iommu_domain_free() op is never called by the core code with a NULL argument, so remove all the NULL tests as well. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Narrow the use of struct protection_domain to invalidation  (Jason Gunthorpe)
The AMD io_pgtable stuff doesn't implement the tlb ops callbacks, instead it invokes the invalidation ops directly on the struct protection_domain. Narrow the use of struct protection_domain to only those few code paths. Make everything else properly use struct amd_io_pgtable through the call chains, which is the correct modular type for an io-pgtable module. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/9-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Store the nid in io_pgtable_cfg instead of the domain  (Jason Gunthorpe)
We already have memory in the union here that is being wasted in AMD's case, use it to store the nid. Putting the nid here further isolates the io_pgtable code from the struct protection_domain. Fixup protection_domain_alloc so that the NID from the device is provided, at this point dev is never NULL for AMD so this will now allocate the first table pointer on the correct NUMA node. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/8-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Remove amd_io_pgtable::pgtbl_cfg  (Jason Gunthorpe)
This struct is already in iop.cfg, we don't need two. AMD is using this API sort of wrong, the cfg is supposed to be passed in and then the allocation function will allocate ops memory and copy the passed config into the new memory. Keep it kind of wrong and pass in the cfg memory that is already part of the pagetable struct. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Rename struct amd_io_pgtable iopt to pgtbl  (Jason Gunthorpe)
There is struct protection_domain iopt and struct amd_io_pgtable iopt. Next patches are going to want to write domain.iopt.iopt.xx which is quite unnatural to read. Give one of them a different name, amd_io_pgtable has fewer references so call it pgtbl, to match pgtbl_cfg, instead. Suggested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Remove the amd_iommu_domain_set_pt_root() and related  (Jason Gunthorpe)
Looks like many refactorings here have left this confused. There is only one storage of the root/mode, it is in the iop struct. increase_address_space() calls amd_iommu_domain_set_pgtable() with values that it already stored in iop a few lines above. amd_iommu_domain_clr_pt_root() is zero'ing memory we are about to free. It used to protect against a double free of root, but that is gone now. Remove amd_iommu_domain_set_pgtable(), amd_iommu_domain_set_pt_root(), amd_iommu_domain_clr_pt_root() as they are all pointless. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04  iommu/amd: Remove amd_iommu_domain_update() from page table freeing  (Jason Gunthorpe)
It is a serious bug if the domain is still mapped to any DTEs when it is freed as we immediately start freeing page table memory, so any remaining HW touch will UAF. If it is not mapped then dev_list is empty and amd_iommu_domain_update() does nothing. Remove it and add a WARN_ON() to catch this class of bug. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>