path: root/tools/testing/selftests/kvm
Age | Commit message | Author
2024-08-07 | KVM: selftests: arm64: Correct feature test for S1PIE in get-reg-list | Mark Brown
The ID register for S1PIE is ID_AA64MMFR3_EL1.S1PIE which is bits 11:8 but get-reg-list uses a shift of 4, checking SCTLRX instead. Use a shift of 8 instead. Fixes: 5f0419a0083b ("KVM: selftests: get-reg-list: add Permission Indirection registers") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20240731-kvm-arm64-fix-s1pie-test-v1-1-a9253f3b7db4@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
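The fix above comes down to which shift is applied before masking a 4-bit ID register field. A minimal sketch in C, assuming a hypothetical helper `id_reg_field4()` and hypothetical macro names (this is not the selftest's actual code); the two shift values are the ones quoted in the commit message:

```c
#include <assert.h>
#include <stdint.h>

/* Shift values as stated in the commit message; the macro names are illustrative. */
#define S1PIE_SHIFT   8	/* ID_AA64MMFR3_EL1.S1PIE occupies bits 11:8 */
#define SCTLRX_SHIFT  4	/* the field a shift of 4 actually reads */

/* Hypothetical helper: extract the 4-bit field that starts at bit `shift`. */
static inline uint64_t id_reg_field4(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);	/* a 4-bit field must fit in 64 bits */
	return (reg >> shift) & 0xf;
}
```

With a register value of `0x100` (only bit 8 set), a shift of 8 reports the feature as present while a shift of 4 reports it as absent, which is the kind of mismatch the commit corrects.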
2024-07-29 | KVM: riscv: selftests: Fix compile error | Yong-Xuan Wang
Fix compile error introduced by commit d27c34a73514 ("KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test"). These 4 lines should end with ";". Fixes: d27c34a73514 ("KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/20240726084931.28924-5-yongxuan.wang@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-07-21 | Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm | Linus Torvalds
Pull MM updates from Andrew Morton:

- In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targeted at -stable kernels.
- Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad.
- More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()"
- Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback"
- Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index".
- In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainability thing.
- Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch".
- The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code".
- Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection".
- Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull.
- In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying.
- Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions".
- In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out.
- In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code.
- In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in via sysfs controls. Dramatic improvements in pagefault latency are realized.
- David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h".
- David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually".
- Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"".
- Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them".
- Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to another CPU but I guess that's a win unless all CPUs are pegged.
- hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes".
- Miaohe Lin's series "Some cleanups for memory-failure" does just that thing.
- Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM.
- DAMON user API centralization and simplification work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function".
- In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields.
- Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()".
- More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks.
- Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin.
- Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying.
- The series "Docs/mm/damon/maintainer-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park.
- A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that.
- David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL".
- Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks".
- SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code.
- Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable.
- Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM.
- Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1"
- Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file.
- The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors, in order to permit userspace to monitor and handle this situation.
- Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicking.
- SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things.
- In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization.
- Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these pages can first be moved aside if they reside in the movable zone or a CMA block.
- Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps".
- In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting.
- Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes.
- In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection.

* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
  mm/mglru: fix ineffective protection calculation
  mm/zswap: fix a white space issue
  mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
  mm/hugetlb: fix possible recursive locking detected warning
  mm/gup: clear the LRU flag of a page before adding to LRU batch
  mm/numa_balancing: teach mpol_to_str about the balancing mode
  mm: memcg1: convert charge move flags to unsigned long long
  alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
  lib: reuse page_ext_data() to obtain codetag_ref
  lib: add missing newline character in the warning message
  mm/mglru: fix overshooting shrinker memory
  mm/mglru: fix div-by-zero in vmpressure_calc_level()
  mm/kmemleak: replace strncpy() with strscpy()
  mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
  mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
  mm: ignore data-race in __swap_writepage
  hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
  mm: shmem: rename mTHP shmem counters
  mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
  mm/migrate: putback split folios when numa hint migration fails
  ...
2024-07-20 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds
Pull kvm updates from Paolo Bonzini:

"ARM:
- Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates

LoongArch:
- Add paravirt steal time support
- Add support for KVM_DIRTY_LOG_INITIALLY_SET
- Add perf kvm-stat support for loongarch

RISC-V:
- Redirect AMO load/store access fault traps to guest
- perf kvm stat support
- Use guest files for IMSIC virtualization, when available

s390:
- Assortment of tiny fixes which are not time critical

x86:
- Fixes for Xen emulation
- Add a global struct to consolidate tracking of host values, e.g. EFER
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX
- Print the name of the APICv/AVIC inhibits in the relevant tracepoint
- Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor
- Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop
- Update to the newfangled Intel CPU FMS infrastructure
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored
- Misc cleanups

x86 - MMU:
- Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leaf SPTEs
- Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages
- Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards

x86 - AMD:
- Make per-CPU save_area allocations NUMA-aware
- Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code
- Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges. This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification. There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data

x86 - Intel:
- Remove an unnecessary EPT TLB flush when enabling hardware
- Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake events for a vCPU executing HLT in L2 (with HLT-exiting disabled by L1)
- KVM: x86: Suppress MMIO that is triggered during task switch emulation
  Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation. Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed. See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows

Generic:
- Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them)
- New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl
- Enable halt poll shrinking by default, as Intel found it to be a clear win
- Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out()
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs
- Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout

Selftests:
- Remove dead code in the memslot modification stress test
- Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs
- Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs
- Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  crypto: ccp: Add the SNP_VLEK_LOAD command
  KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
  KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
  KVM: x86: Replace static_call_cond() with static_call()
  KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  x86/sev: Move sev_guest.h into common SEV header
  KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  KVM: x86: Suppress MMIO that is triggered during task switch emulation
  KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
  KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
  KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
  KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
  perf kvm: Add kvm-stat for loongarch64
  LoongArch: KVM: Add PV steal time support in guest side
  ...
2024-07-20 | Merge tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux | Linus Torvalds
Pull RISC-V updates from Palmer Dabbelt:

- Support for various new ISA extensions:
    * The Zve32[xf] and Zve64[xfd] sub-extensions of the vector extension
    * Zimop and Zcmop for may-be-operations
    * The Zca, Zcf, Zcd and Zcb sub-extensions of the C extension
    * Zawrs
- riscv,cpu-intc is now dtschema
- A handful of performance improvements and cleanups to text patching
- Support for memory hot{,un}plug
- The highest user-allocatable virtual address is now visible in hwprobe

* tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (58 commits)
  riscv: lib: relax assembly constraints in hweight
  riscv: set trap vector earlier
  KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
  KVM: riscv: Support guest wrs.nto
  riscv: hwprobe: export Zawrs ISA extension
  riscv: Add Zawrs support for spinlocks
  dt-bindings: riscv: Add Zawrs ISA extension description
  riscv: Provide a definition for 'pause'
  riscv: hwprobe: export highest virtual userspace address
  riscv: Improve sbi_ecall() code generation by reordering arguments
  riscv: Add tracepoints for SBI calls and returns
  riscv: Optimize crc32 with Zbc extension
  riscv: Enable DAX VMEMMAP optimization
  riscv: mm: Add support for ZONE_DEVICE
  virtio-mem: Enable virtio-mem for RISC-V
  riscv: Enable memory hotplugging for RISC-V
  riscv: mm: Take memory hotplug read-lock during kernel page table dump
  riscv: mm: Add memory hotplugging support
  riscv: mm: Add pfn_to_kaddr() implementation
  riscv: mm: Refactor create_linear_mapping_range() for memory hot add
  ...
2024-07-16 | Merge tag 'kvm-x86-selftests-6.11' of https://github.com/kvm-x86/linux into HEAD | Paolo Bonzini
KVM selftests for 6.11

- Remove dead code in the memslot modification stress test.
- Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.
- Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs.
- Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration.
2024-07-16 | Merge tag 'kvm-x86-misc-6.11' of https://github.com/kvm-x86/linux into HEAD | Paolo Bonzini
KVM x86 misc changes for 6.11

- Add a global struct to consolidate tracking of host values, e.g. EFER, and move "shadow_phys_bits" into the structure as "maxphyaddr".
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX.
- Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
- Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor.
- Misc cleanups
2024-07-16 | Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD | Paolo Bonzini
KVM generic changes for 6.11

- Enable halt poll shrinking by default, as Intel found it to be a clear win.
- Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86.
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out().
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs.
- Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout.
- A few minor cleanups
2024-07-16 | Merge tag 'kvmarm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD | Paolo Bonzini
KVM/arm64 changes for 6.11

- Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
2024-07-14 | Merge branch kvm-arm64/ctr-el0 into kvmarm/next | Oliver Upton
* kvm-arm64/ctr-el0:
  : Support for user changes to CTR_EL0, courtesy of Sebastian Ott
  :
  : Allow userspace to change the guest-visible value of CTR_EL0 for a VM,
  : so long as the requested value represents a subset of features supported
  : by hardware. In other words, prevent the VMM from over-promising the
  : capabilities of hardware.
  :
  : Make this happen by fitting CTR_EL0 into the existing infrastructure for
  : feature ID registers.
  KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset
  KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
  KVM: selftests: arm64: Test writes to CTR_EL0
  KVM: arm64: rename functions for invariant sys regs
  KVM: arm64: show writable masks for feature registers
  KVM: arm64: Treat CTR_EL0 as a VM feature ID register
  KVM: arm64: unify code to prepare traps
  KVM: arm64: nv: Use accessors for modifying ID registers
  KVM: arm64: Add helper for writing ID regs
  KVM: arm64: Use read-only helper for reading VM ID registers
  KVM: arm64: Make idregs debugfs iterator search sysreg table directly
  KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-12 | Merge patch series "riscv: Apply Zawrs when available" | Palmer Dabbelt
Andrew Jones <ajones@ventanamicro.com> says: Zawrs provides two instructions (wrs.nto and wrs.sto), where both are meant to allow the hart to enter a low-power state while waiting on a store to a memory location. The instructions also both wait an implementation-defined "short" duration (unless the implementation terminates the stall for another reason). The difference is that while wrs.sto will terminate when the duration elapses, wrs.nto, depending on configuration, will either just keep waiting or an ILL exception will be raised. Linux will use wrs.nto, so if platforms have an implementation which falls in the "just keep waiting" category (which is not expected), then it should _not_ advertise Zawrs in the hardware description. Like wfi (and with the same {m,h}status bits to configure it), when wrs.nto is configured to raise exceptions it's expected that the higher privilege level will see the instruction was a wait instruction, do something, and then resume execution following the instruction. For example, KVM does configure exceptions for wfi (hstatus.VTW=1) and therefore also for wrs.nto. KVM does this for wfi since it's better to allow other tasks to be scheduled while a VCPU waits for an interrupt. For waits such as those where wrs.nto/sto would be used, which are typically locks, it is also a good idea for KVM to be involved, as it can attempt to schedule the lock holding VCPU. This series starts with Christoph's addition of the riscv smp_cond_load_relaxed function which applies wrs.sto when available. That patch has been reworked to use wrs.nto and to use the same approach as Arm for the wait loop, since we can't have arbitrary C code between the load-reserved and the wrs. Then, hwprobe support is added (since the instructions are also usable from usermode), and finally KVM is taught about wrs.nto, allowing guests to see and use the Zawrs extension. 
We still don't have test results from hardware, and it's not possible to prove that using Zawrs is a win when testing on QEMU, not even when oversubscribing VCPUs to guests. However, it is possible to use KVM selftests to force a scenario where we can prove Zawrs does its job and does it well. [4] is a test which does this and, on my machine, without Zawrs it takes 16 seconds to complete and with Zawrs it takes 0.25 seconds.

This series is also available here [1]. In order to use QEMU for testing a build with [2] is needed. In order to enable guests to use Zawrs with KVM using kvmtool, the branch at [3] may be used.

[1] https://github.com/jones-drew/linux/commits/riscv/zawrs-v3/
[2] https://lore.kernel.org/all/20240312152901.512001-2-ajones@ventanamicro.com/
[3] https://github.com/jones-drew/kvmtool/commits/riscv/zawrs/
[4] https://github.com/jones-drew/linux/commit/cb2beccebcece10881db842ed69bdd5715cfab5d

Link: https://lore.kernel.org/r/20240426100820.14762-8-ajones@ventanamicro.com

* b4-shazam-merge:
  KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
  KVM: riscv: Support guest wrs.nto
  riscv: hwprobe: export Zawrs ISA extension
  riscv: Add Zawrs support for spinlocks
  dt-bindings: riscv: Add Zawrs ISA extension description
  riscv: Provide a definition for 'pause'

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 | KVM: riscv: selftests: Add Zawrs extension to get-reg-list test | Andrew Jones
KVM RISC-V allows the Zawrs extension for the Guest/VM, so add it to the get-reg-list test. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240426100820.14762-14-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 | Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD | Paolo Bonzini
LoongArch KVM changes for v6.11

1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
2024-07-12 | Merge tag 'kvm-riscv-6.11-1' of https://github.com/kvm-riscv/linux into HEAD | Paolo Bonzini
KVM/riscv changes for 6.11

- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use guest files for IMSIC virtualization, when available

ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA extensions is coming through the RISC-V tree.
2024-07-12 | KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY | Isaku Yamahata
Add a test case to exercise KVM_PRE_FAULT_MEMORY and run the guest to access the pre-populated area. It tests KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM and KVM_X86_SW_PROTECTED_VM. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-10 | selftests: centralize -D_GNU_SOURCE= to CFLAGS in lib.mk | Edward Liaw
Centralize the _GNU_SOURCE definition to CFLAGS in lib.mk. Remove redundant defines from Makefiles that import lib.mk. Convert any usage of "#define _GNU_SOURCE 1" to "#define _GNU_SOURCE". This uses the form "-D_GNU_SOURCE=", which is equivalent to "#define _GNU_SOURCE". Otherwise using "-D_GNU_SOURCE" is equivalent to "-D_GNU_SOURCE=1" and "#define _GNU_SOURCE 1", which is less commonly seen in source code and would require many changes in selftests to avoid redefinition warnings. Link: https://lkml.kernel.org/r/20240625223454.1586259-2-edliaw@google.com Signed-off-by: Edward Liaw <edliaw@google.com> Suggested-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: André Almeida <andrealmeid@igalia.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kees Cook <kees@kernel.org> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
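The difference between the two command-line forms can be seen with the preprocessor alone. The sketch below uses hypothetical stand-in macros (`DEMO_SOURCE`, `DEMO_SOURCE_ONE`) rather than `_GNU_SOURCE` itself, so it does not interfere with whatever feature-test macros the build already sets; the `#define` lines play the role of the `-D` options:

```c
#include <string.h>

/* Stringify a macro's expansion so it can be inspected at run time. */
#define DEMO_STR(x)  #x
#define DEMO_XSTR(x) DEMO_STR(x)

/* Stand-in for "-DDEMO_SOURCE=": the macro is defined with an empty body. */
#define DEMO_SOURCE
/*
 * A source file repeating the same empty define is an identical
 * redefinition, which C permits, so no warning is expected here.
 */
#define DEMO_SOURCE

/* Stand-in for "-DDEMO_SOURCE_ONE": a bare -D gives the macro the value 1. */
#define DEMO_SOURCE_ONE 1

/* Returns nonzero if a stringified expansion equals the expected text. */
static int macro_text_is(const char *expansion, const char *want)
{
	return strcmp(expansion, want) == 0;
}
```

Had the second `#define DEMO_SOURCE` followed a definition with the body `1`, the replacement lists would differ and compilers typically warn about the redefinition, which is the situation the commit avoids.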
2024-06-28 | KVM: selftests: Add test for configure of x86 APIC bus frequency | Isaku Yamahata
Test if KVM emulates the APIC bus clock at the expected frequency when userspace configures the frequency via KVM_CAP_X86_APIC_BUS_CYCLES_NS. Set APIC timer's initial count to the maximum value and busy wait for 100 msec (largely arbitrary) using the TSC. Read the APIC timer's "current count" to calculate the actual APIC bus clock frequency based on TSC frequency. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/2fccf35715b5ba8aec5e5708d86ad7015b8d74e6.1718214999.git.reinette.chatre@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
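The arithmetic behind that measurement can be sketched as follows. This is an illustration only, not the selftest's code: the function names are hypothetical, and it assumes that the capability value is the length of one bus cycle in nanoseconds and that the timer decrements once every `timer_divide` bus cycles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bus frequency implied by a nanoseconds-per-cycle setting. */
static double expected_bus_hz(uint64_t bus_cycle_ns)
{
	assert(bus_cycle_ns != 0);
	return 1e9 / (double)bus_cycle_ns;
}

/*
 * Hypothetical helper: bus frequency derived from how far the APIC timer
 * counted down while `tsc_delta` TSC ticks elapsed at `tsc_hz`.
 */
static double measured_bus_hz(uint32_t initial_count, uint32_t current_count,
			      uint32_t timer_divide, uint64_t tsc_delta,
			      uint64_t tsc_hz)
{
	/* Timer ticks consumed, scaled back up to bus cycles. */
	uint64_t bus_cycles =
		(uint64_t)(initial_count - current_count) * timer_divide;

	assert(tsc_delta != 0);
	/* cycles / seconds, where seconds = tsc_delta / tsc_hz */
	return (double)bus_cycles * (double)tsc_hz / (double)tsc_delta;
}
```

For example, a 40 ns bus cycle corresponds to 25 MHz; if the timer drops by 2,500,000 counts (divide-by-1) during a 100 ms wait on a 2 GHz TSC, the measured value matches.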
2024-06-28 | KVM: selftests: Add guest udelay() utility for x86 | Reinette Chatre
Add udelay() for x86 tests to allow busy waiting in the guest for a specific duration, and to match ARM and RISC-V's udelay() in the hopes of eventually making udelay() available on all architectures. Get the guest's TSC frequency using KVM_GET_TSC_KHZ and expose it to all VMs via a new global, guest_tsc_khz. Assert that KVM_GET_TSC_KHZ returns a valid frequency, instead of simply skipping tests, which would require detecting which tests actually need/want udelay(). KVM hasn't returned an error for KVM_GET_TSC_KHZ since commit cc578287e322 ("KVM: Infrastructure for software and hardware based TSC rate scaling"), which predates KVM selftests by 6+ years (KVM_GET_TSC_KHZ itself predates KVM selftests by 7+ years). Note, if the GUEST_ASSERT() in udelay() somehow fires and the test doesn't check for guest asserts, then the test will fail with a very cryptic message. But fixing that, e.g. by automatically handling guest asserts, is a much larger task, and practically speaking the odds of a test running afoul of this wart are infinitesimally small. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/5aa86285d1c1d7fe1960e3fe490f4b22273977e6.1718214999.git.reinette.chatre@intel.com Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
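A TSC-based busy wait of this kind converts the requested delay into a cycle count and then spins until that many cycles have passed. The sketch below is a portable illustration rather than the selftest's implementation: `usec_to_cycles()`, `busy_wait_cycles()` and `fake_counter()` are hypothetical names, and `fake_counter()` merely stands in for reading the TSC so the logic can run anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Convert microseconds to counter cycles, given the counter frequency in kHz. */
static uint64_t usec_to_cycles(uint64_t tsc_khz, uint64_t usec)
{
	assert(tsc_khz != 0);		/* a zero frequency would make the wait meaningless */
	return tsc_khz / 1000 * usec;	/* kHz / 1000 = cycles per microsecond */
}

/* Stand-in for a TSC read: advances by 1000 "cycles" on every call. */
static uint64_t fake_counter(void)
{
	static uint64_t ticks;

	ticks += 1000;
	return ticks;
}

/* Spin until at least `cycles` have elapsed; returns the cycles actually waited. */
static uint64_t busy_wait_cycles(uint64_t (*read_counter)(void), uint64_t cycles)
{
	uint64_t start = read_counter();
	uint64_t now;

	do {
		now = read_counter();
	} while (now - start < cycles);

	return now - start;
}
```

A caller would combine the two, e.g. `busy_wait_cycles(fake_counter, usec_to_cycles(khz, usec))`. Dividing by 1000 before multiplying keeps the intermediate value small at the cost of truncating sub-MHz precision.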
2024-06-28KVM: selftests: Increase robustness of LLC cache misses in PMU counters testMaxim Levitsky
Currently the PMU counters test does a single CLFLUSH{,OPT} on the loop's code, but due to speculative execution this might not cause LLC misses within the measured section. Instead of doing a single flush before the loop, do a cache flush on each iteration of the loop to confuse the prediction and ensure that at least one cache miss occurs within the measured section. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> [sean: keep MFENCE, massage changelog] Link: https://lore.kernel.org/r/20240628005558.3835480-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-28KVM: selftests: Rework macros in PMU counters test to prep for multi-insn loopSean Christopherson
Tweak the macros in the PMU counters test to prepare for moving the CLFLUSH+MFENCE instructions into the loop body, to fix an issue where a single CLFLUSH doesn't guarantee an LLC miss. Link: https://lore.kernel.org/r/20240628005558.3835480-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-27KVM: selftests: Print the seed for the guest pRNG iff it has changedSean Christopherson
Print the guest's random seed during VM creation if and only if the seed has changed since the seed was last printed. The vast majority of tests, if not all tests at this point, set the seed during test initialization and never change the seed, i.e. printing it every time a VM is created is useless noise. Snapshot and print the seed during early selftest init to play nice with tests that use the kselftests harness, at the cost of printing an unused seed for tests that change the seed during test-specific initialization, e.g. dirty_log_perf_test. The kselftests harness runs each testcase in a separate process that is forked from the original process before creating each testcase's VM, i.e. waiting until first VM creation will result in the seed being printed by each testcase despite it never changing. And long term, the hope/goal is that setting the seed will be handled by the core framework, i.e. that the dirty_log_perf_test wart will naturally go away. Reported-by: Yi Lai <yi1.lai@intel.com> Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240627021756.144815-2-dapeng1.mi@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
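The "print if and only if the seed has changed" behavior can be sketched as a small snapshot-and-compare helper. The function name, the static variables, and the message format below are assumptions made for illustration and are not the selftest framework's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the most recently printed seed. */
static uint32_t last_printed_seed;
static int seed_was_printed;

/*
 * Print the seed only when it differs from the one printed last time.
 * Returns 1 if the seed was printed, 0 if the call was silent.
 */
static int print_seed_if_changed(uint32_t seed)
{
	if (seed_was_printed && seed == last_printed_seed)
		return 0;

	printf("Random seed: 0x%x\n", (unsigned int)seed);
	last_printed_seed = seed;
	seed_was_printed = 1;
	return 1;
}
```

Calling the helper once during early init and again at each VM creation would print the seed a single time for tests that never change it, and once more for tests that set a new seed later.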
2024-06-26KVM: riscv: selftests: Add Zcmop extension to get-reg-list testClément Léger
KVM RISC-V allows the Zcmop extension for Guest/VM, so add this extension to the get-reg-list test. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240619113529.676940-17-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26KVM: riscv: selftests: Add some Zc* extensions to get-reg-list testClément Léger
KVM RISC-V allows the Zca, Zcf, Zcd and Zcb extensions for Guest/VM, so add these extensions to the get-reg-list test. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240619113529.676940-12-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26KVM: riscv: selftests: Add Zimop extension to get-reg-list testClément Léger
KVM RISC-V allows the Zimop extension for Guest/VM, so add this extension to the get-reg-list test. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240619113529.676940-6-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-22KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU resetOliver Upton
commit 606af8293cd8 ("KVM: selftests: arm64: Test vCPU-scoped feature ID registers") intended to test that MPIDR_EL1 is unchanged across vCPU reset but failed at actually doing so. Add the missing assertion. Link: https://lore.kernel.org/r/20240621225045.2472090-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix dangling references to a redistributor region if the vgic was prematurely destroyed. - Properly mark FFA buffers as released, ensuring that both parties can make forward progress. x86: - Allow getting/setting MSRs for SEV-ES guests, if they're using the pre-6.9 KVM_SEV_ES_INIT API. - Always sync pending posted interrupts to the IRR prior to IOAPIC route updates, so that EOIs are intercepted properly if the old routing table requested that. Generic: - Avoid __fls(0) - Fix reference leak on hwpoisoned page - Fix a race in kvm_vcpu_on_spin() by ensuring loads and stores are atomic. - Fix bug in __kvm_handle_hva_range() where KVM calls a function pointer that was intended to be a marker only (nothing bad happens but kind of a mine and also technically undefined behavior) - Do not bother accounting allocations that are small and freed before getting back to userspace. Selftests: - Fix compilation for RISC-V. - Fix a "shift too big" goof in the KVM_SEV_INIT2 selftest. 
- Compute the max mappable gfn for KVM selftests on x86 using GuestMaxPhyAddr from KVM's supported CPUID (if it's available)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SEV-ES: Fix svm_get_msr()/svm_set_msr() for KVM_SEV_ES_INIT guests KVM: Discard zero mask with function kvm_dirty_ring_reset virt: guest_memfd: fix reference leak on hwpoisoned page kvm: do not account temporary allocations to kmem MAINTAINERS: Drop Wanpeng Li as a Reviewer for KVM Paravirt support KVM: x86: Always sync PIR to IRR prior to scanning I/O APIC routes KVM: Stop processing *all* memslots when "null" mmu_notifier handler is found KVM: arm64: FFA: Release hyp rx buffer KVM: selftests: Fix RISC-V compilation KVM: arm64: Disassociate vcpus from redistributor region on teardown KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin() KVM: selftests: x86: Prioritize getting max_gfn from GuestPhysBits KVM: selftests: Fix shift of 32 bit unsigned int more than 32 bits
2024-06-21Merge tag 'kvm-riscv-fixes-6.10-2' of https://github.com/kvm-riscv/linux ↵Paolo Bonzini
into HEAD KVM/riscv fixes for 6.10, take #2 - Fix compilation for KVM selftests
2024-06-20KVM: selftests: arm64: Test writes to CTR_EL0Sebastian Ott
Test that CTR_EL0 is modifiable from userspace, that changes are visible to guests, and that they are preserved across a vCPU reset. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-18KVM: selftests: Test vCPU boot IDs above 2^32 and MAX_VCPU_IDMathias Krause
The KVM_SET_BOOT_CPU_ID ioctl failed to reject invalid vCPU IDs. Verify this no longer works and gets rejected with an appropriate error code. Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20240614202859.3597745-6-minipli@grsecurity.net [sean: add test for MAX_VCPU_ID+1, always do negative test] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-18KVM: selftests: Test max vCPU IDs corner casesMathias Krause
The KVM_CREATE_VCPU ioctl ABI had an implicit integer truncation bug, allowing 2^32 aliases for a vCPU ID by setting the upper 32 bits of a 64 bit ioctl() argument. It also allowed excluding a once set boot CPU ID. Verify this no longer works and gets rejected with an error. Signed-off-by: Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20240614202859.3597745-5-minipli@grsecurity.net [sean: tweak assert message+comment for 63:32!=0 testcase] Signed-off-by: Sean Christopherson <seanjc@google.com>
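The truncation problem described above can be illustrated in isolation: narrowing a 64-bit argument to 32 bits before validating it lets any value with bits 63:32 set alias a small ID. The helpers below are hypothetical and only model the arithmetic; they are not KVM's implementation, and `max_vcpu_ids` is an assumed limit supplied by the caller.

```c
#include <stdint.h>

/* What an implicit narrowing conversion does to the ioctl argument. */
static inline uint32_t truncated_vcpu_id(uint64_t arg)
{
	return (uint32_t)arg;
}

/*
 * Validate the full 64-bit argument before narrowing, so that values
 * with any of bits 63:32 set are rejected instead of aliasing a low ID.
 * Returns 1 if the ID is acceptable, 0 otherwise.
 */
static inline int vcpu_id_is_valid(uint64_t arg, uint32_t max_vcpu_ids)
{
	return arg < max_vcpu_ids;
}
```

For instance, the argument `(1ULL << 32) | 5` truncates to 5, yet it fails the 64-bit check for any 32-bit limit, which is the behavior the negative test expects.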
2024-06-10KVM: selftests: Treat AMD Family 17h+ as supporting branch insns retiredManali Shukla
When detecting AMD PMU support for encoding "branch instructions retired" as event 0xc2,0, simply check for Family 17h+ as all Zen CPUs support said encoding, and AMD will maintain the encoding for backwards compatibility on future CPUs. Note, the kernel proper also interprets Family 17h+ as Zen (see the sole caller of init_amd_zen_common()). Suggested-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Manali Shukla <manali.shukla@amd.com> Link: https://lore.kernel.org/r/20240605050835.30491-1-manali.shukla@amd.com Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-06KVM: selftests: Fix RISC-V compilationAndrew Jones
Due to commit 2b7deea3ec7c ("Revert "kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h"") kvm selftests now requires explicitly including ucall_common.h when needed. The commit added the directives everywhere they were needed at the time, but, by merge time, new places had been merged for RISC-V. Add those now to fix RISC-V's compilation. Fixes: dee7ea42a1eb ("Merge tag 'kvm-x86-selftests_utils-6.10' of https://github.com/kvm-x86/linux into HEAD") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240603122045.323064-2-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-05KVM: s390x: selftests: Add shared zeropage testDavid Hildenbrand
Let's test that we can have shared zeropages in our process as long as storage keys are not getting used, that shared zeropages are properly unshared (replaced by anonymous pages) once storage keys are enabled, and that no new shared zeropages are populated after storage keys were enabled. We require the new pagemap interface to detect the shared zeropage. On an old kernel (zeropages always disabled): # ./s390x/shared_zeropage_test TAP version 13 1..3 not ok 1 Shared zeropages should be enabled ok 2 Shared zeropage should be gone ok 3 Shared zeropages should be disabled # Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0 On a fixed kernel: # ./s390x/shared_zeropage_test TAP version 13 1..3 ok 1 Shared zeropages should be enabled ok 2 Shared zeropage should be gone ok 3 Shared zeropages should be disabled # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Testing of UFFDIO_ZEROPAGE can be added later. [ agordeev: Fixed checkpatch complaint, added ucall_common.h include ] Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Tested-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/r/20240412084329.30315-1-david@redhat.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-06-05KVM: selftests: x86: Prioritize getting max_gfn from GuestPhysBitsTao Su
Use the max mappable GPA via GuestPhysBits advertised by KVM to calculate max_gfn. Currently some selftests (e.g. access_tracking_perf_test, dirty_log_test...) add RAM regions close to max_gfn, so the guest may access a GPA beyond its mappable range and cause an infinite loop. Adjust max_gfn in vm_compute_max_gfn() since x86 selftests already override vm_compute_max_gfn() specifically to deal with goofy edge cases. Reported-by: Yi Lai <yi1.lai@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Tested-by: Yi Lai <yi1.lai@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20240513014003.104593-1-tao1.su@linux.intel.com [sean: tweak name, add comment and sanity check] Signed-off-by: Sean Christopherson <seanjc@google.com>
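The max_gfn calculation can be sketched as follows, assuming that a GuestPhysBits value of 0 means the field is not advertised and that the regular physical address width is used in that case. The function name and parameters are hypothetical, not the selftest library's actual interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the highest guest frame number that is
 * mappable, preferring GuestPhysBits when it is advertised (non-zero).
 */
static inline uint64_t compute_max_gfn(unsigned int guest_phys_bits,
				       unsigned int pa_bits,
				       unsigned int page_shift)
{
	/* Fall back to the full physical address width if unadvertised. */
	unsigned int bits = guest_phys_bits ? guest_phys_bits : pa_bits;

	/* Sanity check: the mappable width cannot exceed the PA width. */
	assert(bits <= pa_bits);
	assert(bits > page_shift && bits < 64);

	return (1ULL << (bits - page_shift)) - 1;
}
```

With 4 KiB pages (`page_shift` of 12), 52 physical address bits, and 48 advertised guest physical bits, the result is `(1ULL << 36) - 1` rather than `(1ULL << 40) - 1`.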
2024-06-05KVM: selftests: Fix shift of 32 bit unsigned int more than 32 bitsColin Ian King
Currently, a 32 bit 1u value is being shifted more than 32 bits, causing overflow and incorrect checking of bits 32-63. Fix this by using the BIT_ULL macro for shifting bits. Detected by cppcheck: sev_init2_tests.c:108:34: error: Shifting 32-bit value by 63 bits is undefined behaviour [shiftTooManyBits] Fixes: dfc083a181ba ("selftests: kvm: add tests for KVM_SEV_INIT2") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20240523154102.2236133-1-colin.i.king@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
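The difference between shifting a 32-bit `1u` and a 64-bit constant can be shown with a short sketch. `BIT_ULL()` is defined locally here as a stand-in that mirrors the kernel macro so the example is self-contained, and `bit_is_set()` is a hypothetical helper introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in mirroring the kernel's BIT_ULL(); the shift is done on a 64-bit value. */
#define BIT_ULL(nr) (1ULL << (nr))

/*
 * Test bit "nr" (0-63) of a 64-bit mask. Writing (1u << nr) instead would
 * be undefined behavior for nr >= 32, because 1u is only 32 bits wide.
 */
static inline int bit_is_set(uint64_t mask, unsigned int nr)
{
	assert(nr < 64);
	return !!(mask & BIT_ULL(nr));
}
```

Using the 64-bit constant makes checks of bits 32-63 well defined, which is what a loop over all 64 bits of a feature mask needs.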
2024-06-03KVM: selftests: remove unused struct 'memslot_antagonist_args'Dr. David Alan Gilbert
'memslot_antagonist_args' is unused since the original commit f73a3446252e ("KVM: selftests: Add memslot modification stress test"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20240602235529.228204-1-linux@treblig.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-05-15selftests/kvm: remove dead filePaolo Bonzini
This file was supposed to be removed in commit 2b7deea3ec7c ("Revert "kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h""), but it survived. Remove it now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-12Merge tag 'kvm-x86-selftests_utils-6.10' of https://github.com/kvm-x86/linux ↵Paolo Bonzini
into HEAD KVM selftests treewide updates for 6.10: - Define _GNU_SOURCE for all selftests to fix a warning that was introduced by a change to kselftest_harness.h late in the 6.9 cycle, and because forcing every test to #define _GNU_SOURCE is painful. - Provide a global pseudo-RNG instance for all tests, so that library code can generate random, but deterministic numbers. - Use the global pRNG to randomly force emulation of select writes from guest code on x86, e.g. to help validate KVM's emulation of locked accesses. - Rename kvm_util_base.h back to kvm_util.h, as the weird layer of indirection was added purely to avoid manually #including ucall_common.h in a handful of locations. - Allocate and initialize x86's GDT, IDT, TSS, segments, and default exception handlers at VM creation, instead of forcing tests to manually trigger the related setup.
2024-05-12Merge tag 'kvm-x86-selftests-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM selftests cleanups and fixes for 6.10: - Enhance the demand paging test to allow for better reporting and stressing of UFFD performance. - Convert the steal time test to generate TAP-friendly output. - Fix a flaky false positive in the xen_shinfo_test due to comparing elapsed time across two different clock domains. - Skip the MONITOR/MWAIT test if the host doesn't actually support MWAIT. - Avoid unnecessary use of "sudo" in the NX hugepage test to play nice with running in a minimal userspace environment. - Allow skipping the RSEQ test's sanity check that the vCPU was able to complete a reasonable number of KVM_RUNs, as the assert can fail on a completely valid setup. If the test is run on a large-ish system that is otherwise idle, and the test isn't affined to a low-ish number of CPUs, the vCPU task can be repeatedly migrated to CPUs that are in deep sleep states, which results in the vCPU having very little net runtime before the next migration due to high wakeup latencies.
2024-05-12Merge tag 'kvmarm-6.10-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 6.10 - Move a lot of state that was previously stored on a per vcpu basis into a per-CPU area, because it is only pertinent to the host while the vcpu is loaded. This results in better state tracking, and a smaller vcpu structure. - Add full handling of the ERET/ERETAA/ERETAB instructions in nested virtualisation. The last two instructions also require emulating part of the pointer authentication extension. As a result, the trap handling of pointer authentication has been greatly simplified. - Turn the global (and not very scalable) LPI translation cache into a per-ITS, scalable cache, making non directly injected LPIs much cheaper to make visible to the vcpu. - A batch of pKVM patches, mostly fixes and cleanups, as the upstreaming process seems to be resuming. Fingers crossed! - Allocate PPIs and SGIs outside of the vcpu structure, allowing for smaller EL2 mapping and some flexibility in implementing more or less than 32 private IRQs. - Purge stale mpidr_data if a vcpu is created after the MPIDR map has been created. - Preserve vcpu-specific ID registers across a vcpu reset. - Various minor cleanups and improvements.
2024-05-10Merge tag 'loongarch-kvm-6.10' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.10 1. Add ParaVirt IPI support. 2. Add software breakpoint support. 3. Add mmio trace events support.
2024-05-09Merge branch kvm-arm64/mpidr-reset into kvmarm-master/nextMarc Zyngier
* kvm-arm64/mpidr-reset: : . : Fixes for CLIDR_EL1 and MPIDR_EL1 being accidentally mutable across : a vcpu reset, courtesy of Oliver. From the cover letter: : : "For VM-wide feature ID registers we ensure they get initialized once for : the lifetime of a VM. On the other hand, vCPU-local feature ID registers : get re-initialized on every vCPU reset, potentially clobbering the : values userspace set up. : : MPIDR_EL1 and CLIDR_EL1 are the only registers in this space that we : allow userspace to modify for now. Clobbering the value of MPIDR_EL1 has : some disastrous side effects as the compressed index used by the : MPIDR-to-vCPU lookup table assumes MPIDR_EL1 is immutable after KVM_RUN. : : Series + reproducer test case to address the problem of KVM wiping out : userspace changes to these registers. Note that there are still some : differences between VM and vCPU scoped feature ID registers from the : perspective of userspace. We do not allow the value of VM-scope : registers to change after KVM_RUN, but vCPU registers remain mutable." : . KVM: selftests: arm64: Test vCPU-scoped feature ID registers KVM: selftests: arm64: Test that feature ID regs survive a reset KVM: selftests: arm64: Store expected register value in set_id_regs KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scope KVM: arm64: Only reset vCPU-scoped feature ID regs once KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs() KVM: arm64: Rename is_id_reg() to imply VM scope Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: selftests: arm64: Test vCPU-scoped feature ID registersOliver Upton
Test that CLIDR_EL1 and MPIDR_EL1 are modifiable from userspace and that the values are preserved across a vCPU reset like the other feature ID registers. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-8-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: selftests: arm64: Test that feature ID regs survive a resetOliver Upton
One of the expectations with feature ID registers is that their values survive a vCPU reset. Start testing that. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-7-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: selftests: arm64: Store expected register value in set_id_regsOliver Upton
Rather than comparing against what is returned by the ioctl, store expected values for the feature ID registers in a table and compare with that instead. This will prove useful for subsequent tests involving vCPU reset. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-6-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scopeOliver Upton
Prepare for a later change that'll cram in per-vCPU feature ID test cases by renaming the current test case. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-5-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-07Merge tag 'kvm-riscv-6.10-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 6.10 - Support guest breakpoints using ebreak - Introduce per-VCPU mp_state_lock and reset_cntx_lock - Virtualize SBI PMU snapshot and counter overflow interrupts - New selftests for SBI PMU and Guest ebreak
2024-05-02KVM: selftests: Require KVM_CAP_USER_MEMORY2 for tests that create memslotsSean Christopherson
Explicitly require KVM_CAP_USER_MEMORY2 for selftests that create memslots, i.e. skip selftests that need memslots instead of letting them fail on KVM_SET_USER_MEMORY_REGION2. While it's ok to take a dependency on new kernel features, selftests should skip gracefully instead of failing hard when run on older kernels. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/69ae0694-8ca3-402c-b864-99b500b24f5d@moroto.mountain Suggested-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20240430162133.337541-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-05-02KVM: selftests: Allow skipping the KVM_RUN sanity check in rseq_testZide Chen
The rseq test's migration worker delays 1-10 us, assuming that one KVM_RUN iteration only takes a few microseconds. But if the CPU low power wakeup latency is large enough, for example, hundreds or even thousands of microseconds for deep C-state exit latencies on x86 server CPUs, it may happen that the target CPU is unable to wakeup and run the vCPU before the migration worker starts to migrate the vCPU thread to the _next_ CPU. If the system workload is light, most CPUs could be at a certain low power state, which may result in less successful migrations and fail the migration/KVM_RUN ratio sanity check. But this is not supposed to be deemed a test failure. Add a command line option to skip the sanity check, along with a comment and a verbose assert message to try to help the user resolve the potential source of failures without having to resort to disabling the check. Co-developed-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Link: https://lore.kernel.org/r/20240502213936.27619-1-zide.chen@intel.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-04-30Merge tag 'kvmarm-fixes-6.9-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.9, part #2 - Fix + test for a NULL dereference resulting from unsanitised user input in the vgic-v2 device attribute accessors