summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/hyp/vhe/switch.c
AgeCommit message (Collapse)Author
2025-05-23Merge branch kvm-arm64/misc-6.16 into kvmarm-master/nextMarc Zyngier
* kvm-arm64/misc-6.16: : . : Misc changes and improvements for 6.16: : : - Add a new selftest for the SVE host state being corrupted by a guest : : - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2, : ensuring that the window for interrupts is slightly bigger, and avoiding : a pretty bad erratum on the AmpereOne HW : : - Replace a couple of open-coded on/off strings with str_on_off() : : - Get rid of the pKVM memblock sorting, which now appears to be superflous : : - Drop superflous clearing of ICH_LR_EOI in the LR when nesting : : - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from : a pretty bad case of TLB corruption unless accesses to HCR_EL2 are : heavily synchronised : : - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables : in a human-friendly fashion : . KVM: arm64: Fix documentation for vgic_its_iter_next() KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables arm64: errata: Work around AmpereOne's erratum AC04_CPU_23 KVM: arm64: nv: Remove clearing of ICH_LR<n>.EOI if ICH_LR<n>.HW == 1 KVM: arm64: Drop sort_memblock_regions() KVM: arm64: selftests: Add test for SVE host corruption KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode KVM: arm64: Replace ternary flags with str_on_off() helper Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19arm64: errata: Work around AmpereOne's erratum AC04_CPU_23D Scott Phillips
On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous translations for data addresses initiated by load/store instructions. Only instruction initiated translations are vulnerable, not translations from prefetches for example. A DSB before the store to HCR_EL2 is sufficient to prevent older instructions from hitting the window for corruption, and an ISB after is sufficient to prevent younger instructions from hitting the window for corruption. Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20250513184514.2678288-1-scott@os.amperecomputing.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19KVM: arm64: nv: Add S1 TLB invalidation primitive for VNCR_EL2Marc Zyngier
A TLBI by VA for S1 must take effect on our pseudo-TLB for VNCR and potentially knock the fixmap mapping. Even worse, that TLBI must be able to work cross-vcpu. For that, we track on a per-VM basis if any VNCR is mapped, using an atomic counter. Whenever a TLBI S1E2 occurs and that this counter is non-zero, we take the long road all the way back to the core code. There, we iterate over all vcpus and check whether this particular invalidation has any damaging effect. If it does, we nuke the pseudo TLB and the corresponding fixmap. Yes, this is costly. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250514103501.2225951-14-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19KVM: arm64: nv: Program host's VNCR_EL2 to the fixmap addressMarc Zyngier
Since we now have a way to map the guest's VNCR_EL2 on the host, we can point the host's VNCR_EL2 to it and go full circle! Note that we unconditionally assign the fixmap to VNCR_EL2, irrespective of the guest's version being mapped or not. We want to take a fault on first access, so the fixmap either contains something guranteed to be either invalid or a guest mapping. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250514103501.2225951-13-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19KVM: arm64: nv: Don't adjust PSTATE.M when L2 is nestingMarc Zyngier
We currently check for HCR_EL2.NV being set to decide whether we need to repaint PSTATE.M to say EL2 instead of EL1 on exit. However, this isn't correct when L2 is itself a hypervisor, and that L1 as set its own HCR_EL2.NV. That's because we "flatten" the state and inherit parts of the guest's own setup. In that case, we shouldn't adjust PSTATE.M, as this is really EL1 for both us and the guest. Instead of trying to try and work out how we ended-up with HCR_EL2.NV being set by introspecting both the host and guest states, use a per-CPU flag to remember the context (HYP or not), and use that information to decide whether PSTATE needs tweaking. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250514103501.2225951-7-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-03-11KVM: arm64: Compute synthetic sysreg ESR for Apple PMUv3 trapsOliver Upton
Apple M* CPUs provide an IMPDEF trap for PMUv3 sysregs, where ESR_EL2.EC is a reserved value (0x3F) and a sysreg-like ISS is reported in AFSR1_EL2. Compute a synthetic ESR for these PMUv3 traps, giving the illusion of something architectural to the rest of KVM. Tested-by: Janne Grunau <j@jannau.net> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305202641.428114-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-02-13KVM: arm64: Eagerly switch ZCR_EL{1,2}Mark Rutland
In non-protected KVM modes, while the guest FPSIMD/SVE/SME state is live on the CPU, the host's active SVE VL may differ from the guest's maximum SVE VL: * For VHE hosts, when a VM uses NV, ZCR_EL2 contains a value constrained by the guest hypervisor, which may be less than or equal to that guest's maximum VL. Note: in this case the value of ZCR_EL1 is immaterial due to E2H. * For nVHE/hVHE hosts, ZCR_EL1 contains a value written by the guest, which may be less than or greater than the guest's maximum VL. Note: in this case hyp code traps host SVE usage and lazily restores ZCR_EL2 to the host's maximum VL, which may be greater than the guest's maximum VL. This can be the case between exiting a guest and kvm_arch_vcpu_put_fp(). If a softirq is taken during this period and the softirq handler tries to use kernel-mode NEON, then the kernel will fail to save the guest's FPSIMD/SVE state, and will pend a SIGKILL for the current thread. This happens because kvm_arch_vcpu_ctxsync_fp() binds the guest's live FPSIMD/SVE state with the guest's maximum SVE VL, and fpsimd_save_user_state() verifies that the live SVE VL is as expected before attempting to save the register state: | if (WARN_ON(sve_get_vl() != vl)) { | force_signal_inject(SIGKILL, SI_KERNEL, 0, 0); | return; | } Fix this and make this a bit easier to reason about by always eagerly switching ZCR_EL{1,2} at hyp during guest<->host transitions. With this happening, there's no need to trap host SVE usage, and the nVHE/nVHE __deactivate_cptr_traps() logic can be simplified to enable host access to all present FPSIMD/SVE/SME features. In protected nVHE/hVHE modes, the host's state is always saved/restored by hyp, and the guest's state is saved prior to exit to the host, so from the host's PoV the guest never has live FPSIMD/SVE/SME state, and the host's ZCR_EL1 is never clobbered by hyp. Fixes: 8c8010d69c132273 ("KVM: arm64: Save/restore SVE state for nVHE") Fixes: 2e3cf82063a00ea0 ("KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250210195226.1215254-9-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-13KVM: arm64: Refactor exit handlersMark Rutland
The hyp exit handling logic is largely shared between VHE and nVHE/hVHE, with common logic in arch/arm64/kvm/hyp/include/hyp/switch.h. The code in the header depends on function definitions provided by arch/arm64/kvm/hyp/vhe/switch.c and arch/arm64/kvm/hyp/nvhe/switch.c when they include the header. This is an unusual header dependency, and prevents the use of arch/arm64/kvm/hyp/include/hyp/switch.h in other files as this would result in compiler warnings regarding missing definitions, e.g. | In file included from arch/arm64/kvm/hyp/nvhe/hyp-main.c:8: | ./arch/arm64/kvm/hyp/include/hyp/switch.h:733:31: warning: 'kvm_get_exit_handler_array' used but never defined | 733 | static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu); | | ^~~~~~~~~~~~~~~~~~~~~~~~~~ | ./arch/arm64/kvm/hyp/include/hyp/switch.h:735:13: warning: 'early_exit_filter' used but never defined | 735 | static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code); | | ^~~~~~~~~~~~~~~~~ Refactor the logic such that the header doesn't depend on anything from the C files. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250210195226.1215254-7-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-13KVM: arm64: Refactor CPTR trap deactivationMark Rutland
For historical reasons, the VHE and nVHE/hVHE implementations of __activate_cptr_traps() pair with a common implementation of __kvm_reset_cptr_el2(), which ideally would be named __deactivate_cptr_traps(). Rename __kvm_reset_cptr_el2() to __deactivate_cptr_traps(), and split it into separate VHE and nVHE/hVHE variants so that each can be paired with its corresponding implementation of __activate_cptr_traps(). At the same time, fold kvm_write_cptr_el2() into its callers. This makes it clear in-context whether a write is made to the CPACR_EL1 encoding or the CPTR_EL2 encoding, and removes the possibility of confusion as to whether kvm_write_cptr_el2() reformats the sysreg fields as cpacr_clear_set() does. In the nVHE/hVHE implementation of __activate_cptr_traps(), placing the sysreg writes within the if-else blocks requires that the call to __activate_traps_fpsimd32() is moved earlier, but as this was always called before writing to CPTR_EL2/CPACR_EL1, this should not result in a functional change. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250210195226.1215254-6-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-13KVM: arm64: Remove host FPSIMD saving for non-protected KVMMark Rutland
Now that the host eagerly saves its own FPSIMD/SVE/SME state, non-protected KVM never needs to save the host FPSIMD/SVE/SME state, and the code to do this is never used. Protected KVM still needs to save/restore the host FPSIMD/SVE state to avoid leaking guest state to the host (and to avoid revealing to the host whether the guest used FPSIMD/SVE/SME), and that code needs to be retained. Remove the unused code and data structures. To avoid the need for a stub copy of kvm_hyp_save_fpsimd_host() in the VHE hyp code, the nVHE/hVHE version is moved into the shared switch header, where it is only invoked when KVM is in protected mode. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250210195226.1215254-3-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-17Merge branch kvm-arm64/nv-timers into kvmarm-master/nextMarc Zyngier
* kvm-arm64/nv-timers: : . : Nested Virt support for the EL2 timers. From the initial cover letter: : : "Here's another batch of NV-related patches, this time bringing in most : of the timer support for EL2 as well as nested guests. : : The code is pretty convoluted for a bunch of reasons: : : - FEAT_NV2 breaks the timer semantics by redirecting HW controls to : memory, meaning that a guest could setup a timer and never see it : firing until the next exit : : - We go try hard to reflect the timer state in memory, but that's not : great. : : - With FEAT_ECV, we can finally correctly emulate the virtual timer, : but this emulation is pretty costly : : - As a way to make things suck less, we handle timer reads as early as : possible, and only defer writes to the normal trap handling : : - Finally, some implementations are badly broken, and require some : hand-holding, irrespective of NV support. So we try and reuse the NV : infrastructure to make them usable. This could be further optimised, : but I'm running out of patience for this sort of HW. : : [...]" : . KVM: arm64: nv: Fix doc header layout for timers KVM: arm64: nv: Document EL2 timer API KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity KVM: arm64: nv: Sanitise CNTHCTL_EL2 KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits KVM: arm64: nv: Add trap routing for CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT} KVM: arm64: Handle counter access early in non-HYP context KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor context KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in use KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state KVM: arm64: nv: Sync nested timer state with FEAT_NV2 KVM: arm64: nv: Add handling of EL2-specific timer registers Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-02KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor contextMarc Zyngier
Similarly to handling the physical timer accesses early when FEAT_ECV causes a trap, we try to handle the physical counter without returning to the general sysreg handling. More surprisingly, we introduce something similar for the virtual counter. Although this isn't necessary yet, it will prove useful on systems that have a broken CNTVOFF_EL2 implementation. Yes, they exist. Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241217142321.763801-7-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-02KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in useMarc Zyngier
Although FEAT_ECV allows us to correctly emulate the timers, it also reduces performances pretty badly. Mitigate this by emulating the CTL/CVAL register reads in the inner run loop, without returning to the general kernel. Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241217142321.763801-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-12-19arm64/sysreg: Get rid of CPACR_ELx SysregFieldsMarc Zyngier
There is no such thing as CPACR_ELx in the architecture. What we have is CPACR_EL1, for which CPTR_EL12 is an accessor. Rename CPACR_ELx_* to CPACR_EL1_*, and fix the bit of code using these names. Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241219173351.1123087-5-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-08-27KVM: arm64: Add save/restore support for FPMRMarc Zyngier
Just like the rest of the FP/SIMD state, FPMR needs to be context switched. The only interesting thing here is that we need to treat the pKVM part a bit differently, as the host FP state is never written back to the vcpu thread, but instead stored locally and eagerly restored. Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240820131802.3547589-5-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-07-14Merge branch kvm-arm64/nv-sve into kvmarm/nextOliver Upton
* kvm-arm64/nv-sve: : CPTR_EL2, FPSIMD/SVE support for nested : : This series brings support for honoring the guest hypervisor's CPTR_EL2 : trap configuration when running a nested guest, along with support for : FPSIMD/SVE usage at L1 and L2. KVM: arm64: Allow the use of SVE+NV KVM: arm64: nv: Add additional trap setup for CPTR_EL2 KVM: arm64: nv: Add trap description for CPTR_EL2 KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2 KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap KVM: arm64: nv: Handle CPACR_EL1 traps KVM: arm64: Spin off helper for programming CPTR traps KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context KVM: arm64: nv: Load guest hyp's ZCR into EL1 state KVM: arm64: nv: Handle ZCR_EL2 traps KVM: arm64: nv: Forward SVE traps to guest hypervisor KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14Merge branch kvm-arm64/el2-kcfi into kvmarm/nextOliver Upton
* kvm-arm64/el2-kcfi: : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi : : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect : branches in the EL2 hypervisor. Unlike kernel support for the feature, : CFI failures at EL2 are always fatal. KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2 KVM: arm64: Introduce print_nvhe_hyp_panic helper arm64: Introduce esr_brk_comment, esr_is_cfi_brk KVM: arm64: VHE: Mark __hyp_call_panic __noreturn KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32 KVM: arm64: nVHE: Simplify invalid_host_el2_vect KVM: arm64: Fix __pkvm_init_switch_pgd call ABI KVM: arm64: Fix clobbered ELR in sync abort/SError Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20KVM: arm64: nv: Add additional trap setup for CPTR_EL2Marc Zyngier
We need to teach KVM a couple of new tricks. CPTR_EL2 and its VHE accessor CPACR_EL1 need to be handled specially: - CPACR_EL1 is trapped on VHE so that we can track the TCPAC and TTA bits - CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2 Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-15-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2Oliver Upton
Start folding the guest hypervisor's FP/SVE traps into the value programmed in hardware. Note that as of writing this is dead code, since KVM does a full put() / load() for every nested exception boundary which saves + flushes the FP/SVE state. However, this will become useful when we can keep the guest's FP/SVE state alive across a nested exception boundary and the host no longer needs to conservatively program traps. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-12-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20KVM: arm64: nv: Load guest FP state for ZCR_EL2 trapOliver Upton
Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path when the guest hypervisor tries to access SVE state. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20KVM: arm64: nv: Handle CPACR_EL1 trapsMarc Zyngier
Handle CPACR_EL1 accesses when running a VHE guest. In order to limit the cost of the emulation, implement it ass a shallow exit. In the other cases: - this is a nVHE L1 which will write to memory, and we don't trap - this is a L2 guest: * the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register access * the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the handling is defered to the general handling for forwarding Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20KVM: arm64: Spin off helper for programming CPTR trapsOliver Upton
A subsequent change to KVM will add preliminary support for merging a guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off a new helper for managing CPTR traps. Avoid reading CPACR_EL1 for the baseline trap config, and start off with the most restrictive set of traps that is subsequently relaxed. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20KVM: arm64: VHE: Mark __hyp_call_panic __noreturnPierre-Clément Tosi
Given that the sole purpose of __hyp_call_panic() is to call panic(), a __noreturn function, give it the __noreturn attribute, removing the need for its caller to use unreachable(). Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-6-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-19KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidationMarc Zyngier
Due to the way FEAT_NV2 suppresses traps when accessing EL2 system registers, we can't track when the guest changes its HCR_EL2.TGE setting. This means we always trap EL1 TLBIs, even if they don't affect any L2 guest. Given that invalidating the EL2 TLBs doesn't require any messing with the shadow stage-2 page-tables, we can simply emulate the instructions early and return directly to the guest. This is conditioned on the instruction being an EL1 one and the guest's HCR_EL2.{E2H,TGE} being {1,1} (indicating that the instruction targets the EL2 S1 TLBs), or the instruction being one of the EL2 ones (which are not ambiguous). EL1 TLBIs issued with HCR_EL2.{E2H,TGE}={1,0} are not handled here, and cause a full exit so that they can be handled in the context of a VMID. Co-developed-by: Jintack Lim <jintack.lim@linaro.org> Co-developed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240614144552.2773592-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-04KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx formatFuad Tabba
When setting/clearing CPACR bits for EL0 and EL1, use the ELx format of the bits, which covers both. This makes the code clearer, and reduces the chances of accidentally missing a bit. No functional change intended. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04KVM: arm64: Specialize handling of host fpsimd state on trapFuad Tabba
In subsequent patches, n/vhe will diverge on saving the host fpsimd/sve state when taking a guest fpsimd/sve trap. Add a specialized helper to handle it. No functional change intended. Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03Merge branch kvm-arm64/pkvm-6.10 into kvmarm-master/nextMarc Zyngier
* kvm-arm64/pkvm-6.10: (25 commits) : . : At last, a bunch of pKVM patches, courtesy of Fuad Tabba. : From the cover letter: : : "This series is a bit of a bombay-mix of patches we've been : carrying. There's no one overarching theme, but they do improve : the code by fixing existing bugs in pKVM, refactoring code to : make it more readable and easier to re-use for pKVM, or adding : functionality to the existing pKVM code upstream." : . KVM: arm64: Force injection of a data abort on NISV MMIO exit KVM: arm64: Restrict supported capabilities for protected VMs KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap() KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst KVM: arm64: Rename firmware pseudo-register documentation file KVM: arm64: Reformat/beautify PTP hypercall documentation KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit KVM: arm64: Introduce and use predicates that check for protected VMs KVM: arm64: Add is_pkvm_initialized() helper KVM: arm64: Simplify vgic-v3 hypercalls KVM: arm64: Move setting the page as dirty out of the critical section KVM: arm64: Change kvm_handle_mmio_return() return polarity KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() KVM: arm64: Prevent kmemleak from accessing .hyp.data KVM: arm64: Do not map the host fpsimd state to hyp in pKVM KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE KVM: arm64: Support TLB invalidation in guest context KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE KVM: arm64: Check for PTE validity when checking for executable/cacheable KVM: arm64: Avoid BUG-ing from the host abort path ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03Merge branch kvm-arm64/nv-eret-pauth into kvmarm-master/nextMarc Zyngier
* kvm-arm64/nv-eret-pauth: : . : Add NV support for the ERETAA/ERETAB instructions. From the cover letter: : : "Although the current upstream NV support has *some* support for : correctly emulating ERET, that support is only partial as it doesn't : support the ERETAA and ERETAB variants. : : Supporting these instructions was cast aside for a long time as it : involves implementing some form of PAuth emulation, something I wasn't : overly keen on. But I have reached a point where enough of the : infrastructure is there that it actually makes sense. So here it is!" : . KVM: arm64: nv: Work around lack of pauth support in old toolchains KVM: arm64: Drop trapping of PAuth instructions/keys KVM: arm64: nv: Advertise support for PAuth KVM: arm64: nv: Handle ERETA[AB] instructions KVM: arm64: nv: Add emulation for ERETAx instructions KVM: arm64: nv: Add kvm_has_pauth() helper KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0 KVM: arm64: nv: Handle HCR_EL2.{API,APK} independently KVM: arm64: nv: Honor HFGITR_EL2.ERET being set KVM: arm64: nv: Fast-track 'InHost' exception returns KVM: arm64: nv: Add trap forwarding for ERET and SMC KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2 KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flag KVM: arm64: Constraint PAuth support to consistent implementations KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET* KVM: arm64: Harden __ctxt_sys_reg() against out-of-range values Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03KVM: arm64: Move management of __hyp_running_vcpu to load/put on VHEMarc Zyngier
The per-CPU host context structure contains a __hyp_running_vcpu that serves as a replacement for kvm_get_current_vcpu() in contexts where we cannot make direct use of it (such as in the nVHE hypervisor). Since there is a lot of common code between nVHE and VHE, the latter also populates this field even if kvm_get_running_vcpu() always works. We currently pretty inconsistent when populating __hyp_running_vcpu to point to the currently running vcpu: - on {n,h}VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run and clear it on exit. - on VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run_vhe and never clear it, effectively leaving a dangling pointer... VHE is obviously the odd one here. Although we could make it behave just like nVHE, this wouldn't match the behaviour of KVM with VHE, where the load phase is where most of the context-switch gets done. So move all the __hyp_running_vcpu management to the VHE-specific load/put phases, giving us a bit more sanity and matching the behaviour of kvm_get_running_vcpu(). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502154030.3011995-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Refactor checks for FP state ownershipFuad Tabba
To avoid direct comparison against the fp_owner enum, add a new function that performs the check, host_owns_fp_regs(), to complement the existing guest_owns_fp_regs(). To check for fpsimd state ownership, use the helpers instead of directly using the enums. No functional change intended. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01KVM: arm64: Move guest_owns_fp_regs() to increase its scopeFuad Tabba
guest_owns_fp_regs() will be used to check fpsimd state ownership across kvm/arm64. Therefore, move it to kvm_host.h to widen its scope. Moreover, the host state is not per-vcpu anymore, the vcpu parameter isn't used, so remove it as well. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: Drop trapping of PAuth instructions/keysMarc Zyngier
We currently insist on disabling PAuth on vcpu_load(), and get to enable it on first guest use of an instruction or a key (ignoring the NV case for now). It isn't clear at all what this is trying to achieve: guests tend to use PAuth when available, and nothing forces you to expose it to the guest if you don't want to. This also isn't totally free: we take a full GPR save/restore between host and guest, only to write ten 64bit registers. The "value proposition" escapes me. So let's forget this stuff and enable PAuth eagerly if exposed to the guest. This results in much simpler code. Performance wise, that's not bad either (tested on M2 Pro running a fully automated Debian installer as the workload): - On a non-NV guest, I can see reduction of 0.24% in the number of cycles (measured with perf over 10 consecutive runs) - On a NV guest (L2), I see a 2% reduction in wall-clock time (measured with 'time', as M2 doesn't have a PMUv3 and NV doesn't support it either) So overall, a much reduced complexity and a (small) performance improvement. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-16-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Handle ERETA[AB] instructionsMarc Zyngier
Now that we have some emulation in place for ERETA[AB], we can plug it into the exception handling machinery. As for a bare ERET, an "easy" ERETAx instruction is processed as a fixup, while something that requires a translation regime transition or an exception delivery is left to the slow path. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-14-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Honor HFGITR_EL2.ERET being setMarc Zyngier
If the L1 hypervisor decides to trap ERETs while running L2, make sure we don't try to emulate it, just like we wouldn't if it had its NV bit set. The exception will be reinjected from the core handler. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-9-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Fast-track 'InHost' exception returnsMarc Zyngier
A significant part of the FEAT_NV extension is to trap ERET instructions so that the hypervisor gets a chance to switch from a vEL2 L1 guest to an EL1 L2 guest. But this also has the unfortunate consequence of trapping ERET in unsuspecting circumstances, such as staying at vEL2 (interrupt handling while being in the guest hypervisor), or returning to host userspace in the case of a VHE guest. Although we already make some effort to handle these ERET quicker by not doing the put/load dance, it is still way too far down the line for it to be efficient enough. For these cases, it would ideal to ERET directly, no question asked. Of course, we can't do that. But the next best thing is to do it as early as possible, in fixup_guest_exit(), much as we would handle FPSIMD exceptions. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-8-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2Marc Zyngier
Add the HCR_EL2 configuration for FEAT_NV2, adding the required bits for running a guest hypervisor, and overall merging the allowed bits provided by the guest. This heavily replies on unavaliable features being sanitised when the HCR_EL2 shadow register is accessed, and only a couple of bits must be explicitly disabled. Non-NV guests are completely unaffected by any of this. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flagMarc Zyngier
It has become obvious that HCR_EL2.NV serves the exact same use as VCPU_HYP_CONTEXT, only in an architectural way. So just drop the flag for good. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-5-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Exclude FP ownership from kvm_vcpu_archMarc Zyngier
In retrospect, it is fairly obvious that the FP state ownership is only meaningful for a given CPU, and that locating this information in the vcpu was just a mistake. Move the ownership tracking into the host data structure, and rename it from fp_state to fp_owner, which is a better description (name suggested by Mark Brown). Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-12KVM: arm64: Add accessor for per-CPU stateMarc Zyngier
In order to facilitate the introduction of new per-CPU state, add a new host_data_ptr() helped that hides some of the per-CPU verbosity, and make it easier to move that state around in the future. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-10-31Merge tag 'kvmarm-6.7' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.7 - Generalized infrastructure for 'writable' ID registers, effectively allowing userspace to opt-out of certain vCPU features for its guest - Optimization for vSGI injection, opportunistically compressing MPIDR to vCPU mapping into a table - Improvements to KVM's PMU emulation, allowing userspace to select the number of PMCs available to a VM - Guest support for memory operation instructions (FEAT_MOPS) - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing bugs and getting rid of useless code - Changes to the way the SMCCC filter is constructed, avoiding wasted memory allocations when not in use - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing the overhead of errata mitigations - Miscellaneous kernel and selftest fixes
2023-10-30Merge branch kvm-arm64/mops into kvmarm/nextOliver Upton
* kvm-arm64/mops: : KVM support for MOPS, courtesy of Kristina Martsenko : : MOPS adds new instructions for accelerating memcpy(), memset(), and : memmove() operations in hardware. This series brings virtualization : support for KVM guests, and allows VMs to run on asymmetrict systems : that may have different MOPS implementations. KVM: arm64: Expose MOPS instructions to guests KVM: arm64: Add handler for MOPS exceptions Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20KVM: arm64: Load the stage-2 MMU context in kvm_vcpu_load_vhe()Oliver Upton
To date the VHE code has aggressively reloaded the stage-2 MMU context on every guest entry, despite the fact that this isn't necessary. This was probably done for consistency with the nVHE code, which needs to switch in/out the stage-2 MMU context as both the host and guest run at EL1. Hoist __load_stage2() into kvm_vcpu_load_vhe(), thus avoiding a reload on every guest entry/exit. This is likely to be beneficial to systems with one of the speculative AT errata, as there is now one fewer context synchronization event on the guest entry path. Additionally, it is possible that implementations have hitched correctness mitigations on writes to VTTBR_EL2, which are now elided on guest re-entry. Note that __tlb_switch_to_guest() is deliberately left untouched as it can be called outside the context of a running vCPU. Link: https://lore.kernel.org/r/20231018233212.2888027-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20KVM: arm64: Rename helpers for VHE vCPU load/putOliver Upton
The names for the helpers we expose to the 'generic' KVM code are a bit imprecise; we switch the EL0 + EL1 sysreg context and setup trap controls that do not need to change for every guest entry/exit. Rename + shuffle things around a bit in preparation for loading the stage-2 MMU context on vcpu_load(). Link: https://lore.kernel.org/r/20231018233212.2888027-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-12KVM: arm64: timers: Correctly handle TGE flip with CNTPOFF_EL2Marc Zyngier
Contrary to common belief, HCR_EL2.TGE has a direct and immediate effect on the way the EL0 physical counter is offset. Flipping TGE from 1 to 0 while at EL2 immediately changes the way the counter compared to the CVAL limit. This means that we cannot directly save/restore the guest's view of CVAL, but that we instead must treat it as if CNTPOFF didn't exist. Only in the world switch, once we figure out that we do have CNTPOFF, can we must the offset back and forth depending on the polarity of TGE. Fixes: 2b4825a86940 ("KVM: arm64: timers: Use CNTPOFF_EL2 to offset the physical timer") Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-10-09KVM: arm64: Add handler for MOPS exceptionsKristina Martsenko
An Armv8.8 FEAT_MOPS main or epilogue instruction will take an exception if executed on a CPU with a different MOPS implementation option (A or B) than the CPU where the preceding prologue instruction ran. In this case the OS exception handler is expected to reset the registers and restart execution from the prologue instruction. A KVM guest may use the instructions at EL1 at times when the guest is not able to handle the exception, expecting that the instructions will only run on one CPU (e.g. when running UEFI boot services in the guest). As KVM may reschedule the guest between different types of CPUs at any time (on an asymmetric system), it needs to also handle the resulting exception itself in case the guest is not able to. A similar situation will also occur in the future when live migrating a guest from one type of CPU to another. Add handling for the MOPS exception to KVM. The handling can be shared with the EL0 exception handler, as the logic and register layouts are the same. The exception can be handled right after exiting a guest, which avoids the cost of returning to the host exit handler. Similarly to the EL0 exception handler, in case the main or epilogue instruction is being single stepped, it makes sense to finish the step before executing the prologue instruction, so advance the single step state machine. Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230922112508.1774352-2-kristina.martsenko@arm.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-07-01Merge tag 'kvmarm-6.5' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.5 - Eager page splitting optimization for dirty logging, optionally allowing for a VM to avoid the cost of block splitting in the stage-2 fault path. - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with services that live in the Secure world. pKVM intervenes on FF-A calls to guarantee the host doesn't misuse memory donated to the hyp or a pKVM guest. - Support for running the split hypervisor with VHE enabled, known as 'hVHE' mode. This is extremely useful for testing the split hypervisor on VHE-only systems, and paves the way for new use cases that depend on having two TTBRs available at EL2. - Generalized framework for configurable ID registers from userspace. KVM/arm64 currently prevents arbitrary CPU feature set configuration from userspace, but the intent is to relax this limitation and allow userspace to select a feature set consistent with the CPU. - Enable the use of Branch Target Identification (FEAT_BTI) in the hypervisor. - Use a separate set of pointer authentication keys for the hypervisor when running in protected mode, as the host is untrusted at runtime. - Ensure timer IRQs are consistently released in the init failure paths. - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps (FEAT_EVT), as it is a register commonly read from userspace. - Erratum workaround for the upcoming AmpereOne part, which has broken hardware A/D state management. As a consequence of the hVHE series reworking the arm64 software features framework, the for-next/module-alloc branch from the arm64 tree comes along for the ride.
2023-06-12KVM: arm64: Rework CPTR_EL2 programming for HVHE configurationMarc Zyngier
Just like we repainted the early arm64 code, we need to update the CPTR_EL2 accesses that are taking place in the nVHE code when hVHE is used, making them look as if they were CPACR_EL1 accesses. Just like the VHE code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-14-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-04KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loadedReiji Watanabe
Currently, with VHE, KVM sets ER, CR, SW and EN bits of PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores the register value for the host on vcpu_load() and vcpu_put(). If the value of those bits are cleared on a pCPU with a vCPU loaded (armv8pmu_start() would do that when PMU counters are programmed for the guest), PMU access from the guest EL0 might be trapped to the guest EL1 directly regardless of the current PMUSERENR_EL0 value of the vCPU. Fix this by not letting armv8pmu_start() overwrite PMUSERENR_EL0 on the pCPU where PMUSERENR_EL0 for the guest is loaded, and instead updating the saved shadow register value for the host so that the value can be restored on vcpu_put() later. While vcpu_{put,load}() are manipulating PMUSERENR_EL0, disable IRQs to prevent a race condition between these processes and IPIs that attempt to update PMUSERENR_EL0 for the host EL0. Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Marc Zyngier <maz@kernel.org> Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230603025035.3781797-3-reijiw@google.com
2023-05-30KVM: arm64: Populate fault info for watchpointAkihiko Odaki
When handling ESR_ELx_EC_WATCHPT_LOW, far_el2 member of struct kvm_vcpu_fault_info will be copied to far member of struct kvm_debug_exit_arch and exposed to the userspace. The userspace will see stale values from older faults if the fault info does not get populated. Fixes: 8fb2046180a0 ("KVM: arm64: Move early handlers to per-EC handlers") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230530024651.10014-1-akihiko.odaki@daynix.com Cc: stable@vger.kernel.org
2023-04-14KVM: arm64: vhe: Drop extra isb() on guest exitMarc Zyngier
__kvm_vcpu_run_vhe() end on VHE with an isb(). However, this function is only reachable via kvm_call_hyp_ret(), which already contains an isb() in order to mimick the behaviour of nVHE and provide a context synchronisation event. We thus have two isb()s back to back, which is one too many. Drop the first one and solely rely on the one in the helper. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev>