Age | Commit message (Collapse) | Author |
|
Just like we have kvm_has_s1pie(), add its S1POE counterpart,
making the code slightly more readable.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-31-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
When the guest does not support S1PIE we should not allow any access
to the system registers it adds in order to ensure that we do not create
spurious issues with guest migration. Add a visibility operation for these
registers.
Fixes: 86f9de9db178 ("KVM: arm64: Save/restore PIE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240822-kvm-arm64-hide-pie-regs-v2-3-376624fa829c@kernel.org
[maz: simplify by using __el2_visibility(), kvm_has_s1pie() throughout]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-26-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
When the guest does not support FEAT_TCR2 we should not allow any access
to it in order to ensure that we do not create spurious issues with guest
migration. Add a visibility operation for it.
Fixes: fbff56068232 ("KVM: arm64: Save/restore TCR2_EL1")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240822-kvm-arm64-hide-pie-regs-v2-2-376624fa829c@kernel.org
[maz: simplify by using __el2_visibility(), kvm_has_tcr2() throughout]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-25-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Like their EL1 equivalent, the EL2-specific FEAT_S1PIE registers
are context-switched. This is made conditional on both FEAT_TCRX
and FEAT_S1PIE being adversised.
Note that this change only makes sense if read together with the
issue D22677 contained in 102105_K.a_04_en.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Like its EL1 equivalent, TCR2_EL2 gets context-switched.
This is made conditional on FEAT_TCRX being adversised.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Whenever we need to restore the guest's system registers to the CPU, we
now need to take care of the EL2 system registers as well. Most of them
are accessed via traps only, but some have an immediate effect and also
a guest running in VHE mode would expect them to be accessible via their
EL1 encoding, which we do not trap.
For vEL2 we write the virtual EL2 registers with an identical format directly
into their EL1 counterpart, and translate the few registers that have a
different format for the same effect on the execution when running a
non-VHE guest guest hypervisor.
Based on an initial patch from Andre Przywara, rewritten many times
since.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Don't mark pages/folios as accessed in the primary MMU when making a SPTE
young in KVM's secondary MMU, as doing so relies on
kvm_pfn_to_refcounted_page(), and generally speaking is unnecessary and
wasteful. KVM participates in page aging via mmu_notifiers, so there's no
need to push "accessed" updates to the primary MMU.
Dropping use of kvm_set_pfn_accessed() also paves the way for removing
kvm_pfn_to_refcounted_page() and all its users.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20241010182427.1434605-84-seanjc@google.com>
|
|
Pass through the SYSTEM_OFF2 function for hibernation, just like SYSTEM_OFF.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20241019172459.2241939-6-dwmw2@infradead.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Now that 'kvm_vgic_global_state' is no longer needed for ICC_CTLR_EL1
emulation on machines with a broken SEIS implementation, drop the
pKVM hypervisor mapping of the page.
Note that kvm_vgic_global_state is still mapped in non-protected
hypervisor configurations (i.e. {n,h}VHE) through the rodata section
mapping.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241022144016.27350-3-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
ICC_CTLR_EL1 accesses from a guest are trapped and emulated on systems
with broken SEIS support and without FEAT_GICv3_TDIR. On such systems,
we mask SEIS support in 'kvm_vgic_global_state.ich_vtr_el2' and so the
value of ICC_CTLR_EL1.SEIS visible to the guest is always zero.
Simplify the ICC_CTLR_EL1 read emulation to return 0 for the SEIS field,
rather than reading an always-zero value from the global state.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241022144016.27350-2-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Our idmap is becoming too big, to the point where it doesn't fit in
a 4kB page anymore.
There are some low-hanging fruits though, such as the el2_init_state
horror that is expanded 3 times in the kernel. Let's at least limit
ourselves to two copies, which makes the kernel link again.
At some point, we'll have to have a better way of doing this.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241009204903.GA3353168@thelio-3990X
|
|
When pKVM saves and restores the host floating point state on a SVE system,
it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL
for the PE is. But it uses a buffer allocated with kvm_host_sve_max_vl, the
maximum VL shared by all PEs in the system. This means that if we run on a
system where the maximum VLs are not consistent, we will overflow the buffer
on PEs which support larger VLs.
Since the host will not currently attempt to make use of non-shared VLs, fix
this by explicitly setting the EL2 VL to be the maximum shared VL when we
save and restore. This will enforce the limit on host VL usage. Should we
wish to support asymmetric VLs, this code will need to be updated along with
the required changes for the host:
https://lore.kernel.org/r/20240730-kvm-arm64-fix-pkvm-sve-vl-v6-0-cae8a2e0bd66@kernel.org
Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240912-kvm-arm64-limit-guest-vl-v2-1-dd2c29cb2ac9@kernel.org
[maz: added punctuation to the commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
On an error, hyp_vcpu will be accessed while this memory has already
been relinquished to the host and unmapped from the hypervisor. Protect
the CPTR assignment with an early return.
Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20240919110500.2345927-1-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Pull kvm updates from Paolo Bonzini:
"These are the non-x86 changes (mostly ARM, as is usually the case).
The generic and x86 changes will come later"
ARM:
- New Stage-2 page table dumper, reusing the main ptdump
infrastructure
- FP8 support
- Nested virtualization now supports the address translation
(FEAT_ATS1A) family of instructions
- Add selftest checks for a bunch of timer emulation corner cases
- Fix multiple cases where KVM/arm64 doesn't correctly handle the
guest trying to use a GICv3 that wasn't advertised
- Remove REG_HIDDEN_USER from the sysreg infrastructure, making
things little simpler
- Prevent MTE tags being restored by userspace if we are actively
logging writes, as that's a recipe for disaster
- Correct the refcount on a page that is not considered for MTE tag
copying (such as a device)
- When walking a page table to split block mappings, synchronize only
at the end the walk rather than on every store
- Fix boundary check when transfering memory using FFA
- Fix pKVM TLB invalidation, only affecting currently out of tree
code but worth addressing for peace of mind
LoongArch:
- Revert qspinlock to test-and-set simple lock on VM.
- Add Loongson Binary Translation extension support.
- Add PMU support for guest.
- Enable paravirt feature control from VMM.
- Implement function kvm_para_has_feature().
RISC-V:
- Fix sbiret init before forwarding to userspace
- Don't zero-out PMU snapshot area before freeing data
- Allow legacy PMU access from guest
- Fix to allow hpmcounter31 from the guest"
* tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (64 commits)
LoongArch: KVM: Implement function kvm_para_has_feature()
LoongArch: KVM: Enable paravirt feature control from VMM
LoongArch: KVM: Add PMU support for guest
KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier
KVM: arm64: Simplify visibility handling of AArch32 SPSR_*
KVM: arm64: Simplify handling of CNTKCTL_EL12
LoongArch: KVM: Add vm migration support for LBT registers
LoongArch: KVM: Add Binary Translation extension support
LoongArch: KVM: Add VM feature detection function
LoongArch: Revert qspinlock to test-and-set simple lock on VM
KVM: arm64: Register ptdump with debugfs on guest creation
arm64: ptdump: Don't override the level when operating on the stage-2 tables
arm64: ptdump: Use the ptdump description from a local context
arm64: ptdump: Expose the attribute parsing functionality
KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
KVM: arm64: Move pagetable definitions to common header
KVM: arm64: nv: Add support for FEAT_ATS1A
KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The highlights are support for Arm's "Permission Overlay Extension"
using memory protection keys, support for running as a protected guest
on Android as well as perf support for a bunch of new interconnect
PMUs.
Summary:
ACPI:
- Enable PMCG erratum workaround for HiSilicon HIP10 and 11
platforms.
- Ensure arm64-specific IORT header is covered by MAINTAINERS.
CPU Errata:
- Enable workaround for hardware access/dirty issue on Ampere-1A
cores.
Memory management:
- Define PHYSMEM_END to fix a crash in the amdgpu driver.
- Avoid tripping over invalid kernel mappings on the kexec() path.
- Userspace support for the Permission Overlay Extension (POE) using
protection keys.
Perf and PMUs:
- Add support for the "fixed instruction counter" extension in the
CPU PMU architecture.
- Extend and fix the event encodings for Apple's M1 CPU PMU.
- Allow LSM hooks to decide on SPE permissions for physical
profiling.
- Add support for the CMN S3 and NI-700 PMUs.
Confidential Computing:
- Add support for booting an arm64 kernel as a protected guest under
Android's "Protected KVM" (pKVM) hypervisor.
Selftests:
- Fix vector length issues in the SVE/SME sigreturn tests
- Fix build warning in the ptrace tests.
Timers:
- Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
non-determinism arising from the architected counter.
Miscellaneous:
- Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
don't succeed.
- Minor fixes and cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
perf: arm-ni: Fix an NULL vs IS_ERR() bug
arm64: hibernate: Fix warning for cast from restricted gfp_t
arm64: esr: Define ESR_ELx_EC_* constants as UL
arm64: pkeys: remove redundant WARN
perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
MAINTAINERS: List Arm interconnect PMUs as supported
perf: Add driver for Arm NI-700 interconnect PMU
dt-bindings/perf: Add Arm NI-700 PMU
perf/arm-cmn: Improve format attr printing
perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
arm64/mm: use lm_alias() with addresses passed to memblock_free()
mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
arm64: Expose the end of the linear map in PHYSMEM_END
arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
arm64/mm: Delete __init region from memblock.reserved
perf/arm-cmn: Support CMN S3
dt-bindings: perf: arm-cmn: Add CMN S3
perf/arm-cmn: Refactor DTC PMU register access
perf/arm-cmn: Make cycle counts less surprising
perf/arm-cmn: Improve build-time assertion
...
|
|
* kvm-arm64/s2-ptdump:
: .
: Stage-2 page table dumper, reusing the main ptdump infrastructure,
: courtesy of Sebastian Ene. From the cover letter:
:
: "This series extends the ptdump support to allow dumping the guest
: stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
: registers the new following files under debugfs:
: - /sys/debug/kvm/<guest_id>/stage2_page_tables
: - /sys/debug/kvm/<guest_id>/stage2_levels
: - /sys/debug/kvm/<guest_id>/ipa_range
:
: This allows userspace tools (eg. cat) to dump the stage-2 pagetables by
: reading the 'stage2_page_tables' file.
: [...]"
: .
KVM: arm64: Register ptdump with debugfs on guest creation
arm64: ptdump: Don't override the level when operating on the stage-2 tables
arm64: ptdump: Use the ptdump description from a local context
arm64: ptdump: Expose the attribute parsing functionality
KVM: arm64: Move pagetable definitions to common header
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/nv-at-pan:
: .
: Add NV support for the AT family of instructions, which mostly results
: in adding a page table walker that deals with most of the complexity
: of the architecture.
:
: From the cover letter:
:
: "Another task that a hypervisor supporting NV on arm64 has to deal with
: is to emulate the AT instruction, because we multiplex all the S1
: translations on a single set of registers, and the guest S2 is never
: truly resident on the CPU.
:
: So given that we lie about page tables, we also have to lie about
: translation instructions, hence the emulation. Things are made
: complicated by the fact that guest S1 page tables can be swapped out,
: and that our shadow S2 is likely to be incomplete. So while using AT
: to emulate AT is tempting (and useful), it is not going to always
: work, and we thus need a fallback in the shape of a SW S1 walker."
: .
KVM: arm64: nv: Add support for FEAT_ATS1A
KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
KVM: arm64: nv: Add SW walker for AT S1 emulation
KVM: arm64: nv: Make ps_to_output_size() generally available
KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}
KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}
KVM: arm64: nv: Add basic emulation of AT S1E1{R,W}P
KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}
KVM: arm64: nv: Honor absence of FEAT_PAN2
KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor
KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set
arm64: Add ESR_ELx_FSC_ADDRSZ_L() helper
arm64: Add system register encoding for PSTATE.PAN
arm64: Add PAR_EL1 field description
arm64: Add missing APTable and TCR_ELx.HPD masks
KVM: arm64: Make kvm_at() take an OP_AT_*
Signed-off-by: Marc Zyngier <maz@kernel.org>
# Conflicts:
# arch/arm64/kvm/nested.c
|
|
* kvm-arm64/vgic-sre-traps:
: .
: Fix the multiple of cases where KVM/arm64 doesn't correctly
: handle the guest trying to use a GICv3 that isn't advertised.
:
: From the cover letter:
:
: "It recently appeared that, when running on a GICv3-equipped platform
: (which is what non-ancient arm64 HW has), *not* configuring a GICv3
: for the guest could result in less than desirable outcomes.
:
: We have multiple issues to fix:
:
: - for registers that *always* trap (the SGI registers) or that *may*
: trap (the SRE register), we need to check whether a GICv3 has been
: instantiated before acting upon the trap.
:
: - for registers that only conditionally trap, we must actively trap
: them even in the absence of a GICv3 being instantiated, and handle
: those traps accordingly.
:
: - finally, ID registers must reflect the absence of a GICv3, so that
: we are consistent.
:
: This series goes through all these requirements. The main complexity
: here is to apply a GICv3 configuration on the host in the absence of a
: GICv3 in the guest. This is pretty hackish, but I don't have a much
: better solution so far.
:
: As part of making wider use of of the trap bits, we fully define the
: trap routing as per the architecture, something that we eventually
: need for NV anyway."
: .
KVM: arm64: selftests: Cope with lack of GICv3 in set_id_regs
KVM: arm64: Add selftest checking how the absence of GICv3 is handled
KVM: arm64: Unify UNDEF injection helpers
KVM: arm64: Make most GICv3 accesses UNDEF if they trap
KVM: arm64: Honor guest requested traps in GICv3 emulation
KVM: arm64: Add trap routing information for ICH_HCR_EL2
KVM: arm64: Add ICH_HCR_EL2 to the vcpu state
KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest
KVM: arm64: Add helper for last ditch idreg adjustments
KVM: arm64: Force GICv3 trap activation when no irqchip is configured on VHE
KVM: arm64: Force SRE traps when SRE access is not enabled
KVM: arm64: Move GICv3 trap configuration to kvm_calculate_traps()
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/fpmr:
: .
: Add FP8 support to the KVM/arm64 floating point handling.
:
: This includes new ID registers (ID_AA64PFR2_EL1 ID_AA64FPFR0_EL1)
: being made visible to guests, as well as a new confrol register
: (FPMR) which gets context-switched.
: .
KVM: arm64: Expose ID_AA64PFR2_EL1 to userspace and guests
KVM: arm64: Enable FP8 support when available and configured
KVM: arm64: Expose ID_AA64FPFR0_EL1 as a writable ID reg
KVM: arm64: Honor trap routing for FPMR
KVM: arm64: Add save/restore support for FPMR
KVM: arm64: Move FPMR into the sysreg array
KVM: arm64: Add predicate for FPMR support in a VM
KVM: arm64: Move SVCR into the sysreg array
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/mmu-misc-6.12:
: .
: Various minor MMU improvements and bug-fixes:
:
: - Prevent MTE tags being restored by userspace if we are actively
: logging writes, as that's a recipe for disaster
:
: - Correct the refcount on a page that is not considered for MTE
: tag copying (such as a device)
:
: - When walking a page table to split blocks, keep the DSB at the end
: the walk, as there is no need to perform it on every store.
:
: - Fix boundary check when transfering memory using FFA
: .
KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging
KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE
KVM: arm64: Move data barrier to end of split walk
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When we share memory through FF-A and the description of the buffers
exceeds the size of the mapped buffer, the fragmentation API is used.
The fragmentation API allows specifying chunks of descriptors in subsequent
FF-A fragment calls and no upper limit has been established for this.
The entire memory region transferred is identified by a handle which can be
used to reclaim the transferred memory.
To be able to reclaim the memory, the description of the buffers has to fit
in the ffa_desc_buf.
Add a bounds check on the FF-A sharing path to prevent the memory reclaim
from failing.
Also do_ffa_mem_xfer() does not need __always_inline, except for the
BUILD_BUG_ON() aspect, which gets moved to a macro.
[maz: fixed the BUILD_BUG_ON() breakage with LLVM, thanks to Wei-Lin Chang
for the timely report]
Fixes: 634d90cf0ac65 ("KVM: arm64: Handle FFA_MEM_LEND calls from the host")
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Ene <sebastianene@google.com>
Signed-off-by: Snehal Koukuntla <snehalreddy@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240909180154.3267939-1-snehalreddy@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In preparation for using the stage-2 definitions in ptdump, move some of
these macros in the common header.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Link: https://lore.kernel.org/r/20240909124721.1672199-2-sebastianene@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
FEAT_ATS1E1A introduces a new instruction: `at s1e1a`.
This is an address translation, without permission checks.
POE allows read permissions to be removed from S1 by the guest. This means
that an `at` instruction could fail, and not get the IPA.
Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1
permissions.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240822151113.1479789-10-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Define the new system registers that POE introduces and context switch them.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240822151113.1479789-8-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
To allow using newer instructions that current assemblers don't know about,
replace the `at` instruction with the underlying SYS instruction.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We don't expect to trap any GICv3 register for host handling,
apart from ICC_SRE_EL1 and the SGI registers. If they trap,
that's because the guest is playing with us despite being
told it doesn't have a GICv3.
If it does, UNDEF is what it will get.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240827152517.3909653-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
On platforms that require emulation of the CPU interface, we still
need to honor the traps requested by the guest (ICH_HCR_EL2 as well
as the FGTs for ICC_IGRPEN{0,1}_EL1.
Check for these bits early and lail out if any trap applies.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240827152517.3909653-9-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We so far only write the ICH_HCR_EL2 config in two situations:
- when we need to emulate the GICv3 CPU interface due to HW bugs
- when we do direct injection, as the virtual CPU interface needs
to be enabled
This is all good. But it also means that we don't do anything special
when we emulate a GICv2, or that there is no GIC at all.
What happens in this case when the guest uses the GICv3 system
registers? The *guest* gets a trap for a sysreg access (EC=0x18)
while we'd really like it to get an UNDEF.
Fixing this is a bit involved:
- we need to set all the required trap bits (TC, TALL0, TALL1, TDIR)
- for these traps to take effect, we need to (counter-intuitively)
set ICC_SRE_EL1.SRE to 1 so that the above traps take priority.
Note that doesn't fully work when GICv2 emulation is enabled, as
we cannot set ICC_SRE_EL1.SRE to 1 (it breaks Group0 delivery as
IRQ).
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240827152517.3909653-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/tlbi-fixes-6.12:
: .
: A couple of TLB invalidation fixes, only affecting pKVM
: out of tree, courtesy of Will Deacon.
: .
KVM: arm64: Ensure TLBI uses correct VMID after changing context
KVM: arm64: Invalidate EL1&0 TLB entries for all VMIDs in nvhe hyp init
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Just like the rest of the FP/SIMD state, FPMR needs to be context
switched.
The only interesting thing here is that we need to treat the pKVM
part a bit differently, as the host FP state is never written back
to the vcpu thread, but instead stored locally and eagerly restored.
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240820131802.3547589-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When the target context passed to enter_vmid_context() matches the
current running context, the function returns early without manipulating
the registers of the stage-2 MMU. This can result in a stale VMID due to
the lack of an ISB instruction in exit_vmid_context() after writing the
VTTBR when ARM64_WORKAROUND_SPECULATIVE_AT is not enabled.
For example, with pKVM enabled:
// Initially running in host context
enter_vmid_context(guest);
-> __load_stage2(guest); isb // Writes VTCR & VTTBR
exit_vmid_context(guest);
-> __load_stage2(host); // Restores VTCR & VTTBR
enter_vmid_context(host);
-> Returns early as we're already in host context
tlbi vmalls12e1is // !!! Can use the stale VMID as we
// haven't performed context
// synchronisation since restoring
// VTTBR.VMID
Add an unconditional ISB instruction to exit_vmid_context() after
restoring the VTTBR. This already existed for the
ARM64_WORKAROUND_SPECULATIVE_AT path, so we can simply hoist that onto
the common path.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Fuad Tabba <tabba@google.com>
Fixes: 58f3b0fc3b87 ("KVM: arm64: Support TLB invalidation in guest context")
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240814123429.20457-3-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When initialising the nVHE hypervisor, we invalidate potentially stale
TLB entries for the EL1&0 regime using a 'vmalls12e1' invalidation.
However, this invalidation operation applies only to the active VMID
and therefore we could proceed with stale TLB entries for other VMIDs.
Replace the operation with an 'alle1' which applies to all entries for
the EL1&0 regime, regardless of the VMID.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Fixes: 1025c8c0c6ac ("KVM: arm64: Wrap the host with a stage 2")
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240814123429.20457-2-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This DSB guarantees page table updates have been made visible to the
hardware table walker. Moving the DSB from stage2_split_walker() to
after the walk is finished in kvm_pgtable_stage2_split() results in a
roughly 70% reduction in Clear Dirty Log Time in
dirty_log_perf_test (modified to use eager page splitting) when using
huge pages. This gain holds steady through a range of vcpus
used (tested 1-64) and memory used (tested 1-64GB).
This is safe to do because nothing else is using the page tables while
they are still being mapped and this is how other page table walkers
already function. None of them have a data barrier in the walker
itself because relative ordering of table PTEs to table contents comes
from the release semantics of stage2_make_pte().
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240808174243.2836363-1-coltonlewis@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Tidy up some of the PAuth trapping code to clear up some comments
and avoid clang/checkpatch warnings. Also, don't bother setting
PAuth HCR_EL2 bits in pKVM, since it's handled by the hypervisor.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240722163311.1493879-1-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Add -Wno-override-init to the build flags for sys_regs.c,
handle_exit.c, and switch.c to fix warnings like the following:
arch/arm64/kvm/hyp/vhe/switch.c:271:43: warning: initialized field overwritten [-Woverride-init]
271 | [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32,
|
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240723101204.7356-2-sebott@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under
KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
LoongArch:
- Add paravirt steal time support
- Add support for KVM_DIRTY_LOG_INITIALLY_SET
- Add perf kvm-stat support for loongarch
RISC-V:
- Redirect AMO load/store access fault traps to guest
- perf kvm stat support
- Use guest files for IMSIC virtualization, when available
s390:
- Assortment of tiny fixes which are not time critical
x86:
- Fixes for Xen emulation
- Add a global struct to consolidate tracking of host values, e.g.
EFER
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
effective APIC bus frequency, because TDX
- Print the name of the APICv/AVIC inhibits in the relevant
tracepoint
- Clean up KVM's handling of vendor specific emulation to
consistently act on "compatible with Intel/AMD", versus checking
for a specific vendor
- Drop MTRR virtualization, and instead always honor guest PAT on
CPUs that support self-snoop
- Update to the newfangled Intel CPU FMS infrastructure
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
it reads '0' and writes from userspace are ignored
- Misc cleanups
x86 - MMU:
- Small cleanups, renames and refactoring extracted from the upcoming
Intel TDX support
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
that can't hold leafs SPTEs
- Unconditionally drop mmu_lock when allocating TDP MMU page tables
for eager page splitting, to avoid stalling vCPUs when splitting
huge pages
- Bug the VM instead of simply warning if KVM tries to split a SPTE
that is non-present or not-huge. KVM is guaranteed to end up in a
broken state because the callers fully expect a valid SPTE, it's
all but dangerous to let more MMU changes happen afterwards
x86 - AMD:
- Make per-CPU save_area allocations NUMA-aware
- Force sev_es_host_save_area() to be inlined to avoid calling into
an instrumentable function from noinstr code
- Base support for running SEV-SNP guests. API-wise, this includes a
new KVM_X86_SNP_VM type, encrypting/measure the initial image into
guest memory, and finalizing it before launching it. Internally,
there are some gmem/mmu hooks needed to prepare gmem-allocated
pages before mapping them into guest private memory ranges
This includes basic support for attestation guest requests, enough
to say that KVM supports the GHCB 2.0 specification
There is no support yet for loading into the firmware those signing
keys to be used for attestation requests, and therefore no need yet
for the host to provide certificate data for those keys.
To support fetching certificate data from userspace, a new KVM exit
type will be needed to handle fetching the certificate from
userspace.
An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
exit type to handle this was introduced in v1 of this patchset, but
is still being discussed by community, so for now this patchset
only implements a stub version of SNP Extended Guest Requests that
does not provide certificate data
x86 - Intel:
- Remove an unnecessary EPT TLB flush when enabling hardware
- Fix a series of bugs that cause KVM to fail to detect nested
pending posted interrupts as valid wake eents for a vCPU executing
HLT in L2 (with HLT-exiting disable by L1)
- KVM: x86: Suppress MMIO that is triggered during task switch
emulation
Explicitly suppress userspace emulated MMIO exits that are
triggered when emulating a task switch as KVM doesn't support
userspace MMIO during complex (multi-step) emulation
Silently ignoring the exit request can result in the
WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
for some other reason prior to purging mmio_needed
See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write
exits if emulator detects exception") for more details on KVM's
limitations with respect to emulated MMIO during complex emulator
flows
Generic:
- Rename the AS_UNMOVABLE flag that was introduced for KVM to
AS_INACCESSIBLE, because the special casing needed by these pages
is not due to just unmovability (and in fact they are only
unmovable because the CPU cannot access them)
- New ioctl to populate the KVM page tables in advance, which is
useful to mitigate KVM page faults during guest boot or after live
migration. The code will also be used by TDX, but (probably) not
through the ioctl
- Enable halt poll shrinking by default, as Intel found it to be a
clear win
- Setup empty IRQ routing when creating a VM to avoid having to
synchronize SRCU when creating a split IRQCHIP on x86
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
a flag that arch code can use for hooking both sched_in() and
sched_out()
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace
detect bugs
- Mark a vCPU as preempted if and only if it's scheduled out while in
the KVM_RUN loop, e.g. to avoid marking it preempted and thus
writing guest memory when retrieving guest state during live
migration blackout
Selftests:
- Remove dead code in the memslot modification stress test
- Treat "branch instructions retired" as supported on all AMD Family
17h+ CPUs
- Print the guest pseudo-RNG seed only when it changes, to avoid
spamming the log for tests that create lots of VMs
- Make the PMU counters test less flaky when counting LLC cache
misses by doing CLFLUSH{OPT} in every loop iteration"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
crypto: ccp: Add the SNP_VLEK_LOAD command
KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
KVM: x86: Replace static_call_cond() with static_call()
KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
x86/sev: Move sev_guest.h into common SEV header
KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
KVM: x86: Suppress MMIO that is triggered during task switch emulation
KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
KVM: Document KVM_PRE_FAULT_MEMORY ioctl
mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
perf kvm: Add kvm-stat for loongarch64
LoongArch: KVM: Add PV steal time support in guest side
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.11
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The biggest part is the virtual CPU hotplug that touches ACPI,
irqchip. We also have some GICv3 optimisation for pseudo-NMIs that has
been queued via the arm64 tree. Otherwise the usual perf updates,
kselftest, various small cleanups.
Core:
- Virtual CPU hotplug support for arm64 ACPI systems
- cpufeature infrastructure cleanups and making the FEAT_ECBHB ID
bits visible to guests
- CPU errata: expand the speculative SSBS workaround to more CPUs
- GICv3, use compile-time PMR values: optimise the way regular IRQs
are masked/unmasked when GICv3 pseudo-NMIs are used, removing the
need for a static key in fast paths by using a priority value
chosen dynamically at boot time
ACPI:
- 'acpi=nospcr' option to disable SPCR as default console for arm64
- Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/
Perf updates:
- Rework of the IMX PMU driver to enable support for I.MX95
- Enable support for tertiary match groups in the CMN PMU driver
- Initial refactoring of the CPU PMU code to prepare for the fixed
instruction counter introduced by Arm v9.4
- Add missing PMU driver MODULE_DESCRIPTION() strings
- Hook up DT compatibles for recent CPU PMUs
Kselftest updates:
- Kernel mode NEON fp-stress
- Cleanups, spelling mistakes
Miscellaneous:
- arm64 Documentation update with a minor clarification on TBI
- Fix missing IPI statistics
- Implement raw_smp_processor_id() using thread_info rather than a
per-CPU variable (better code generation)
- Make MTE checking of in-kernel asynchronous tag faults conditional
on KASAN being enabled
- Minor cleanups, typos"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (69 commits)
selftests: arm64: tags: remove the result script
selftests: arm64: tags_test: conform test to TAP output
perf: add missing MODULE_DESCRIPTION() macros
arm64: smp: Fix missing IPI statistics
irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI
ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
Documentation: arm64: Update memory.rst for TBI
arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
perf: arm_v6/7_pmu: Drop non-DT probe support
perf/arm: Move 32-bit PMU drivers to drivers/perf/
perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU
perf: imx_perf: add support for i.MX95 platform
perf: imx_perf: fix counter start and config sequence
perf: imx_perf: refactor driver for imx93
perf: imx_perf: let the driver manage the counter usage rather the user
perf: imx_perf: add macro definitions for parsing config attr
...
|
|
* kvm-arm64/nv-tcr2:
: Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier
:
: Series addresses a couple gaps that are present in KVM (from cover
: letter):
:
: - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly
: save/restore stuff.
:
: - trap bit description and routing: none, obviously, since we make a
: point in not trapping.
KVM: arm64: Honor trap routing for TCR2_EL1
KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
KVM: arm64: Get rid of HCRX_GUEST_FLAGS
KVM: arm64: Correctly honor the presence of FEAT_TCRX
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/nv-sve:
: CPTR_EL2, FPSIMD/SVE support for nested
:
: This series brings support for honoring the guest hypervisor's CPTR_EL2
: trap configuration when running a nested guest, along with support for
: FPSIMD/SVE usage at L1 and L2.
KVM: arm64: Allow the use of SVE+NV
KVM: arm64: nv: Add additional trap setup for CPTR_EL2
KVM: arm64: nv: Add trap description for CPTR_EL2
KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
KVM: arm64: nv: Handle CPACR_EL1 traps
KVM: arm64: Spin off helper for programming CPTR traps
KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
KVM: arm64: nv: Handle ZCR_EL2 traps
KVM: arm64: nv: Forward SVE traps to guest hypervisor
KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/el2-kcfi:
: kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi
:
: Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect
: branches in the EL2 hypervisor. Unlike kernel support for the feature,
: CFI failures at EL2 are always fatal.
KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
KVM: arm64: Introduce print_nvhe_hyp_panic helper
arm64: Introduce esr_brk_comment, esr_is_cfi_brk
KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
KVM: arm64: nVHE: Simplify invalid_host_el2_vect
KVM: arm64: Fix __pkvm_init_switch_pgd call ABI
KVM: arm64: Fix clobbered ELR in sync abort/SError
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/shadow-mmu:
: Shadow stage-2 MMU support for NV, courtesy of Marc Zyngier
:
: Initial implementation of shadow stage-2 page tables to support a guest
: hypervisor. In the author's words:
:
: So here's the 10000m (approximately 30000ft for those of you stuck
: with the wrong units) view of what this is doing:
:
: - for each {VMID,VTTBR,VTCR} tuple the guest uses, we use a
: separate shadow s2_mmu context. This context has its own "real"
: VMID and a set of page tables that are the combination of the
: guest's S2 and the host S2, built dynamically one fault at a time.
:
: - these shadow S2 contexts are ephemeral, and behave exactly as
: TLBs. For all intent and purposes, they *are* TLBs, and we discard
: them pretty often.
:
: - TLB invalidation takes three possible paths:
:
: * either this is an EL2 S1 invalidation, and we directly emulate
: it as early as possible
:
: * or this is an EL1 S1 invalidation, and we need to apply it to
: the shadow S2s (plural!) that match the VMID set by the L1 guest
:
: * or finally, this is affecting S2, and we need to teardown the
: corresponding part of the shadow S2s, which invalidates the TLBs
KVM: arm64: nv: Truely enable nXS TLBI operations
KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations
KVM: arm64: nv: Add handling of range-based TLBI operations
KVM: arm64: nv: Add handling of outer-shareable TLBI operations
KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level
KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations
KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations
KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations
KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations
KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1
KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation
KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives
KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
KVM: arm64: nv: Handle shadow stage 2 page faults
KVM: arm64: nv: Implement nested Stage-2 page table walk logic
KVM: arm64: nv: Support multiple nested Stage-2 mmu structures
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and
ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields
from ID_AA64PFR0_EL1 sysreg definition.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240613102710.3295108-2-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX
must be implemented as well.
Take advantage of this to avoid checking for S1PIE when TCRX isn't
implemented.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
As for other registers, save/restore of TCR2_EL1 should be gated
on the feature being actually present.
In the case of a nVHE hypervisor, it is perfectly fine to leave
the host value in the register, as HCRX_EL2.TCREn==0 imposes that
TCR2_EL1 is treated as 0.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
We need to teach KVM a couple of new tricks. CPTR_EL2 and its
VHE accessor CPACR_EL1 need to be handled specially:
- CPACR_EL1 is trapped on VHE so that we can track the TCPAC
and TTA bits
- CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-15-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Start folding the guest hypervisor's FP/SVE traps into the value
programmed in hardware. Note that as of writing this is dead code, since
KVM does a full put() / load() for every nested exception boundary which
saves + flushes the FP/SVE state.
However, this will become useful when we can keep the guest's FP/SVE
state alive across a nested exception boundary and the host no longer
needs to conservatively program traps.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-12-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path
when the guest hypervisor tries to access SVE state.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-11-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Handle CPACR_EL1 accesses when running a VHE guest. In order to
limit the cost of the emulation, implement it ass a shallow exit.
In the other cases:
- this is a nVHE L1 which will write to memory, and we don't trap
- this is a L2 guest:
* the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register
access
* the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the
handling is defered to the general handling for forwarding
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-10-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
A subsequent change to KVM will add preliminary support for merging a
guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off
a new helper for managing CPTR traps.
Avoid reading CPACR_EL1 for the baseline trap config, and start off with
the most restrictive set of traps that is subsequently relaxed.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-9-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|