Age | Commit message (Collapse) | Author |
|
Exit to userspace for TDG.VP.VMCALL<GetTdVmCallInfo> via KVM_EXIT_TDX,
to allow userspace to provide information about the support of
TDVMCALLs when r12 is 1 for the TDVMCALLs beyond the GHCI base API.
GHCI spec defines the GHCI base TDVMCALLs: <GetTdVmCallInfo>, <MapGPA>,
<ReportFatalError>, <Instruction.CPUID>, <#VE.RequestMMIO>,
<Instruction.HLT>, <Instruction.IO>, <Instruction.RDMSR> and
<Instruction.WRMSR>. They must be supported by VMM to support TDX guests.
For GetTdVmCallInfo
- When leaf (r12) to enumerate TDVMCALL functionality is set to 0,
successful execution indicates all GHCI base TDVMCALLs listed above are
supported.
Update the KVM TDX document with the set of the GHCI base APIs.
- When leaf (r12) to enumerate TDVMCALL functionality is set to 1, it
indicates the TDX guest is querying the supported TDVMCALLs beyond
the GHCI base TDVMCALLs.
Exit to userspace to let userspace set the TDVMCALL sub-function bit(s)
accordingly to the leaf outputs. KVM could set the TDVMCALL bit(s)
supported by itself when the TDVMCALLs don't need support from userspace
after returning from userspace and before entering guest. Currently, no
such TDVMCALLs implemented, KVM just sets the values returned from
userspace.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
[Adjust userspace API. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Handle TDVMCALL for GetQuote to generate a TD-Quote.
GetQuote is a doorbell-like interface used by TDX guests to request VMM
to generate a TD-Quote signed by a service hosting TD-Quoting Enclave
operating on the host. A TDX guest passes a TD Report (TDREPORT_STRUCT) in
a shared-memory area as parameter. Host VMM can access it and queue the
operation for a service hosting TD-Quoting enclave. When completed, the
Quote is returned via the same shared-memory area.
KVM only checks the GPA from the TDX guest has the shared-bit set and drops
the shared-bit before exiting to userspace to avoid bleeding the shared-bit
into KVM's exit ABI. KVM forwards the request to userspace VMM (e.g. QEMU)
and userspace VMM queues the operation asynchronously. KVM sets the return
code according to the 'ret' field set by userspace to notify the TDX guest
whether the request has been queued successfully or not. When the request
has been queued successfully, the TDX guest can poll the status field in
the shared-memory area to check whether the Quote generation is completed
or not. When completed, the generated Quote is returned via the same
buffer.
Add KVM_EXIT_TDX as a new exit reason to userspace. Userspace is
required to handle the KVM exit reason as the initial support for TDX,
by reentering KVM to ensure that the TDVMCALL is complete. While at it,
add a note that KVM_EXIT_HYPERCALL also requires reentry with KVM_RUN.
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Mikko Ylinen <mikko.ylinen@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
[Adjust userspace API. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML updates from Johannes Berg:
"The only really new thing is the long-standing seccomp work
(originally from 2021!). Wven if it still isn't enabled by default due
to security concerns it can still be used e.g. for tests.
- remove obsolete network transports
- remove PCI IO port support
- start adding seccomp-based process handling instead of ptrace"
* tag 'uml-for-linux-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (29 commits)
um: remove "extern" from implementation of sigchld_handler
um: fix unused variable warning
um: fix SECCOMP 32bit xstate register restore
um: pass FD for memory operations when needed
um: Add SECCOMP support detection and initialization
um: Implement kernel side of SECCOMP based process handling
um: Track userspace children dying in SECCOMP mode
um: Add helper functions to get/set state for SECCOMP
um: Add stub side of SECCOMP/futex based process handling
um: Move faultinfo extraction into userspace routine
um: vector: Use mac_pton() for MAC address parsing
um: vector: Clean up and modernize log messages
um: chan_kern: use raw spinlock for irqs_to_free_lock
MAINTAINERS: remove obsolete file entry in TUN/TAP DRIVER
um: Fix tgkill compile error on old host OSes
um: stop using PCI port I/O
um: Remove legacy network transport infrastructure
um: vector: Eliminate the dependency on uml_net
um: Remove obsolete legacy network transports
um/asm: Replace "REP; NOP" with PAUSE mnemonic
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel)
- Fixes for Hyper-V UIO driver (Long Li)
- Fixes for Hyper-V PCI driver (Michael Kelley)
- Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley)
- Documentation updates for Hyper-V VMBus (Michael Kelley)
- Enhance logging for hv_kvp_daemon (Shradha Gupta)
* tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits)
Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests
Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir
Documentation: hyperv: Update VMBus doc with new features and info
PCI: hv: Remove unnecessary flex array in struct pci_packet
Drivers: hv: Remove hv_alloc/free_* helpers
Drivers: hv: Use kzalloc for panic page allocation
uio_hv_generic: Align ring size to system page
uio_hv_generic: Use correct size for interrupt and monitor pages
Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary
arch/x86: Provide the CPU number in the wakeup AP callback
x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap()
PCI: hv: Get vPCI MSI IRQ domain from DeviceTree
ACPI: irq: Introduce acpi_get_gsi_dispatcher()
Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device()
Drivers: hv: vmbus: Get the IRQ number from DeviceTree
dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties
arm64, x86: hyperv: Report the VTL the system boots in
arm64: hyperv: Initialize the Virtual Trust Level field
Drivers: hv: Provide arch-neutral implementation of get_vtl()
Drivers: hv: Enable VTL mode for arm64
...
|
|
KVM SVM changes for 6.16:
- Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to
fix a race between AP destroy and VMRUN.
- Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM.
- Add support for ALLOWED_SEV_FEATURES.
- Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y.
- Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features
that utilize those bits.
- Don't account temporary allocations in sev_send_update_data().
- Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.
|
|
KVM/riscv changes for 6.16
- Add vector registers to get-reg-list selftest
- VCPU reset related improvements
- Remove scounteren initialization from VCPU reset
- Support VCPU reset from userspace using set_mpstate() ioctl
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.16
* New features:
- Add large stage-2 mapping support for non-protected pKVM guests,
clawing back some performance.
- Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
protected modes.
- Enable nested virtualisation support on systems that support it
(yes, it has been a long time coming), though it is disabled by
default.
* Improvements, fixes and cleanups:
- Large rework of the way KVM tracks architecture features and links
them with the effects of control bits. This ensures correctness of
emulation (the data is automatically extracted from the published
JSON files), and helps dealing with the evolution of the
architecture.
- Significant changes to the way pKVM tracks ownership of pages,
avoiding page table walks by storing the state in the hypervisor's
vmemmap. This in turn enables the THP support described above.
- New selftest checking the pKVM ownership transition rules
- Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
even if the host didn't have it.
- Fixes for the address translation emulation, which happened to be
rather buggy in some specific contexts.
- Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
from the number of counters exposed to a guest and addressing a
number of issues in the process.
- Add a new selftest for the SVE host state being corrupted by a
guest.
- Keep HCR_EL2.xMO set at all times for systems running with the
kernel at EL2, ensuring that the window for interrupts is slightly
bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
- Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
from a pretty bad case of TLB corruption unless accesses to HCR_EL2
are heavily synchronised.
- Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
tables in a human-friendly fashion.
- and the usual random cleanups.
|
|
Replace it with just the URL.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Starting in the 6.15 kernel, VMBus interrupts are automatically
assigned away from a CPU that is being taken offline. Add documentation
describing this case.
Also add details of Hyper-V behavior when the primary channel of
a VMBus device is closed as the result of unbinding the device's
driver. This behavior has not changed, but it was not previously
documented.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250520044435.7734-1-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250520044435.7734-1-mhklinux@outlook.com>
|
|
* kvm-arm64/nv-nv:
: .
: Flick the switch on the NV support by adding the missing piece
: in the form of the VNCR page management. From the cover letter:
:
: "This is probably the most interesting bit of the whole NV adventure.
: So far, everything else has been a walk in the park, but this one is
: where the real fun takes place.
:
: With FEAT_NV2, most of the NV support revolves around tricking a guest
: into accessing memory while it tries to access system registers. The
: hypervisor's job is to handle the context switch of the actual
: registers with the state in memory as needed."
: .
KVM: arm64: nv: Release faulted-in VNCR page from mmu_lock critical section
KVM: arm64: nv: Handle TLBI S1E2 for VNCR invalidation with mmu_lock held
KVM: arm64: nv: Hold mmu_lock when invalidating VNCR SW-TLB before translating
KVM: arm64: Document NV caps and vcpu flags
KVM: arm64: Allow userspace to request KVM_ARM_VCPU_EL2*
KVM: arm64: nv: Remove dead code from ERET handling
KVM: arm64: nv: Plumb TLBI S1E2 into system instruction dispatch
KVM: arm64: nv: Add S1 TLB invalidation primitive for VNCR_EL2
KVM: arm64: nv: Program host's VNCR_EL2 to the fixmap address
KVM: arm64: nv: Handle VNCR_EL2 invalidation from MMU notifiers
KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2
KVM: arm64: nv: Handle VNCR_EL2-triggered faults
KVM: arm64: nv: Add userspace and guest handling of VNCR_EL2
KVM: arm64: nv: Add pseudo-TLB backing VNCR_EL2
KVM: arm64: nv: Don't adjust PSTATE.M when L2 is nesting
KVM: arm64: nv: Move TLBI range decoding to a helper
KVM: arm64: nv: Snapshot S1 ASID tagging information during walk
KVM: arm64: nv: Extract translation helper from the AT code
KVM: arm64: nv: Allocate VNCR page when required
arm64: sysreg: Add layout for VNCR_EL2
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add a toggleable VM capability to reset the VCPU from userspace by
setting MP_STATE_INIT_RECEIVED through IOCTL.
Reset through a mp_state to avoid adding a new IOCTL.
Do not reset on a transition from STOPPED to RUNNABLE, because it's
better to avoid side effects that would complicate userspace adoption.
The MP_STATE_INIT_RECEIVED is not a permanent mp_state -- IOCTL resets
the VCPU while preserving the original mp_state -- because we wouldn't
gain much from having a new state it in the rest of KVM, but it's a very
non-standard use of the IOCTL.
Signed-off-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250515143723.2450630-5-rkrcmar@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs with Bus Lock
Threshold, which is close enough to VMX's Bus Lock Detection VM-Exit to
allow reusing KVM_CAP_X86_BUS_LOCK_EXIT.
The biggest difference between the two features is that Threshold is
fault-like, whereas Detection is trap-like. To allow the guest to make
forward progress, Threshold provides a per-VMCB counter which is
decremented every time a bus lock occurs, and a VM-Exit is triggered if
and only if the counter is '0'.
To provide Detection-like semantics, initialize the counter to '0', i.e.
exit on every bus lock, and when re-executing the guilty instruction, set
the counter to '1' to effectively step past the instruction.
Note, in the unlikely scenario that re-executing the instruction doesn't
trigger a bus lock, e.g. because the guest has changed memory types or
patched the guilty instruction, the bus lock counter will be left at '1',
i.e. the guest will be able to do a bus lock on a different instruction.
In a perfect world, KVM would ensure the counter is '0' if the guest has
made forward progress, e.g. if RIP has changed. But trying to close that
hole would incur non-trivial complexity, for marginal benefit; the intent
of KVM_CAP_X86_BUS_LOCK_EXIT is to allow userspace rate-limit bus locks,
not to allow for precise detection of problematic guest code. And, it's
simply not feasible to fully close the hole, e.g. if an interrupt arrives
before the original instruction can re-execute, the guest could step past
a different bus lock.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Manali Shukla <manali.shukla@amd.com>
Link: https://lore.kernel.org/r/20250502050346.14274-5-manali.shukla@amd.com
[sean: fix typo in comment]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Describe the two new vcpu flags that control NV, together with
the capabilities that advertise them.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20250514103501.2225951-18-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
All legacy network transports have been removed. Vector transports
provide the same capabilities with significantly higher network
throughput. There is no reason to keep the legacy network transport
infrastructure anymore. Remove it to reduce the maintenance burden.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://patch.msgid.link/20250503051710.3286595-4-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
These legacy network transports were marked as obsolete in commit
40814b98a570 ("um: Mark non-vector net transports as obsolete").
More than five years have passed since then. Remove these network
transports to reduce the maintenance burden.
Suggested-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://patch.msgid.link/20250503051710.3286595-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
* kvm-arm64/nv-pmu-fixes:
: .
: Fixes for NV PMU emulation. From the cover letter:
:
: "Joey reports that some of his PMU tests do not behave quite as
: expected:
:
: - MDCR_EL2.HPMN is set to 0 out of reset
:
: - PMCR_EL0.P should reset all the counters when written from EL2
:
: Oliver points out that setting PMCR_EL0.N from userspace by writing to
: the register is silly with NV, and that we need a new PMU attribute
: instead.
:
: On top of that, I figured out that we had a number of little gotchas:
:
: - It is possible for a guest to write an HPMN value that is out of
: bound, and it seems valuable to limit it
:
: - PMCR_EL0.N should be the maximum number of counters when read from
: EL2, and MDCR_EL2.HPMN when read from EL0/EL1
:
: - Prevent userspace from updating PMCR_EL0.N when EL2 is available"
: .
KVM: arm64: Let kvm_vcpu_read_pmcr() return an EL-dependent value for PMCR_EL0.N
KVM: arm64: Handle out-of-bound write to MDCR_EL2.HPMN
KVM: arm64: Don't let userspace write to PMCR_EL0.N when the vcpu has EL2
KVM: arm64: Allow userspace to limit the number of PMU counters for EL2 VMs
KVM: arm64: Contextualise the handling of PMCR_EL0.P writes
KVM: arm64: Fix MDCR_EL2.HPMN reset value
KVM: arm64: Repaint pmcr_n into nr_pmu_counters
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
As long as we had purely EL1 VMs, we could easily update the number
of guest-visible counters by letting userspace write to PMCR_EL0.N.
With VMs started at EL2, PMCR_EL1.N only reflects MDCR_EL2.HPMN,
and we don't have a good way to limit it.
For this purpose, introduce a new PMUv3 attribute that allows
limiting the maximum number of counters. This requires the explicit
selection of a PMU.
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This large commit contains the initial support for TDX in KVM. All x86
parts enable the host-side hypercalls that KVM uses to talk to the TDX
module, a software component that runs in a special CPU mode called SEAM
(Secure Arbitration Mode).
The series is in turn split into multiple sub-series, each with a separate
merge commit:
- Initialization: basic setup for using the TDX module from KVM, plus
ioctls to create TDX VMs and vCPUs.
- MMU: in TDX, private and shared halves of the address space are mapped by
different EPT roots, and the private half is managed by the TDX module.
Using the support that was added to the generic MMU code in 6.14,
add support for TDX's secure page tables to the Intel side of KVM.
Generic KVM code takes care of maintaining a mirror of the secure page
tables so that they can be queried efficiently, and ensuring that changes
are applied to both the mirror and the secure EPT.
- vCPU enter/exit: implement the callbacks that handle the entry of a TDX
vCPU (via the SEAMCALL TDH.VP.ENTER) and the corresponding save/restore
of host state.
- Userspace exits: introduce support for guest TDVMCALLs that KVM forwards to
userspace. These correspond to the usual KVM_EXIT_* "heavyweight vmexits"
but are triggered through a different mechanism, similar to VMGEXIT for
SEV-ES and SEV-SNP.
- Interrupt handling: support for virtual interrupt injection as well as
handling VM-Exits that are caused by vectored events. Exclusive to
TDX are machine-check SMIs, which the kernel already knows how to
handle through the kernel machine check handler (commit 7911f145de5f,
"x86/mce: Implement recovery for errors in TDX/SEAM non-root mode")
- Loose ends: handling of the remaining exits from the TDX module, including
EPT violation/misconfig and several TDVMCALL leaves that are handled in
the kernel (CPUID, HLT, RDMSR/WRMSR, GetTdVmCallInfo); plus returning
an error or ignoring operations that are not supported by TDX guests
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Trap and emulate virtualization is not available anymore for MIPS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Categorize the capabilities correctly. Section 6 is for enabled vCPU
capabilities; section 7 is for enabled VM capabilities; section 8 is
for informational ones.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Ensure that they have a ":" in front of the defined item.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
It is redundant, and sometimes wrong.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The capability is incorrectly called KVM_CAP_PPC_MULTITCE in the documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
TSC_DEADLINE is now advertised unconditionally by KVM_GET_SUPPORTED_CPUID,
since commit 9be4ec35d668 ("KVM: x86: Advertise TSC_DEADLINE_TIMER in
KVM_GET_SUPPORTED_CPUID", 2024-12-18). Adjust the documentation to
reflect the new behavior.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250331150550.510320-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.15
- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM
- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace
- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
- Paravirtual interface for discovering the set of CPU implementations
where a VM may run, addressing a longstanding issue of guest CPU
errata awareness in big-little systems and cross-implementation VM
migration
- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation
- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat
- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events
|
|
* kvm-arm64/writable-midr:
: Writable implementation ID registers, courtesy of Sebastian Ott
:
: Introduce a new capability that allows userspace to set the
: ID registers that identify a CPU implementation: MIDR_EL1, REVIDR_EL1,
: and AIDR_EL1. Also plug a hole in KVM's trap configuration where
: SMIDR_EL1 was readable at EL1, despite the fact that KVM does not
: support SME.
KVM: arm64: Fix documentation for KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable
KVM: arm64: Copy guest CTR_EL0 into hyp VM
KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR
KVM: arm64: Allow userspace to change the implementation ID registers
KVM: arm64: Load VPIDR_EL2 with the VM's MIDR_EL1 value
KVM: arm64: Maintain per-VM copy of implementation ID regs
KVM: arm64: Set HCR_EL2.TID1 unconditionally
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/pv-cpuid:
: Paravirtualized implementation ID, courtesy of Shameer Kolothum
:
: Big-little has historically been a pain in the ass to virtualize. The
: implementation ID (MIDR, REVIDR, AIDR) of a vCPU can change at the whim
: of vCPU scheduling. This can be particularly annoying when the guest
: needs to know the underlying implementation to mitigate errata.
:
: "Hyperscalers" face a similar scheduling problem, where VMs may freely
: migrate between hosts in a pool of heterogenous hardware. And yes, our
: server-class friends are equally riddled with errata too.
:
: In absence of an architected solution to this wart on the ecosystem,
: introduce support for paravirtualizing the implementation exposed
: to a VM, allowing the VMM to describe the pool of implementations that a
: VM may be exposed to due to scheduling/migration.
:
: Userspace is expected to intercept and handle these hypercalls using the
: SMCCC filter UAPI, should it choose to do so.
smccc: kvm_guest: Fix kernel builds for 32 bit arm
KVM: selftests: Add test for KVM_REG_ARM_VENDOR_HYP_BMAP_2
smccc/kvm_guest: Enable errata based on implementation CPUs
arm64: Make _midr_in_range_list() an exported function
KVM: arm64: Introduce KVM_REG_ARM_VENDOR_HYP_BMAP_2
KVM: arm64: Specify hypercall ABI for retrieving target implementations
arm64: Modify _midr_range() functions to read MIDR/REVIDR internally
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/nv-vgic:
: NV VGICv3 support, courtesy of Marc Zyngier
:
: Support for emulating the GIC hypervisor controls and managing shadow
: VGICv3 state for the L1 hypervisor. As part of it, bring in support for
: taking IRQs to the L1 and UAPI to manage the VGIC maintenance interrupt.
KVM: arm64: nv: Fail KVM init if asking for NV without GICv3
KVM: arm64: nv: Allow userland to set VGIC maintenance IRQ
KVM: arm64: nv: Fold GICv3 host trapping requirements into guest setup
KVM: arm64: nv: Propagate used_lrs between L1 and L0 contexts
KVM: arm64: nv: Request vPE doorbell upon nested ERET to L2
KVM: arm64: nv: Respect virtual HCR_EL2.TWx setting
KVM: arm64: nv: Add Maintenance Interrupt emulation
KVM: arm64: nv: Handle L2->L1 transition on interrupt injection
KVM: arm64: nv: Nested GICv3 emulation
KVM: arm64: nv: Sanitise ICH_HCR_EL2 accesses
KVM: arm64: nv: Plumb handling of GICv3 EL2 accesses
KVM: arm64: nv: Add ICH_*_EL2 registers to vpcu_sysreg
KVM: arm64: nv: Load timer before the GIC
arm64: sysreg: Add layout for ICH_MISR_EL2
arm64: sysreg: Add layout for ICH_VTR_EL2
arm64: sysreg: Add layout for ICH_HCR_EL2
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
KVM Xen changes for 6.15
- Don't write to the Xen hypercall page on MSR writes that are initiated by
the host (userspace or KVM) to fix a class of bugs where KVM can write to
guest memory at unexpected times, e.g. during vCPU creation if userspace has
set the Xen hypercall MSR index to collide with an MSR that KVM emulates.
- Restrict the Xen hypercall MSR indx to the unofficial synthetic range to
reduce the set of possible collisions with MSRs that are emulated by KVM
(collisions can still happen as KVM emulates Hyper-V MSRs, which also reside
in the synthetic range).
- Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config.
- Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID
entries when updating PV clocks, as there is no guarantee PV clocks will be
updated between TSC frequency changes and CPUID emulation, and guest reads
of Xen TSC should be rare, i.e. are not a hot path.
|
|
Add documentation to Intel Trusted Domain Extensions (TDX) support.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Message-ID: <20250227012021.1778144-21-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Introduce an Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT to have
KVM ignore guest PAT when this quirk is enabled.
On AMD platforms, KVM always honors guest PAT. On Intel however there are
two issues. First, KVM *cannot* honor guest PAT if CPU feature self-snoop
is not supported. Second, UC access on certain Intel platforms can be very
slow[1] and honoring guest PAT on those platforms may break some old
guests that accidentally specify video RAM as UC. Those old guests may
never expect the slowness since KVM always forces WB previously. See [2].
So, introduce a quirk that KVM can enable by default on all Intel platforms
to avoid breaking old unmodifiable guests. Newer userspace can disable this
quirk if it wishes KVM to honor guest PAT; disabling the quirk will fail
if self-snoop is not supported, i.e. if KVM cannot obey the wish.
The quirk is a no-op on AMD and also if any assigned devices have
non-coherent DMA. This is not an issue, as KVM_X86_QUIRK_CD_NW_CLEARED is
another example of a quirk that is sometimes automatically disabled.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/all/Ztl9NWCOupNfVaCA@yzhao56-desk.sh.intel.com # [1]
Link: https://lore.kernel.org/all/87jzfutmfc.fsf@redhat.com # [2]
Message-ID: <20250224070946.31482-1-yan.y.zhao@intel.com>
[Use supported_quirks/inapplicable_quirks to support both AMD and
no-self-snoop cases, as well as to remove the shadow_memtype_mask check
from kvm_mmu_may_ignore_guest_pat(). - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Convert TDG.VP.VMCALL<ReportFatalError> to KVM_EXIT_SYSTEM_EVENT with
a new type KVM_SYSTEM_EVENT_TDX_FATAL and forward it to userspace for
handling.
TD guest can use TDG.VP.VMCALL<ReportFatalError> to report the fatal
error it has experienced. This hypercall is special because TD guest
is requesting a termination with the error information, KVM needs to
forward the hypercall to userspace anyway, KVM doesn't do parsing or
conversion, it just dumps the 16 general-purpose registers to userspace
and let userspace decide what to do.
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Message-ID: <20250222014225.897298-8-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The capability actually fails with EINVAL if vCPUs have already been
created.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250305230825.484091-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The VGIC maintenance IRQ signals various conditions about the LRs, when
the GIC's virtualization extension is used.
So far we didn't need it, but nested virtualization needs to know about
this interrupt, so add a userland interface to setup the IRQ number.
The architecture mandates that it must be a PPI, on top of that this code
only exports a per-device option, so the PPI is the same on all VCPUs.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[added some bits of documentation]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The vendor_hyp_bmap bitmap holds the information about the Vendor Hyp
services available to the user space and can be get/set using
{G, S}ET_ONE_REG interfaces. This is done using the pseudo-firmware
bitmap register KVM_REG_ARM_VENDOR_HYP_BMAP.
At present, this bitmap is a 64 bit one and since the function numbers
for newly added DISCOVER_IPML_* hypercalls are 64-65, introduce
another pseudo-firmware bitmap register KVM_REG_ARM_VENDOR_HYP_BMAP_2.
Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20250221140229.12588-4-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
If the Guest requires migration to multiple targets, these hypercalls
will provide a way to retrieve the target CPU implementations from
the user space VMM.
Subsequent patch will use this to enable the associated errata.
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Suggested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20250221140229.12588-3-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
One of the not-so-obvious requirements for restoring a VM is ensuring
that the vITS has been restored _before_ creating any irqbypass mappings.
This is because KVM needs to get the guest translation for MSIs to
correctly assemble the vLPI mapping getting installed in the physical
ITS.
Document the restore ordering requirements necessary for GICv4 vLPI
injection to work.
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250226183124.82094-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
KVM's treatment of the ID registers that describe the implementation
(MIDR, REVIDR, and AIDR) is interesting, to say the least. On the
userspace-facing end of it, KVM presents the values of the boot CPU on
all vCPUs and treats them as invariant. On the guest side of things KVM
presents the hardware values of the local CPU, which can change during
CPU migration in a big-little system.
While one may call this fragile, there is at least some degree of
predictability around it. For example, if a VMM wanted to present
big-little to a guest, it could affine vCPUs accordingly to the correct
clusters.
All of this makes a giant mess out of adding support for making these
implementation ID registers writable. Avoid breaking the rather subtle
ABI around the old way of doing things by requiring opt-in from
userspace to make the registers writable.
When the cap is enabled, allow userspace to set MIDR, REVIDR, and AIDR
to any non-reserved value and present those values consistently across
all vCPUs.
Signed-off-by: Sebastian Ott <sebott@redhat.com>
[oliver: changelog, capability]
Link: https://lore.kernel.org/r/20250225005401.679536-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Reject userspace attempts to set the Xen hypercall page MSR to an index
outside of the "standard" virtualization range [0x40000000, 0x4fffffff],
as KVM is not equipped to handle collisions with real MSRs, e.g. KVM
doesn't update MSR interception, conflicts with VMCS/VMCB fields, special
case writes in KVM, etc.
While the MSR index isn't strictly ABI, i.e. can theoretically float to
any value, in practice no known VMM sets the MSR index to anything other
than 0x40000000 or 0x40000200.
Cc: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/20250215011437.1203084-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Don't force SPTE modifications to be done atomically if the only volatile
bit in the SPTE is the Accessed bit. KVM and the primary MMU tolerate
stale aging state, and the probability of an Accessed bit A/D assist being
clobbered *and* affecting again is likely far lower than the probability
of consuming stale information due to not flushing TLBs when aging.
Rename spte_has_volatile_bits() to spte_needs_atomic_update() to better
capture the nature of the helper.
Opportunstically do s/write/update on the TDP MMU wrapper, as it's not
simply the "write" that needs to be done atomically, it's the entire
update, i.e. the entire read-modify-write operation needs to be done
atomically so that KVM has an accurate view of the old SPTE.
Leave kvm_tdp_mmu_write_spte_atomic() as is. While the name is imperfect,
it pairs with kvm_tdp_mmu_write_spte(), which in turn pairs with
kvm_tdp_mmu_read_spte(). And renaming all of those isn't obviously a net
positive, and would require significant churn.
Signed-off-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250204004038.1680123-6-jthoughton@google.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Create a fake memslot for ucontrol VMs. The fake memslot identity-maps
userspace.
Now memslots will always be present, and ucontrol is not a special case
anymore.
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20250123144627.312456-4-imbrenda@linux.ibm.com
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20250123144627.312456-4-imbrenda@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull KVM/arm64 updates from Will Deacon:
"New features:
- Support for non-protected guest in protected mode, achieving near
feature parity with the non-protected mode
- Support for the EL2 timers as part of the ongoing NV support
- Allow control of hardware tracing for nVHE/hVHE
Improvements, fixes and cleanups:
- Massive cleanup of the debug infrastructure, making it a bit less
awkward and definitely easier to maintain. This should pave the way
for further optimisations
- Complete rewrite of pKVM's fixed-feature infrastructure, aligning
it with the rest of KVM and making the code easier to follow
- Large simplification of pKVM's memory protection infrastructure
- Better handling of RES0/RES1 fields for memory-backed system
registers
- Add a workaround for Qualcomm's Snapdragon X CPUs, which suffer
from a pretty nasty timer bug
- Small collection of cleanups and low-impact fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
arm64/sysreg: Get rid of TRFCR_ELx SysregFields
KVM: arm64: nv: Fix doc header layout for timers
KVM: arm64: nv: Apply RESx settings to sysreg reset values
KVM: arm64: nv: Always evaluate HCR_EL2 using sanitising accessors
KVM: arm64: Fix selftests after sysreg field name update
coresight: Pass guest TRFCR value to KVM
KVM: arm64: Support trace filtering for guests
KVM: arm64: coresight: Give TRBE enabled state to KVM
coresight: trbe: Remove redundant disable call
arm64/sysreg/tools: Move TRFCR definitions to sysreg
tools: arm64: Update sysreg.h header files
KVM: arm64: Drop pkvm_mem_transition for host/hyp donations
KVM: arm64: Drop pkvm_mem_transition for host/hyp sharing
KVM: arm64: Drop pkvm_mem_transition for FF-A
KVM: arm64: Explicitly handle BRBE traps as UNDEFINED
KVM: arm64: vgic: Use str_enabled_disabled() in vgic_v3_probe()
arm64: kvm: Introduce nvhe stack size constants
KVM: arm64: Fix nVHE stacktrace VA bits mask
KVM: arm64: Fix FEAT_MTE in pKVM
Documentation: Update the behaviour of "kvm-arm.mode"
...
|
|
Pull kvm updates from Paolo Bonzini:
"Loongarch:
- Clear LLBCTL if secondary mmu mapping changes
- Add hypercall service support for usermode VMM
x86:
- Add a comment to kvm_mmu_do_page_fault() to explain why KVM
performs a direct call to kvm_tdp_page_fault() when RETPOLINE is
enabled
- Ensure that all SEV code is compiled out when disabled in Kconfig,
even if building with less brilliant compilers
- Remove a redundant TLB flush on AMD processors when guest CR4.PGE
changes
- Use str_enabled_disabled() to replace open coded strings
- Drop kvm_x86_ops.hwapic_irr_update() as KVM updates hardware's
APICv cache prior to every VM-Enter
- Overhaul KVM's CPUID feature infrastructure to track all vCPU
capabilities instead of just those where KVM needs to manage state
and/or explicitly enable the feature in hardware. Along the way,
refactor the code to make it easier to add features, and to make it
more self-documenting how KVM is handling each feature
- Rework KVM's handling of VM-Exits during event vectoring; this
plugs holes where KVM unintentionally puts the vCPU into infinite
loops in some scenarios (e.g. if emulation is triggered by the
exit), and brings parity between VMX and SVM
- Add pending request and interrupt injection information to the
kvm_exit and kvm_entry tracepoints respectively
- Fix a relatively benign flaw where KVM would end up redoing RDPKRU
when loading guest/host PKRU, due to a refactoring of the kernel
helpers that didn't account for KVM's pre-checking of the need to
do WRPKRU
- Make the completion of hypercalls go through the complete_hypercall
function pointer argument, no matter if the hypercall exits to
userspace or not.
Previously, the code assumed that KVM_HC_MAP_GPA_RANGE specifically
went to userspace, and all the others did not; the new code need
not special case KVM_HC_MAP_GPA_RANGE and in fact does not care at
all whether there was an exit to userspace or not
- As part of enabling TDX virtual machines, support support
separation of private/shared EPT into separate roots.
When TDX will be enabled, operations on private pages will need to
go through the privileged TDX Module via SEAMCALLs; as a result,
they are limited and relatively slow compared to reading a PTE.
The patches included in 6.14 allow KVM to keep a mirror of the
private EPT in host memory, and define entries in kvm_x86_ops to
operate on external page tables such as the TDX private EPT
- The recently introduced conversion of the NX-page reclamation
kthread to vhost_task moved the task under the main process. The
task is created as soon as KVM_CREATE_VM was invoked and this, of
course, broke userspace that didn't expect to see any child task of
the VM process until it started creating its own userspace threads.
In particular crosvm refuses to fork() if procfs shows any child
task, so unbreak it by creating the task lazily. This is arguably a
userspace bug, as there can be other kinds of legitimate worker
tasks and they wouldn't impede fork(); but it's not like userspace
has a way to distinguish kernel worker tasks right now. Should they
show as "Kthread: 1" in proc/.../status?
x86 - Intel:
- Fix a bug where KVM updates hardware's APICv cache of the highest
ISR bit while L2 is active, while ultimately results in a
hardware-accelerated L1 EOI effectively being lost
- Honor event priority when emulating Posted Interrupt delivery
during nested VM-Enter by queueing KVM_REQ_EVENT instead of
immediately handling the interrupt
- Rework KVM's processing of the Page-Modification Logging buffer to
reap entries in the same order they were created, i.e. to mark gfns
dirty in the same order that hardware marked the page/PTE dirty
- Misc cleanups
Generic:
- Cleanup and harden kvm_set_memory_region(); add proper lockdep
assertions when setting memory regions and add a dedicated API for
setting KVM-internal memory regions. The API can then explicitly
disallow all flags for KVM-internal memory regions
- Explicitly verify the target vCPU is online in kvm_get_vcpu() to
fix a bug where KVM would return a pointer to a vCPU prior to it
being fully online, and give kvm_for_each_vcpu() similar treatment
to fix a similar flaw
- Wait for a vCPU to come online prior to executing a vCPU ioctl, to
fix a bug where userspace could coerce KVM into handling the ioctl
on a vCPU that isn't yet onlined
- Gracefully handle xarray insertion failures; even though such
failures are impossible in practice after xa_reserve(), reserving
an entry is always followed by xa_store() which does not know (or
differentiate) whether there was an xa_reserve() before or not
RISC-V:
- Zabha, Svvptc, and Ziccrse extension support for guests. None of
them require anything in KVM except for detecting them and marking
them as supported; Zabha adds byte and halfword atomic operations,
while the others are markers for specific operation of the TLB and
of LL/SC instructions respectively
- Virtualize SBI system suspend extension for Guest/VM
- Support firmware counters which can be used by the guests to
collect statistics about traps that occur in the host
Selftests:
- Rework vcpu_get_reg() to return a value instead of using an
out-param, and update all affected arch code accordingly
- Convert the max_guest_memory_test into a more generic
mmu_stress_test. The basic gist of the "conversion" is to have the
test do mprotect() on guest memory while vCPUs are accessing said
memory, e.g. to verify KVM and mmu_notifiers are working as
intended
- Play nice with treewrite builds of unsupported architectures, e.g.
arm (32-bit), as KVM selftests' Makefile doesn't do anything to
ensure the target architecture is actually one KVM selftests
supports
- Use the kernel's $(ARCH) definition instead of the target triple
for arch specific directories, e.g. arm64 instead of aarch64,
mainly so as not to be different from the rest of the kernel
- Ensure that format strings for logging statements are checked by
the compiler even when the logging statement itself is disabled
- Attempt to whack the last LLC references/misses mole in the Intel
PMU counters test by adding a data load and doing CLFLUSH{OPT} on
the data instead of the code being executed. It seems that modern
Intel CPUs have learned new code prefetching tricks that bypass the
PMU counters
- Fix a flaw in the Intel PMU counters test where it asserts that
events are counting correctly without actually knowing what the
events count given the underlying hardware; this can happen if
Intel reuses a formerly microarchitecture-specific event encoding
as an architectural event, as was the case for Top-Down Slots"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (151 commits)
kvm: defer huge page recovery vhost task to later
KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault()
KVM: Disallow all flags for KVM-internal memslots
KVM: x86: Drop double-underscores from __kvm_set_memory_region()
KVM: Add a dedicated API for setting KVM-internal memslots
KVM: Assert slots_lock is held when setting memory regions
KVM: Open code kvm_set_memory_region() into its sole caller (ioctl() API)
LoongArch: KVM: Add hypercall service support for usermode VMM
LoongArch: KVM: Clear LLBCTL if secondary mmu mapping is changed
KVM: SVM: Use str_enabled_disabled() helper in svm_hardware_setup()
KVM: VMX: read the PML log in the same order as it was written
KVM: VMX: refactor PML terminology
KVM: VMX: Fix comment of handle_vmx_instruction()
KVM: VMX: Reinstate __exit attribute for vmx_exit()
KVM: SVM: Use str_enabled_disabled() helper in sev_hardware_setup()
KVM: x86: Avoid double RDPKRU when loading host/guest PKRU
KVM: x86: Use LVT_TIMER instead of an open coded literal
RISC-V: KVM: Add new exit statstics for redirected traps
RISC-V: KVM: Update firmware counters for various events
RISC-V: KVM: Redirect instruction access fault trap to guest
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Introduce a new set of Hyper-V headers in include/hyperv and replace
the old hyperv-tlfs.h with the new headers (Nuno Das Neves)
- Fixes for the Hyper-V VTL mode (Roman Kisel)
- Fixes for cpu mask usage in Hyper-V code (Michael Kelley)
- Document the guest VM hibernation behaviour (Michael Kelley)
- Miscellaneous fixes and cleanups (Jacob Pan, John Starks, Naman Jain)
* tag 'hyperv-next-signed-20250123' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Documentation: hyperv: Add overview of guest VM hibernation
hyperv: Do not overlap the hvcall IO areas in hv_vtl_apicid_to_vp_id()
hyperv: Do not overlap the hvcall IO areas in get_vtl()
hyperv: Enable the hypercall output page for the VTL mode
hv_balloon: Fallback to generic_online_page() for non-HV hot added mem
Drivers: hv: vmbus: Log on missing offers if any
Drivers: hv: vmbus: Wait for boot-time offers during boot and resume
uio_hv_generic: Add a check for HV_NIC for send, receive buffers setup
iommu/hyper-v: Don't assume cpu_possible_mask is dense
Drivers: hv: Don't assume cpu_possible_mask is dense
x86/hyperv: Don't assume cpu_possible_mask is dense
hyperv: Remove the now unused hyperv-tlfs.h files
hyperv: Switch from hyperv-tlfs.h to hyperv/hvhdk.h
hyperv: Add new Hyper-V headers in include/hyperv
hyperv: Clean up unnecessary #includes
hyperv: Move hv_connection_id to hyperv-tlfs.h
|
|
Pull Documentation updates from Jonathan Corbet:
- Quite a bit of Chinese and Spanish translation work
- Clarifying that Git commit IDs >12chars are OK
- A new nvme-multipath document
- A reorganization of the admin-guide top-level page to make it
readable
- Clarification of the role of Acked-by and maintainer discretion on
their acceptance
- Some reorganization of debugging-oriented docs
... and typo fixes, documentation updates, etc as usual
* tag 'docs-6.14' of git://git.lwn.net/linux: (50 commits)
Documentation: Fix x86_64 UEFI outdated references to elilo
Documentation/sysctl: Add timer_migration to kernel.rst
docs/mm: Physical memory: Remove zone_t
docs: submitting-patches: clarify that signers may use their discretion on tags
docs: submitting-patches: clarify difference between Acked-by and Reviewed-by
docs: submitting-patches: clarify Acked-by and introduce "# Suffix"
Documentation: bug-hunting.rst: remove odd contact information
docs/zh_CN: Add sak index Chinese translation
doc: module: DEFAULT_SYMBOL_NAMESPACE must be defined before #includes
doc: module: Fix documented type of namespace
Documentation/kernel-parameters: Fix a reference to vga-softcursor.rst
docs/zh_CN: Add landlock index Chinese translation
Documentation: Fix typo localmodonfig -> localmodconfig
overlayfs.rst: Fix and improve grammar
docs/zh_CN: Add siphash index Chinese translation
docs/zh_CN: Add security IMA-templates Chinese translation
docs/zh_CN: Add security digsig Chinese translation
Align git commit ID abbreviation guidelines and checks
docs: process: submitting-patches: split canonical patch format section
docs/zh_CN: Add security lsm Chinese translation
...
|
|
KVM x86 misc changes for 6.14:
- Overhaul KVM's CPUID feature infrastructure to track all vCPU capabilities
instead of just those where KVM needs to manage state and/or explicitly
enable the feature in hardware. Along the way, refactor the code to make
it easier to add features, and to make it more self-documenting how KVM
is handling each feature.
- Rework KVM's handling of VM-Exits during event vectoring; this plugs holes
where KVM unintentionally puts the vCPU into infinite loops in some scenarios
(e.g. if emulation is triggered by the exit), and brings parity between VMX
and SVM.
- Add pending request and interrupt injection information to the kvm_exit and
kvm_entry tracepoints respectively.
- Fix a relatively benign flaw where KVM would end up redoing RDPKRU when
loading guest/host PKRU, due to a refactoring of the kernel helpers that
didn't account for KVM's pre-checking of the need to do WRPKRU.
|
|
Stephen reports that 'make htmldocs' spits out a warning
("Documentation/virt/kvm/devices/vcpu.rst:147: WARNING: Definition
list ends without a blank line; unexpected unindent.").
Fix it by keeping all the timer attributes on a single line.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add documentation on how hibernation works in a guest VM on Hyper-V.
Describe how VMBus devices and the VMBus itself are hibernated and
resumed, along with various limitations.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250113145645.1320942-1-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250113145645.1320942-1-mhklinux@outlook.com>
|
|
Prevent null pointer dereference when processing
KVM_IRQ_ROUTING_S390_ADAPTER routing entries.
The ioctl cannot be processed for ucontrol VMs.
Fixes: f65470661f36 ("KVM: s390/interrupt: do not pin adapter interrupt pages")
Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Tested-by: Hariharan Mari <hari55@linux.ibm.com>
Reviewed-by: Hariharan Mari <hari55@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20241216092140.329196-4-schlameuss@linux.ibm.com
Message-ID: <20241216092140.329196-4-schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|
|
Prevent null pointer dereference when processing the
KVM_DEV_FLIC_APF_ENABLE and KVM_DEV_FLIC_APF_DISABLE_WAIT ioctls in the
interrupt controller.
Fixes: 3c038e6be0e2 ("KVM: async_pf: Async page fault support on s390")
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Reviewed-by: Hariharan Mari <hari55@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20241216092140.329196-2-schlameuss@linux.ibm.com
Message-ID: <20241216092140.329196-2-schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
|