summaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)Author
2025-03-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "arm64: - Fix a couple of bugs affecting pKVM's PSCI relay implementation when running in the hVHE mode, resulting in the host being entered with the MMU in an unknown state, and EL2 being in the wrong mode x86: - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent - Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as the guest can enable DebugSwap without KVM's knowledge - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM KVM: selftests: Fix printf() format goof in SEV smoke test KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3 KVM: SVM: Save host DR masks on CPUs with DebugSwap KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() KVM: arm64: Initialize HCR_EL2.E2H early KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled KVM: x86: Snapshot the host's DEBUGCTL in common x86 KVM: SVM: Suppress DEBUGCTL.BTF on AMD KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value KVM: selftests: Assert that STI blocking isn't set after event injection KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow
2025-03-09Merge tag 'kvm-x86-fixes-6.14-rcN.2' of https://github.com/kvm-x86/linux ↵Paolo Bonzini
into HEAD KVM x86 fixes for 6.14-rcN #2 - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. - Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as the guest can enable DebugSwap without KVM's knowledge. - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM.
2025-03-08x86/vdso: Prepare introduction of struct vdso_clockAnna-Maria Behnsen
To support multiple PTP clocks, the VDSO data structure needs to be reworked. All clock specific data will end up in struct vdso_clock and in struct vdso_time_data there will be array of VDSO clocks. At the moment, vdso_clock is simply a define which maps vdso_clock to vdso_time_data. To prepare for the rework of the data structures, replace the struct vdso_time_data pointer with a struct vdso_clock pointer where applicable. No functional change. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-15-c1b5c69a166f@linutronix.de
2025-03-08Merge branch 'locking/urgent' into locking/core, to pick up locking fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-08x86/mm: Define PTRS_PER_PMD for assembly code tooIngo Molnar
Andy reported the following build warning from head_32.S: In file included from arch/x86/kernel/head_32.S:29: arch/x86/include/asm/pgtable_32.h:59:5: error: "PTRS_PER_PMD" is not defined, evaluates to 0 [-Werror=undef] 59 | #if PTRS_PER_PMD > 1 The reason is that on 2-level i386 paging the folded in PMD's PTRS_PER_PMD constant is not defined in assembly headers, only in generic MM C headers. Instead of trying to fish out the definition from the generic headers, just define it - it even has a comment for it already... Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z8oa8AUVyi2HWfo9@gmail.com
2025-03-07virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutexAlexey Kardashevskiy
Compared to the SNP Guest Request, the "Extended" version adds data pages for receiving certificates. If not enough pages provided, the HV can report to the VM how much is needed so the VM can reallocate and repeat. Commit ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") moved handling of the allocated/desired pages number out of scope of said mutex and create a possibility for a race (multiple instances trying to trigger Extended request in a VM) as there is just one instance of snp_msg_desc per /dev/sev-guest and no locking other than snp_cmd_mutex. Fix the issue by moving the data blob/size and the GHCB input struct (snp_req_data) into snp_guest_req which is allocated on stack now and accessed by the GHCB caller under that mutex. Stop allocating SEV_FW_BLOB_MAX_SIZE in snp_msg_alloc() as only one of four callers needs it. Free the received blob in get_ext_report() right after it is copied to the userspace. Possible future users of snp_send_guest_request() are likely to have different ideas about the buffer size anyways. Fixes: ae596615d93d ("virt: sev-guest: Reduce the scope of SNP command mutex") Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307013700.437505-3-aik@amd.com
2025-03-06cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masksMario Limonciello
Bitfield masks are easier to follow and less error prone. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06x86/fpu: Improve crypto performance by making kernel-mode FPU reliably ↵Eric Biggers
usable in softirqs Background: =========== Currently kernel-mode FPU is not always usable in softirq context on x86, since softirqs can nest inside a kernel-mode FPU section in task context, and nested use of kernel-mode FPU is not supported. Therefore, x86 SIMD-optimized code that can be called in softirq context has to sometimes fall back to non-SIMD code. There are two options for the fallback, both of which are pretty terrible: (a) Use a scalar fallback. This can be 10-100x slower than vectorized code because it cannot use specialized instructions like AES, SHA, or carryless multiplication. (b) Execute the request asynchronously using a kworker. In other words, use the "crypto SIMD helper" in crypto/simd.c. Currently most of the x86 en/decryption code (skcipher and aead algorithms) uses option (b), since this avoids the slow scalar fallback and it is easier to wire up. But option (b) is still really bad for its own reasons: - Punting the request to a kworker is bad for performance too. - It forces the algorithm to be marked as asynchronous (CRYPTO_ALG_ASYNC), preventing it from being used by crypto API users who request a synchronous algorithm. That's another huge performance problem, which is especially unfortunate for users who don't even do en/decryption in softirq context. - It makes all en/decryption operations take a detour through crypto/simd.c. That involves additional checks and an additional indirect call, which slow down en/decryption for *everyone*. Fortunately, the skcipher and aead APIs are only usable in task and softirq context in the first place. Thus, if kernel-mode FPU were to be reliably usable in softirq context, no fallback would be needed. Indeed, other architectures such as arm, arm64, and riscv have already done this. Changes implemented: ==================== Therefore, this patch updates x86 accordingly to reliably support kernel-mode FPU in softirqs. This is done by just disabling softirq processing in kernel-mode FPU sections (when hardirqs are not already disabled), as that prevents the nesting that was problematic. This will delay some softirqs slightly, but only ones that would have otherwise been nested inside a task context kernel-mode FPU section. Any such softirqs would have taken the slow fallback path before if they tried to do any en/decryption. Now these softirqs will just run at the end of the task context kernel-mode FPU section (since local_bh_enable() runs pending softirqs) and will no longer take the slow fallback path. Alternatives considered: ======================== - Make kernel-mode FPU sections fully preemptible. This would require growing task_struct by another struct fpstate which is more than 2K. - Make softirqs save/restore the kernel-mode FPU state to a per-CPU struct fpstate when nested use is detected. Somewhat interesting, but seems unnecessary when a simpler solution exists. Performance results: ==================== I did some benchmarks with AES-XTS encryption of 16-byte messages (which is unrealistically small, but this makes it easier to see the overhead of kernel-mode FPU...). The baseline was 384 MB/s. Removing the use of crypto/simd.c, which this work makes possible, increases it to 487 MB/s, a +27% improvement in throughput. CPU was AMD Ryzen 9 9950X (Zen 5). No debugging options were enabled. [ mingo: Prettified the changelog and added performance results. ] Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250304204954.3901-1-ebiggers@kernel.org
2025-03-04x86/smp: Move this_cpu_off to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-12-brgerst@gmail.com
2025-03-04x86/stackprotector: Move __stack_chk_guard to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-11-brgerst@gmail.com
2025-03-04x86/percpu: Move current_task to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-10-brgerst@gmail.com
2025-03-04x86/percpu: Move top_of_stack to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-9-brgerst@gmail.com
2025-03-04x86/irq: Move irq stacks to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-8-brgerst@gmail.com
2025-03-04x86/softirq: Move softirq_pending to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-7-brgerst@gmail.com
2025-03-04x86/retbleed: Move call depth to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-6-brgerst@gmail.com
2025-03-04x86/smp: Move cpu number to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-5-brgerst@gmail.com
2025-03-04x86/preempt: Move preempt count to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-4-brgerst@gmail.com
2025-03-04x86/percpu: Move pcpu_hot to percpu hot sectionBrian Gerst
Also change the alignment of the percpu hot section: - PERCPU_SECTION(INTERNODE_CACHE_BYTES) + PERCPU_SECTION(L1_CACHE_BYTES) As vSMP will muck with INTERNODE_CACHE_BYTES that invalidates the too-large-section assert we do: ASSERT(__per_cpu_hot_end - __per_cpu_hot_start <= 64, "percpu cache hot section too large") [ mingo: Added INTERNODE_CACHE_BYTES fix & explanation. ] Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-3-brgerst@gmail.com
2025-03-04Merge branch 'x86/headers' into x86/core, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04Merge branch 'x86/asm' into x86/core, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04x86/alternatives: Simplify alternative_call() interfaceJosh Poimboeuf
Separate the input from the clobbers in preparation for appending the input. Do this in preparation of changing the ASM_CALL_CONSTRAINT primitive. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org
2025-03-04x86/hyperv: Use named operands in inline asmJosh Poimboeuf
Use named operands in inline asm to make it easier to change the constraint order. Do this in preparation of changing the ASM_CALL_CONSTRAINT primitive. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: linux-kernel@vger.kernel.org
2025-03-04Merge branch 'x86/locking' into x86/asm, to simplify dependenciesIngo Molnar
Before picking up new changes in this area, consolidate these changes into x86/asm. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04Merge branch 'x86/cpu' into x86/asm, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04x86/cpu: Get rid of the smp_store_cpu_info() indirectionThomas Gleixner
smp_store_cpu_info() is just a wrapper around identify_secondary_cpu() without further value. Move the extra bits from smp_store_cpu_info() into identify_secondary_cpu() and remove the wrapper. [ darwi: Make it compile and fix up the xen/smp_pv.c instance ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250304085152.51092-9-darwi@linutronix.de
2025-03-04x86/cpu: Simplify TLB entry count storageAhmed S. Darwish
Commit: e0ba94f14f74 ("x86/tlb_info: get last level TLB entry number of CPU") introduced u16 "info" arrays for each TLB type. Since 2012 and each array stores just one type of information: the number of TLB entries for its respective TLB type. Replace such arrays with simple variables. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250304085152.51092-8-darwi@linutronix.de
2025-03-04x86/cpuid: Include <linux/build_bug.h> in <asm/cpuid.h>Ahmed S. Darwish
<asm/cpuid.h> uses static_assert() at multiple locations but it does not include the CPP macro's definition at linux/build_bug.h. Include the needed header to make <asm/cpuid.h> self-sufficient. This gets triggered when cpuid.h is included in new C files, which is to be done in further commits. Fixes: 43d86e3cd9a7 ("x86/cpu: Provide cpuid_read() et al.") Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250304085152.51092-5-darwi@linutronix.de
2025-03-04x86/cpu: Remove unnecessary macro indirection related to CPU feature namesBrendan Jackman
These macros used to abstract over CONFIG_X86_FEATURE_NAMES, but that was removed in: 7583e8fbdc49 ("x86/cpu: Remove X86_FEATURE_NAMES") Now they are just an unnecessary indirection, remove them. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250303-setcpuid-taint-louder-v1-1-8d255032cb4c@google.com
2025-03-04x86/speculation: Add a conditional CS prefix to CALL_NOSPECPawan Gupta
Retpoline mitigation for spectre-v2 uses thunks for indirect branches. To support this mitigation compilers add a CS prefix with -mindirect-branch-cs-prefix. For an indirect branch in asm, this needs to be added manually. CS prefix is already being added to indirect branches in asm files, but not in inline asm. Add CS prefix to CALL_NOSPEC for inline asm as well. There is no JMP_NOSPEC for inline asm. Reported-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228-call-nospec-v3-2-96599fed0f33@linux.intel.com
2025-03-04x86/speculation: Simplify and make CALL_NOSPEC consistentPawan Gupta
CALL_NOSPEC macro is used to generate Spectre-v2 mitigation friendly indirect branches. At compile time the macro defaults to indirect branch, and at runtime those can be patched to thunk based mitigations. This approach is opposite of what is done for the rest of the kernel, where the compile time default is to replace indirect calls with retpoline thunk calls. Make CALL_NOSPEC consistent with the rest of the kernel, default to retpoline thunk at compile time when CONFIG_MITIGATION_RETPOLINE is enabled. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228-call-nospec-v3-1-96599fed0f33@linux.intel.com
2025-03-04x86/smp: Fix mwait_play_dead() and acpi_processor_ffh_play_dead() noreturn ↵Josh Poimboeuf
behavior Fix some related issues (done in a single patch to avoid introducing intermediate bisect warnings): 1) The SMP version of mwait_play_dead() doesn't return, but its !SMP counterpart does. Make its calling behavior consistent by resolving the !SMP version to a BUG(). It should never be called anyway, this just enforces that at runtime and enables its callers to be marked as __noreturn. 2) While the SMP definition of mwait_play_dead() is annotated as __noreturn, the declaration isn't. Nor is it listed in tools/objtool/noreturns.h. Fix that. 3) Similar to #1, the SMP version of acpi_processor_ffh_play_dead() doesn't return but its !SMP counterpart does. Make the !SMP version a BUG(). It should never be called. 4) acpi_processor_ffh_play_dead() doesn't return, but is lacking any __noreturn annotations. Fix that. This fixes the following objtool warnings: vmlinux.o: warning: objtool: acpi_processor_ffh_play_dead+0x67: mwait_play_dead() is missing a __noreturn annotation vmlinux.o: warning: objtool: acpi_idle_play_dead+0x3c: acpi_processor_ffh_play_dead() is missing a __noreturn annotation Fixes: a7dd183f0b38 ("x86/smp: Allow calling mwait_play_dead with an arbitrary hint") Fixes: 541ddf31e300 ("ACPI/processor_idle: Add FFH state handling") Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/e885c6fa9e96a61471b33e48c2162d28b15b14c5.1740962711.git.jpoimboe@kernel.org
2025-03-03x86/smp/32: Remove safe_smp_processor_id()Brian Gerst
The safe_smp_processor_id() function was originally implemented in: dc2bc768a009 ("stack overflow safe kdump: safe_smp_processor_id()") to mitigate the CPU number corruption on a stack overflow. At the time, x86-32 stored the CPU number in thread_struct, which was located at the bottom of the task stack and thus vulnerable to an overflow. The CPU number is now located in percpu memory, so this workaround is no longer needed. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250303170115.2176553-1-brgerst@gmail.com
2025-03-03x86/asm: Merge KSTK_ESP() implementationsBrian Gerst
Commit: 263042e4630a ("Save user RSP in pt_regs->sp on SYSCALL64 fastpath") simplified the 64-bit implementation of KSTK_ESP() which is now identical to 32-bit. Merge them into a common definition. No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250303183111.2245129-1-brgerst@gmail.com
2025-03-03KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's viewSean Christopherson
When handling an "AP Create" event, return an error if the "requested" SEV features for the vCPU don't exactly match KVM's view of the VM-scoped features. There is no known use case for heterogeneous SEV features across vCPUs, and while KVM can't actually enforce an exact match since the value in RAX isn't guaranteed to match what the guest shoved into the VMSA, KVM can at least avoid knowingly letting the guest run in an unsupported state. E.g. if a VM is created with DebugSwap disabled, KVM will intercept #DBs and DRs for all vCPUs, even if an AP is "created" with DebugSwap enabled in its VMSA. Note, the GHCB spec only "requires" that "AP use the same interrupt injection mechanism as the BSP", but given the disaster that is DebugSwap and SEV_FEATURES in general, it's safe to say that AMD didn't consider all possible complications with mismatching features between the BSP and APs. Opportunistically fold the check into the relevant request flavors; the "request < AP_DESTROY" check is just a bizarre way of implementing the AP_CREATE_ON_INIT => AP_CREATE fallthrough. Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event") Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03x86/speculation: Add a conditional CS prefix to CALL_NOSPECPawan Gupta
Retpoline mitigation for spectre-v2 uses thunks for indirect branches. To support this mitigation compilers add a CS prefix with -mindirect-branch-cs-prefix. For an indirect branch in asm, this needs to be added manually. CS prefix is already being added to indirect branches in asm files, but not in inline asm. Add CS prefix to CALL_NOSPEC for inline asm as well. There is no JMP_NOSPEC for inline asm. Reported-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228-call-nospec-v3-2-96599fed0f33@linux.intel.com
2025-03-03x86/speculation: Simplify and make CALL_NOSPEC consistentPawan Gupta
CALL_NOSPEC macro is used to generate Spectre-v2 mitigation friendly indirect branches. At compile time the macro defaults to indirect branch, and at runtime those can be patched to thunk based mitigations. This approach is opposite of what is done for the rest of the kernel, where the compile time default is to replace indirect calls with retpoline thunk calls. Make CALL_NOSPEC consistent with the rest of the kernel, default to retpoline thunk at compile time when CONFIG_MITIGATION_RETPOLINE is enabled. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228-call-nospec-v3-1-96599fed0f33@linux.intel.com
2025-03-03x86/paravirt: Remove unused paravirt_disable_iospace()Dr. David Alan Gilbert
The last use of paravirt_disable_iospace() was removed in 2015 by commit d1c29465b8a5 ("lguest: don't disable iospace.") Remove it. Note the comment above it about 'entry.S' is unrelated to this but stayed when intervening code got deleted. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20250303004441.250451-1-linux@treblig.org
2025-03-03x86/ibt: Make cfi_bhi a constant for FINEIBT_BHI=nPeter Zijlstra
Robot yielded a .config that tripped: vmlinux.o: warning: objtool: do_jit+0x276: relocation to !ENDBR: .noinstr.text+0x6a60 This is the result of using __bhi_args[1] in unreachable code; make sure the compiler is able to determine this is unreachable and trigger DCE. Closes: https://lore.kernel.org/oe-kbuild-all/202503030704.H9KFysNS-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250303094911.GL5880@noisy.programming.kicks-ass.net
2025-02-28KVM: x86: Snapshot the host's DEBUGCTL in common x86Sean Christopherson
Move KVM's snapshot of DEBUGCTL to kvm_vcpu_arch and take the snapshot in common x86, so that SVM can also use the snapshot. Opportunistically change the field to a u64. While bits 63:32 are reserved on AMD, not mentioned at all in Intel's SDM, and managed as an "unsigned long" by the kernel, DEBUGCTL is an MSR and therefore a 64-bit value. Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Cc: stable@vger.kernel.org Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250227222411.3490595-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-28KVM: nVMX: Decouple EPT RWX bits from EPT Violation protection bitsSean Christopherson
Define independent macros for the RWX protection bits that are enumerated via EXIT_QUALIFICATION for EPT Violations, and tie them to the RWX bits in EPT entries via compile-time asserts. Piggybacking the EPTE defines works for now, but it creates holes in the EPT_VIOLATION_xxx macros and will cause headaches if/when KVM emulates Mode-Based Execution (MBEC), or any other features that introduces additional protection information. Opportunistically rename EPT_VIOLATION_RWX_MASK to EPT_VIOLATION_PROT_MASK so that it doesn't become stale if/when MBEC support is added. No functional change intended. Cc: Jon Kohler <jon@nutanix.com> Cc: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/20250227000705.3199706-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-28KVM: VMX: Remove EPT_VIOLATIONS_ACC_*_BIT definesNikolay Borisov
Those defines are only used in the definition of the various EPT_VIOLATIONS_ACC_* macros which are then used to extract respective bits from vmexit error qualifications. Remove the _BIT defines and redefine the _ACC ones via BIT() macro. No functional changes. Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/20250227000705.3199706-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-28x86/mm: Reduce header dependencies in <asm/set_memory.h>Kevin Brodsky
Commit: 03b122da74b2 ("x86/sgx: Hook arch_memory_failure() into mainline code") ... added <linux/mm.h> to <asm/set_memory.h> to provide some helpers. However the following commit: b3fdf9398a16 ("x86/mce: relocate set{clear}_mce_nospec() functions") ... moved the inline definitions someplace else, and now <asm/set_memory.h> just declares a bunch of mostly self-contained functions. No need for the whole <linux/mm.h> inclusion to declare functions; just remove that include. This helps avoid circular dependency headaches (e.g. if <linux/mm.h> ends up including <linux/set_memory.h>). This change requires a couple of include fixups not to break the build: * <asm/smp.h>: including <asm/thread_info.h> directly relies on <linux/thread_info.h> having already been included, because the former needs the BAD_STACK/NOT_STACK constants defined in the latter. This is no longer the case when <asm/smp.h> is included from some driver file - just include <linux/thread_info.h> to stay out of trouble. * sev-guest.c relies on <asm/set_memory.h> including <linux/mm.h>, so we just need to make that include explicit. [ mingo: Cleaned up the changelog ] Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20241212080904.2089632-3-kevin.brodsky@arm.com
2025-02-28x86/mm: Remove unused __set_memory_prot()Kevin Brodsky
__set_memory_prot() is unused since: 5c11f00b09c1 ("x86: remove memory hotplug support on X86_32") Let's remove it. Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20241212080904.2089632-2-kevin.brodsky@arm.com
2025-02-28x86/bugs: Add AUTO mitigations for mds/taa/mmio/rfdsDavid Kaplan
Add AUTO mitigations for mds/taa/mmio/rfds to create consistent vulnerability handling. These AUTO mitigations will be turned into the appropriate default mitigations in the <vuln>_select_mitigation() functions. Later, these will be used with the new attack vector controls to help select appropriate mitigations. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250108202515.385902-4-david.kaplan@amd.com
2025-02-28x86/bugs: Add X86_BUG_SPECTRE_V2_USERDavid Kaplan
All CPU vulnerabilities with command line options map to a single X86_BUG bit except for Spectre V2 where both the spectre_v2 and spectre_v2_user command line options are related to the same bug. The spectre_v2 command line options mostly relate to user->kernel and guest->host mitigations, while the spectre_v2_user command line options relate to user->user or guest->guest protections. Define a new X86_BUG bit for spectre_v2_user so each *_select_mitigation() function in bugs.c is related to a unique X86_BUG bit. No functional changes. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250108202515.385902-2-david.kaplan@amd.com
2025-02-28x86/cpufeatures: Rename X86_CMPXCHG64 to X86_CX8H. Peter Anvin (Intel)
Replace X86_CMPXCHG64 with X86_CX8, as CX8 is the name of the CPUID flag, thus to make it consistent with X86_FEATURE_CX8 defined in <asm/cpufeatures.h>. No functional change intended. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250228082338.73859-2-xin@zytor.com
2025-02-28x86/cpu: Enable modifying CPU bug flags with '{clear,set}puid='Brendan Jackman
Sometimes it can be very useful to run CPU vulnerability mitigations on systems where they aren't known to mitigate any real-world vulnerabilities. This can be handy for mundane reasons like debugging HW-agnostic logic on whatever machine is to hand, but also for research reasons: while some mitigations are focused on individual vulns and uarches, others are fairly general, and it's strategically useful to have an idea how they'd perform on systems where they aren't currently needed. As evidence for this being useful, a flag specifically for Retbleed was added in: 5c9a92dec323 ("x86/bugs: Add retbleed=force"). Since CPU bugs are tracked using the same basic mechanism as features, and there are already parameters for manipulating them by hand, extend that mechanism to support bug as well as capabilities. With this patch and setcpuid=srso, a QEMU guest running on an Intel host will boot with Safe-RET enabled. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241220-force-cpu-bug-v2-3-7dc71bce742a@google.com
2025-02-28x86/locking: Remove semicolon from "lock" prefixUros Bizjak
Minimum version of binutils required to compile the kernel is 2.25. This version correctly handles the "lock" prefix, so it is possible to remove the semicolon, which was used to support ancient versions of GNU as. Due to the semicolon, the compiler considers "lock; insn" as two separate instructions. Removing the semicolon makes asm length calculations more accurate, consequently making scheduling and inlining decisions of the compiler more accurate. Removing the semicolon also enables assembler checks involving lock prefix. Trying to assemble e.g. "lock andl %eax, %ebx" results in: Error: expecting lockable instruction after `lock' Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228085149.2478245-1-ubizjak@gmail.com
2025-02-27x86/cpu: Remove get_this_hybrid_cpu_*()Pawan Gupta
Because calls to get_this_hybrid_cpu_type() and get_this_hybrid_cpu_native_id() are not required now. cpu-type and native-model-id are cached at boot in per-cpu struct cpuinfo_topology. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-4-2ae010f50370@linux.intel.com
2025-02-27perf/x86/intel: Use cache cpu-type for hybrid PMU selectionPawan Gupta
get_this_hybrid_cpu_type() misses a case when cpu-type is populated regardless of X86_FEATURE_HYBRID_CPU. This is particularly true for hybrid variants that have P or E cores fused off. Instead use the cpu-type cached in struct x86_topology, as it does not rely on hybrid feature to enumerate cpu-type. This can also help avoid the model-specific fixup get_hybrid_cpu_type(). Also replace the get_this_hybrid_cpu_native_id() with its cached value in struct x86_topology. While at it, remove enum hybrid_cpu_type as it serves no purpose when we have the exact cpu-types defined in enum intel_cpu_type. Also rename atom_native_id to intel_native_id and move it to intel-family.h where intel_cpu_type lives. Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-3-2ae010f50370@linux.intel.com