path: root/arch/x86/include
Age | Commit message | Author
2025-04-24 | KVM: arm64, x86: make kvm_arch_has_irq_bypass() inline | Paolo Bonzini
kvm_arch_has_irq_bypass() is a small function and even though it does not appear in any *really* hot paths, it's also not entirely rare. Make it inline---it also works out nicely in preparation for using it in kvm-intel.ko and kvm-amd.ko, since the function is not currently exported. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
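A minimal sketch of the change's shape (illustrative; the predicate's actual body may differ): the definition moves out of a .c file and into a shared header as a static inline, so kvm-intel.ko and kvm-amd.ko can use it without an EXPORT_SYMBOL.

	/* in a shared KVM header, rather than a .c file */
	static inline bool kvm_arch_has_irq_bypass(void)
	{
		return IS_ENABLED(CONFIG_HAVE_KVM_IRQ_BYPASS);
	}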
2025-04-23 | x86/mm: Fix _pgd_alloc() for Xen PV mode | Juergen Gross
Recently _pgd_alloc() was switched from using __get_free_pages() to pagetable_alloc_noprof(), which might return a compound page in case the allocation order is larger than 0. On x86 this will be the case if CONFIG_MITIGATION_PAGE_TABLE_ISOLATION is set, even if PTI has been disabled at runtime. When running as a Xen PV guest (which always disables PTI), using a compound page for a PGD will result in VM_BUG_ON_PGFLAGS being triggered when the Xen code tries to pin the PGD. Fix the Xen issue, together with the unneeded 8K allocation for a PGD with PTI disabled, by replacing PGD_ALLOCATION_ORDER with an inline helper returning the needed order for PGD allocations. Fixes: a9b3c355c2e6 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}") Reported-by: Petr Vaněk <arkamar@atlas.cz> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Petr Vaněk <arkamar@atlas.cz> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250422131717.25724-1-jgross%40suse.com
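A sketch of the shape of such a helper (pgd_allocation_order() is a hypothetical name here; the commit only says "an inline helper returning the needed order"):

	static inline unsigned int pgd_allocation_order(void)
	{
		/* PTI needs an order-1 (8K) PGD: user + kernel halves */
		if (cpu_feature_enabled(X86_FEATURE_PTI))
			return 1;
		return 0;	/* order 0: a single 4K page */
	}

With PTI disabled at runtime (as under Xen PV) this returns 0, so _pgd_alloc() gets a plain order-0 page that Xen can pin.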
2025-04-22 | x86/vdso: Remove #ifdeffery around page setup variants | Thomas Weißschuh
Replace the open-coded ifdefs in C source files with IS_ENABLED(). This makes the code easier to read and enables the compiler to also typecheck the disabled parts before optimizing them away. To make this work, also remove the ifdefs from declarations of used variables. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <kees@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20240910-x86-vdso-ifdef-v1-1-877c9df9b081@linutronix.de
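The general pattern (the config option and function here are hypothetical):

	/* before: the disabled branch is invisible to the compiler */
	#ifdef CONFIG_HYPOTHETICAL_VDSO_PAGE
		setup_extra_vdso_page();
	#endif

	/* after: both branches are typechecked, the dead one is optimized away */
	if (IS_ENABLED(CONFIG_HYPOTHETICAL_VDSO_PAGE))
		setup_extra_vdso_page();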
2025-04-22 | x86/asm: Retire RIP_REL_REF() | Ard Biesheuvel
Now that all users have been moved into startup/ where PIC codegen is used, RIP_REL_REF() is no longer needed. Remove it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Dionna Amalie Glaze <dionnaglaze@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Loughlin <kevinloughlin@google.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250418141253.2601348-14-ardb+git@google.com
2025-04-22 | x86/boot: Drop RIP_REL_REF() uses from early SEV code | Ard Biesheuvel
Now that the early SEV code is built with -fPIC, RIP_REL_REF() has no effect and can be dropped. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Dionna Amalie Glaze <dionnaglaze@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Loughlin <kevinloughlin@google.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250418141253.2601348-13-ardb+git@google.com
2025-04-22 | Merge branch 'x86/urgent' into x86/boot, to merge dependent commit and upstream fixes | Ingo Molnar
In particular we need this fix before applying subsequent changes: d54d610243a4 ("x86/boot/sev: Avoid shared GHCB page for early memory acceptance") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-22 | x86/cpu: Help users notice when running old Intel microcode | Dave Hansen
Old microcode is bad for users and for kernel developers.

For users, it exposes them to known fixed security and/or functional issues. These obviously rarely result in instant dumpster fires in every environment. But it is as important to keep your microcode up to date as it is to keep your kernel up to date.

Old microcode also makes kernels harder to debug. A developer looking at an oops needs to consider kernel bugs, known CPU issues and unknown CPU issues as possible causes. If they know the microcode is up to date, they can mostly eliminate known CPU issues as the cause.

Make it easier to tell if CPU microcode is out of date. Add a list of released microcode. If the loaded microcode is older than the release, tell users in a place that folks can find it:

	/sys/devices/system/cpu/vulnerabilities/old_microcode

Tell kernel developers about it with the existing taint flag: TAINT_CPU_OUT_OF_SPEC

== Discussion ==

When a user reports a potential kernel issue, it is very common to ask them to reproduce the issue on mainline. Running mainline, they will (independently from the distro) acquire a more up-to-date microcode version list. If their microcode is old, they will get a warning about the taint and kernel developers can take that into consideration when debugging.

Just like any other entry in "vulnerabilities/", users are free to make their own assessment of their exposure.

== Microcode Revision Discussion ==

The microcode versions in the table were generated from the Intel microcode git repo: 8ac9378a8487 ("microcode-20241112 Release") which as of this writing lags behind the latest microcode-20250211.

It can be argued that the versions that the kernel picks to call "old" should be a revision or two old. Which specific version is picked is less important to me than picking *a* version and enforcing it.

This repository contains only microcode versions that Intel has deemed to be OS-loadable. It is quite possible that the BIOS has loaded a newer microcode than the latest in this repo. If this happens, the system is considered to have new microcode, not old. Specifically, the sysfs file and taint flag answer the question: Is the CPU running on the latest OS-loadable microcode, or something even later that the BIOS loaded?

In other words, Intel never publishes an authoritative list of CPUs and latest microcode revisions. Until it does, this is the best that Linux can do.

Also note that the "intel-ucode-defs.h" file is simple, ugly and has lots of magic numbers. That's on purpose and should allow a single file to be shared across lots of stable kernels regardless of whether they have the new "VFM" infrastructure or not. It was generated with a dumb script.

== FAQ ==

Q: Does this tell me if my system is secure or insecure?
A: No. It only tells you if your microcode was old when the system booted.

Q: Should the kernel warn if the microcode list itself is too old?
A: No. New kernels will get new microcode lists, both mainline and stable. The only way to have an old list is to be running an old kernel, in which case you have bigger problems.

Q: Is this for security or functional issues?
A: Both.

Q: If a given microcode update only has functional problems but no security issues, will it be considered old?
A: Yes. All microcode image versions within a microcode release are treated identically. Intel appears to make security updates without disclosing them in the release notes. Thus, all updates are considered to be security-relevant.

Q: Who runs old microcode?
A: Anybody with an old distro. This happens all the time inside of Intel, where there are lots of weird systems in labs that might not be getting regular distro updates and might also be running rather exotic microcode images.

Q: If I update my microcode after booting, will it stop saying "Vulnerable"?
A: No. Just like all the other vulnerabilities, you need to reboot before the kernel will reassess your vulnerability.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/all/20250421195659.CF426C07%40davehans-spike.ostc.intel.com (cherry picked from commit 9127865b15eb0a1bd05ad7efe29489c44394bdc1)
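A sketch of the kind of boot-time check described, assuming a generated table of minimum OS-loadable revisions (the struct and field names are hypothetical):

	struct min_ucode_rev {
		u32 vfm;	/* vendor/family/model key */
		u8  stepping;
		u32 min_rev;	/* newest known OS-loadable revision */
	};

	static bool cpu_microcode_is_old(const struct min_ucode_rev *tbl,
					 size_t n, u32 vfm, u8 stepping, u32 rev)
	{
		for (size_t i = 0; i < n; i++) {
			if (tbl[i].vfm == vfm && tbl[i].stepping == stepping)
				return rev < tbl[i].min_rev;
		}
		return false;	/* unknown CPU: do not claim it is old */
	}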
2025-04-18 | x86/asm: Rename rep_nop() to native_pause() | Uros Bizjak
Rename the rep_nop() function to reflect what it really does. No functional change intended. Suggested-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lore.kernel.org/r/20250418080805.83679-1-ubizjak@gmail.com
2025-04-18 | x86/asm: Replace "REP; NOP" with PAUSE mnemonic | Uros Bizjak
The current minimum required version of binutils is 2.25, which supports the PAUSE instruction mnemonic. Replace "REP; NOP" with this proper mnemonic. No functional change intended. Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lore.kernel.org/r/20250418080805.83679-2-ubizjak@gmail.com
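Taken together with the rep_nop() → native_pause() rename above, the resulting helper looks roughly like this (a sketch):

	static __always_inline void native_pause(void)
	{
		/* "pause" assembles to F3 90, the same bytes as "rep; nop",
		 * but states the intent: an SMT-friendly spin-wait hint */
		asm volatile("pause" ::: "memory");
	}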
2025-04-18 | x86/asm: Remove semicolon from "rep" prefixes | Uros Bizjak
Minimum version of binutils required to compile the kernel is 2.25. This version correctly handles the "rep" prefixes, so it is possible to remove the semicolon, which was used to support ancient versions of GNU as. Due to the semicolon, the compiler considers "rep; insn" (or its alternate "rep\n\tinsn" form) as two separate instructions. Removing the semicolon makes asm length calculations more accurate, consequently making scheduling and inlining decisions of the compiler more accurate. Removing the semicolon also enables assembler checks involving "rep" prefixes. Trying to assemble e.g. "rep addl %eax, %ebx" results in: Error: invalid instruction `add' after `rep' Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@kernel.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20250418071437.4144391-2-ubizjak@gmail.com
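For illustration, a copy helper in the new style (the helper name and operand constraints are illustrative, not a specific kernel function):

	static __always_inline void rep_movsb_sketch(void *to, const void *from, size_t n)
	{
		/* "rep movsb" as one mnemonic: the assembler now verifies
		 * that "rep" is a legal prefix for the instruction */
		asm volatile("rep movsb"
			     : "+D" (to), "+S" (from), "+c" (n)
			     : : "memory");
	}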
2025-04-17 | x86/mm: Remove now unused SHARED_KERNEL_PMD | Dave Hansen
All the users of SHARED_KERNEL_PMD are gone. Zap it. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20250414173244.1125BEC3%40davehans-spike.ostc.intel.com
2025-04-17 | x86/mm: Always tell core mm to sync kernel mappings | Dave Hansen
Each mm_struct has its own copy of the page tables. When core mm code makes changes to a copy of the page tables, those changes sometimes need to be synchronized with other mms' copies of the page tables. But when this synchronization actually needs to happen is highly architecture and configuration specific. In cases where kernel PMDs are shared across processes (SHARED_KERNEL_PMD) the core mm does not itself need to do that synchronization for kernel PMD changes. The x86 code communicates this by keeping the PGTBL_PMD_MODIFIED bit clear in those configs to avoid expensive synchronization. The kernel is moving toward never sharing kernel PMDs on 32-bit. Prepare for that and make 32-bit PAE always set PGTBL_PMD_MODIFIED, even if there is no modification to synchronize. This obviously adds some synchronization overhead in cases where the kernel page tables are being changed. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20250414173237.EC790E95%40davehans-spike.ostc.intel.com
2025-04-17 | Merge branch 'perf/urgent' into perf/core, to pick up fixes | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-16 | x86/bugs: Rename mmio_stale_data_clear to cpu_buf_vm_clear | Pawan Gupta
The static key mmio_stale_data_clear controls the KVM-only mitigation for MMIO Stale Data vulnerability. Rename it to reflect its purpose. No functional change. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250416-mmio-rename-v2-1-ad1f5488767c@linux.intel.com
2025-04-16 | x86/fpu/apx: Enable APX state support | Chang S. Bae
Now that APX is secured against conflicting with MPX, it is ready to be enabled. Include APX in the enabled xfeature set. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250416021720.12305-5-chang.seok.bae@intel.com
2025-04-16 | x86/fpu/apx: Define APX state component | Chang S. Bae
Advanced Performance Extensions (APX) is associated with a new state component, number 19. To support saving and restoring of the corresponding registers via the XSAVE mechanism, introduce the component definition along with the necessary sanity checks. Define the new component number, state name, and the register data types. Then, extend the size checker to validate the register data type and explicitly list the APX feature flag as a dependency for the new component in xsave_cpuid_features[]. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250416021720.12305-3-chang.seok.bae@intel.com
2025-04-16 | x86/cpufeatures: Add X86_FEATURE_APX | Chang S. Bae
Intel Advanced Performance Extensions (APX) introduce a new set of general-purpose registers, managed as an extended state component via the xstate management facility. Before enabling this new xstate, define a feature flag to clarify the dependency in xsave_cpuid_features[]. APX is enumerated under CPUID leaf 0x7, subleaf 1 (in EDX). Since this CPUID leaf is not yet allocated, place the flag in a scattered feature word. Since this feature is intended only for userspace, exposing it via /proc/cpuinfo is unnecessary. Instead, the existing arch_prctl(2) mechanism with the ARCH_GET_XCOMP_SUPP option can be used to query the feature availability. Finally, clarify that APX depends on XSAVE. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250416021720.12305-2-chang.seok.bae@intel.com
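A userspace sketch of that query, assuming APX is state component 19 as defined in the companion patch (the bit position comes from that description, not from this commit):

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/syscall.h>

	#ifndef ARCH_GET_XCOMP_SUPP
	#define ARCH_GET_XCOMP_SUPP	0x1021
	#endif

	int main(void)
	{
		unsigned long long xfeatures = 0;

		if (syscall(SYS_arch_prctl, ARCH_GET_XCOMP_SUPP, &xfeatures))
			return 1;
		printf("APX %s\n",
		       (xfeatures & (1ULL << 19)) ? "supported" : "not supported");
		return 0;
	}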
2025-04-16 | x86: Make simd.h more resilient | Herbert Xu
Add missing header inclusions and protect against double inclusion. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 | Merge branch 'x86/cpu' into x86/fpu, to pick up dependent commits | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-16 | x86/cpu: Add CPU model number for Bartlett Lake CPUs with Raptor Cove cores | Pi Xiange
Bartlett Lake has a P-core only product with Raptor Cove. [ mingo: Switched around the define, as pointed out by Christian Ludloff: Raptor Cove is the core, Bartlett Lake is the product. ] Signed-off-by: Pi Xiange <xiange.pi@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Christian Ludloff <ludloff@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: "Ahmed S. Darwish" <darwi@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250414032839.5368-1-xiange.pi@intel.com
2025-04-16 | Merge branch 'linus' into x86/cpu, to resolve conflicts | Ingo Molnar
Conflicts: tools/arch/x86/include/asm/cpufeatures.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-15 | x86/cpufeatures: Shorten X86_FEATURE_AMD_HETEROGENEOUS_CORES | Xin Li (Intel)
Shorten X86_FEATURE_AMD_HETEROGENEOUS_CORES to X86_FEATURE_AMD_HTR_CORES to make the last column aligned consistently in the whole file. No functional changes. Suggested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250415175410.2944032-4-xin@zytor.com
2025-04-15 | x86/cpufeatures: Shorten X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT | Xin Li (Intel)
Shorten X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT to X86_FEATURE_CLEAR_BHB_VMEXIT to make the last column aligned consistently in the whole file. There's no need to explain in the name what the mitigation does. No functional changes. Suggested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250415175410.2944032-3-xin@zytor.com
2025-04-15 | x86/cpufeatures: Clean up formatting | Borislav Petkov (AMD)
It is a special file with special formatting, so fix one instance of whitespace damage and format newer defines like the rest. No functional changes. [ Xin: Do the same to tools/arch/x86/include/asm/cpufeatures.h. ] Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250415175410.2944032-2-xin@zytor.com
2025-04-14 | x86/bugs: Remove X86_BUG_MMIO_UNKNOWN | Borislav Petkov (AMD)
Whack this thing because:

 - the "unknown" handling is done only for this vuln and not for the others

 - it doesn't do anything besides reporting things differently. It doesn't apply any mitigations

 - it is simply causing unnecessary complications to the code which don't bring anything besides maintenance overhead to what is already a very nasty spaghetti pile

 - all the currently unaffected CPUs can also be in "unknown" status so there's no need for special handling here

so get rid of it. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: David Kaplan <david.kaplan@amd.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://lore.kernel.org/r/20250414150951.5345-1-bp@kernel.org
2025-04-14 | x86/cpuid: Align macro linebreaks vertically | Borislav Petkov (AMD)
Align the line-continuation backslashes vertically again, after recent cleanups. No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Ahmed S. Darwish <darwi@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250414094130.6768-1-bp@kernel.org
2025-04-14 | x86/platform/amd: Move the <asm/amd_node.h> header to <asm/amd/node.h> | Ingo Molnar
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-7-mingo@kernel.org
2025-04-14 | x86/platform/amd: Clean up the <asm/amd/hsmp.h> header guards a bit | Ingo Molnar
- There's no need for a newline after the SPDX line
- But there's a need for one before the closing header guard.

Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: Carlos Bilbao <carlos.bilbao@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Cc: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com> Link: https://lore.kernel.org/r/20250413084144.3746608-6-mingo@kernel.org
2025-04-14 | x86/platform/amd: Move the <asm/amd_hsmp.h> header to <asm/amd/hsmp.h> | Ingo Molnar
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: Carlos Bilbao <carlos.bilbao@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Cc: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com> Link: https://lore.kernel.org/r/20250413084144.3746608-5-mingo@kernel.org
2025-04-14 | x86/platform/amd: Move the <asm/amd_nb.h> header to <asm/amd/nb.h> | Ingo Molnar
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-4-mingo@kernel.org
2025-04-14 | x86/platform/amd: Add standard header guards to <asm/amd/ibs.h> | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-3-mingo@kernel.org
2025-04-14 | x86/platform/amd: Move the <asm/amd-ibs.h> header to <asm/amd/ibs.h> | Ingo Molnar
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-2-mingo@kernel.org
2025-04-14 | x86/fpu: Use 'fpstate' variable names consistently | Ingo Molnar
A few uses of 'fps' snuck in, which is rather confusing (to me) as it suggests frames-per-second. ;-) Rename them to the canonical 'fpstate' name. No change in functionality. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250409211127.3544993-9-mingo@kernel.org
2025-04-14 | x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks | Ingo Molnar
init_task's FPU state initialization was a bit of a hack:

	__x86_init_fpu_begin = .;
	. = __x86_init_fpu_begin + 128*PAGE_SIZE;
	__x86_init_fpu_end = .;

But the init task isn't supposed to be using the FPU context in any case, so remove the hack and add in some debug warnings. As Linus noted in the discussion, the init task (and other PF_KTHREAD tasks) *can* use the FPU via kernel_fpu_begin()/_end(), but they don't need the context area because their FPU use is not preemptible or reentrant, and they don't return to user-space. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250409211127.3544993-8-mingo@kernel.org
2025-04-14 | x86/fpu: Push 'fpu' pointer calculation into the fpu__drop() call | Ingo Molnar
This encapsulates the fpu__drop() functionality better, and it will also enable other changes that want to check a task for PF_KTHREAD before calling x86_task_fpu(). Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250409211127.3544993-6-mingo@kernel.org
2025-04-14 | x86/fpu: Remove the thread::fpu pointer | Ingo Molnar
As suggested by Oleg, remove the thread::fpu pointer, as we can calculate it via x86_task_fpu() at compile-time. This improves code generation a bit:

	kepler:~/tip> size vmlinux.before vmlinux.after
	    text      data     bss      dec      hex filename
	26475405  10435342 1740804 38651551  24dc69f vmlinux.before
	26475339  10959630 1216516 38651485  24dc65d vmlinux.after

Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250409211127.3544993-5-mingo@kernel.org
2025-04-14 | x86/fpu: Make task_struct::thread constant size | Ingo Molnar
Turn thread.fpu into a pointer. Since most FPU code internals work by passing around the FPU pointer already, the code generation impact is small. This allows us to remove the old kludge of task_struct being variable size:

	struct task_struct {
		...
		/*
		 * New fields for task_struct should be added above here, so that
		 * they are included in the randomized portion of task_struct.
		 */
		randomized_struct_fields_end

		/* CPU-specific state of this task: */
		struct thread_struct		thread;

		/*
		 * WARNING: on x86, 'thread_struct' contains a variable-sized
		 * structure. It *MUST* be at the end of 'task_struct'.
		 *
		 * Do not put anything below here!
		 */
	};

... which creates a number of problems, such as requiring thread_struct to be the last member of the struct - not allowing it to be struct-randomized, etc. But the primary motivation is to allow the decoupling of task_struct from hardware details (<asm/processor.h> in particular), and to eventually allow the per-task infrastructure:

	DECLARE_PER_TASK(type, name);
	...
	per_task(current, name) = val;

... which requires task_struct to be a constant size struct. The fpu_thread_struct_whitelist() quirk to hardened usercopy can be removed, now that the FPU structure is not embedded in the task struct anymore, which reduces text footprint a bit. Fixed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250409211127.3544993-4-mingo@kernel.org
2025-04-14 | x86/fpu: Convert task_struct::thread.fpu accesses to use x86_task_fpu() | Ingo Molnar
This will make the removal of the task_struct::thread.fpu array easier. No change in functionality - code generated before and after this commit is identical on x86-defconfig:

	kepler:~/tip> diff -up vmlinux.before.asm vmlinux.after.asm
	kepler:~/tip>

Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250409211127.3544993-3-mingo@kernel.org
2025-04-14 | x86/fpu: Introduce the x86_task_fpu() helper method | Ingo Molnar
The per-task FPU context/save area is allocated right next to task_struct, currently in a variable-size array via task_struct::thread.fpu[], but we plan to fully hide it from the C type scope. Introduce the x86_task_fpu() accessor that gets to the FPU context pointer explicitly from the task pointer. Right now this is a simple (task)->thread.fpu wrapper. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250409211127.3544993-2-mingo@kernel.org
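Per the commit text, the initial form is just a trivial wrapper (sketch):

	/* a plain accessor for now; later patches move the FPU area
	 * out of task_struct entirely */
	#define x86_task_fpu(task)	(&(task)->thread.fpu)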
2025-04-13 | x86/uaccess: Use asm_inline() instead of asm() in __untagged_addr() | Uros Bizjak
Use asm_inline() to instruct the compiler that the size of asm() is the minimum size of one instruction, ignoring how many instructions the compiler thinks it is. The ALTERNATIVE macro that expands to several pseudo directives causes the instruction length estimate to count more than 20 instructions. bloat-o-meter reports minimal code size increase (x86_64 defconfig with CONFIG_ADDRESS_MASKING, gcc-14.2.1):

	add/remove: 2/2 grow/shrink: 5/1 up/down: 2365/-1995 (370)
	Function                           old     new   delta
	------------------------------------------------------
	do_get_mempolicy                     -    1449   +1449
	copy_nodes_to_user                   -     226    +226
	__x64_sys_get_mempolicy             35     213    +178
	syscall_user_dispatch_set_config   157     332    +175
	__ia32_sys_get_mempolicy            31     206    +175
	set_syscall_user_dispatch           29     181    +152
	__do_sys_mremap                   2073    2083     +10
	sp_insert                          133     117     -16
	task_set_syscall_user_dispatch     172       -    -172
	kernel_get_mempolicy              1807       -   -1807
	Total: Before=21423151, After=21423521, chg +0.00%

The code size increase is due to the compiler inlining more functions that inline untagged_addr(), e.g. task_set_syscall_user_dispatch() is now fully inlined in set_syscall_user_dispatch():

	000000000010b7e0 <set_syscall_user_dispatch>:
	  10b7e0: f3 0f 1e fa           endbr64
	  10b7e4: 49 89 c8              mov    %rcx,%r8
	  10b7e7: 48 89 d1              mov    %rdx,%rcx
	  10b7ea: 48 89 f2              mov    %rsi,%rdx
	  10b7ed: 48 89 fe              mov    %rdi,%rsi
	  10b7f0: 65 48 8b 3d 00 00 00  mov    %gs:0x0(%rip),%rdi
	  10b7f7: 00
	  10b7f8: e9 03 fe ff ff        jmp    10b600 <task_set_syscall_user_dispatch>

that after inlining becomes:

	000000000010b730 <set_syscall_user_dispatch>:
	  10b730: f3 0f 1e fa           endbr64
	  10b734: 65 48 8b 05 00 00 00  mov    %gs:0x0(%rip),%rax
	  10b73b: 00
	  10b73c: 48 85 ff              test   %rdi,%rdi
	  10b73f: 74 54                 je     10b795 <set_syscall_user_dispatch+0x65>
	  10b741: 48 83 ff 01           cmp    $0x1,%rdi
	  10b745: 74 06                 je     10b74d <set_syscall_user_dispatch+0x1d>
	  10b747: b8 ea ff ff ff        mov    $0xffffffea,%eax
	  10b74c: c3                    ret
	  10b74d: 48 85 f6              test   %rsi,%rsi
	  10b750: 75 7b                 jne    10b7cd <set_syscall_user_dispatch+0x9d>
	  10b752: 48 85 c9              test   %rcx,%rcx
	  10b755: 74 1a                 je     10b771 <set_syscall_user_dispatch+0x41>
	  10b757: 48 89 cf              mov    %rcx,%rdi
	  10b75a: 49 b8 ef cd ab 89 67  movabs $0x123456789abcdef,%r8
	  10b761: 45 23 01
	  10b764: 90                    nop
	  10b765: 90                    nop
	  10b766: 90                    nop
	  10b767: 90                    nop
	  10b768: 90                    nop
	  10b769: 90                    nop
	  10b76a: 90                    nop
	  10b76b: 90                    nop
	  10b76c: 49 39 f8              cmp    %rdi,%r8
	  10b76f: 72 6e                 jb     10b7df <set_syscall_user_dispatch+0xaf>
	  10b771: 48 89 88 48 08 00 00  mov    %rcx,0x848(%rax)
	  10b778: 48 89 b0 50 08 00 00  mov    %rsi,0x850(%rax)
	  10b77f: 48 89 90 58 08 00 00  mov    %rdx,0x858(%rax)
	  10b786: c6 80 60 08 00 00 00  movb   $0x0,0x860(%rax)
	  10b78d: f0 80 48 08 20        lock orb $0x20,0x8(%rax)
	  10b792: 31 c0                 xor    %eax,%eax
	  10b794: c3                    ret
	  10b795: 48 09 d1              or     %rdx,%rcx
	  10b798: 48 09 f1              or     %rsi,%rcx
	  10b79b: 75 aa                 jne    10b747 <set_syscall_user_dispatch+0x17>
	  10b79d: 48 c7 80 48 08 00 00  movq   $0x0,0x848(%rax)
	  10b7a4: 00 00 00 00
	  10b7a8: 48 c7 80 50 08 00 00  movq   $0x0,0x850(%rax)
	  10b7af: 00 00 00 00
	  10b7b3: 48 c7 80 58 08 00 00  movq   $0x0,0x858(%rax)
	  10b7ba: 00 00 00 00
	  10b7be: c6 80 60 08 00 00 00  movb   $0x0,0x860(%rax)
	  10b7c5: f0 80 60 08 df        lock andb $0xdf,0x8(%rax)
	  10b7ca: 31 c0                 xor    %eax,%eax
	  10b7cc: c3                    ret
	  10b7cd: 48 8d 3c 16           lea    (%rsi,%rdx,1),%rdi
	  10b7d1: 48 39 fe              cmp    %rdi,%rsi
	  10b7d4: 0f 82 78 ff ff ff     jb     10b752 <set_syscall_user_dispatch+0x22>
	  10b7da: e9 68 ff ff ff        jmp    10b747 <set_syscall_user_dispatch+0x17>
	  10b7df: b8 f2 ff ff ff        mov    $0xfffffff2,%eax
	  10b7e4: c3                    ret

Please note a series of NOPs that get replaced with an alternative:

	11f0: 65 48 23 05 00 00 00  and    %gs:0x0(%rip),%rax
	11f7: 00

Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250407072129.33440-1-ubizjak@gmail.com
2025-04-13 | x86/msr: Add compatibility wrappers for rdmsrl()/wrmsrl() | Ingo Molnar
To reduce the impact of the API renames in -next, add compatibility wrappers for the two most popular MSR access APIs: rdmsrl() and wrmsrl(). Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org
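A sketch of what such wrappers look like, assuming the renamed APIs in -next are rdmsrq()/wrmsrq() (the new names are an assumption here, not stated in the commit):

	/* compatibility shims so not-yet-converted callers keep building */
	#define rdmsrl(msr, val)	rdmsrq(msr, val)
	#define wrmsrl(msr, val)	wrmsrq(msr, val)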
2025-04-13 | objtool, x86/hweight: Remove ANNOTATE_IGNORE_ALTERNATIVE | Josh Poimboeuf
Since objtool's inception, frame pointer warnings have been manually silenced for __arch_hweight*() to allow those functions' inline asm to avoid using ASM_CALL_CONSTRAINT. The potentially dubious reasoning for that decision over nine years ago was that since !X86_FEATURE_POPCNT is exceedingly rare, it's not worth hurting the code layout for a function call that will never happen on the vast majority of systems. However, those functions actually started using ASM_CALL_CONSTRAINT with the following commit: 194a613088a8 ("x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()") And rightfully so, as it makes the code correct. ASM_CALL_CONSTRAINT will soon have no effect for non-FP configs anyway. With ASM_CALL_CONSTRAINT in place, ANNOTATE_IGNORE_ALTERNATIVE no longer has a purpose for the hweight functions. Remove it. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/e7070dba3278c90f1a836b16157dcd34ccd21e21.1744318586.git.jpoimboe@kernel.org
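A simplified sketch of the pattern (register constraints are illustrative, not the exact kernel operands): since the alternative's fallback is a call, the asm carries ASM_CALL_CONSTRAINT so the compiler knows the stack may be used.

	static __always_inline unsigned int hweight32_sketch(unsigned int w)
	{
		unsigned int res;

		/* either a call to the software fallback or a single popcnt */
		asm (ALTERNATIVE("call __sw_hweight32",
				 "popcntl %1, %0", X86_FEATURE_POPCNT)
		     : "=a" (res), ASM_CALL_CONSTRAINT
		     : "D" (w));
		return res;
	}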
2025-04-13 | x86/percpu: Refer __percpu_prefix to __force_percpu_prefix | Uros Bizjak
Refer __percpu_prefix to __force_percpu_prefix to avoid duplicate definition. While there, slightly reorder definitions to a more logical sequence, remove unneeded double quotes and move misplaced comment to the right place. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lore.kernel.org/r/20250411093130.81389-1-ubizjak@gmail.com
2025-04-12 | x86/sev: Prepare for splitting off early SEV code | Ard Biesheuvel
Prepare for splitting off parts of the SEV core.c source file into a file that carries code that must tolerate being called from the early 1:1 mapping. This will allow special build-time handling of this code, to ensure that it gets generated in a way that is compatible with the early execution context. So create a de-facto internal SEV API and put the definitions into sev-internal.h. No attempt is made to allow this header file to be included in arbitrary other sources - this is explicitly not the intent. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Dionna Amalie Glaze <dionnaglaze@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Loughlin <kevinloughlin@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-efi@vger.kernel.org Link: https://lore.kernel.org/r/20250410134117.3713574-20-ardb+git@google.com
2025-04-12 | x86/boot: Drop RIP_REL_REF() uses from SME startup code | Ard Biesheuvel
RIP_REL_REF() has no effect on code residing in arch/x86/boot/startup, as it is built with -fPIC. So remove any occurrences from the SME startup code. Note that SME is the only caller of cc_set_mask() that requires this, so drop it from there as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Dionna Amalie Glaze <dionnaglaze@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Loughlin <kevinloughlin@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-efi@vger.kernel.org Link: https://lore.kernel.org/r/20250410134117.3713574-19-ardb+git@google.com
2025-04-12 | x86/asm: Make rip_rel_ptr() usable from fPIC code | Ard Biesheuvel
RIP_REL_REF() is used in non-PIC C code that is called very early, before the kernel virtual mapping is up, which is the mapping that the linker expects. It is currently used in two different ways:

 - to refer to the value of a global variable, including as an lvalue in assignments;

 - to take the address of a global variable via the mapping that the code currently executes at.

The former case is only needed in non-PIC code, as PIC code will never use absolute symbol references when the address of the symbol is not being used. But taking the address of a variable in PIC code may still require extra care, as a stack allocated struct assignment may be emitted as a memcpy() from a statically allocated copy in .rodata. For instance, this

	void startup_64_setup_gdt_idt(void)
	{
		struct desc_ptr startup_gdt_descr = {
			.address = (__force unsigned long)gdt_page.gdt,
			.size    = GDT_SIZE - 1,
		};

may result in an absolute symbol reference in PIC code, even though the struct is allocated on the stack and populated at runtime. To address this case, make rip_rel_ptr() accessible in PIC code, and update any existing uses where the address of a global variable is taken using RIP_REL_REF. Once all code of this nature has been moved into arch/x86/boot/startup and built with -fPIC, RIP_REL_REF() can be retired, and only rip_rel_ptr() will remain. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Dionna Amalie Glaze <dionnaglaze@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kevin Loughlin <kevinloughlin@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-efi@vger.kernel.org Link: https://lore.kernel.org/r/20250410134117.3713574-14-ardb+git@google.com
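For reference, the helper itself is a one-liner that forces the address to be materialized RIP-relatively (a sketch of its shape):

	static __always_inline void *rip_rel_ptr(void *p)
	{
		/* emit "lea sym(%rip), reg" rather than an absolute reference */
		asm("leaq %c1(%%rip), %0" : "=r" (p) : "i" (p));
		return p;
	}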
2025-04-12 | x86/mm: Opt-in to IRQs-off activate_mm() | Andy Lutomirski
We gain nothing by having the core code enable IRQs right before calling activate_mm() only for us to turn them right back off again in switch_mm(). This will save a few cycles, so execve() should be blazingly fast with this patch applied! Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20250402094540.3586683-8-mingo@kernel.org
2025-04-12 | x86/mm: Remove 'mm' argument from unuse_temporary_mm() again | Peter Zijlstra
Now that unuse_temporary_mm() lives in tlb.c it can access cpu_tlbstate.loaded_mm. [ mingo: Merged it on top of x86/alternatives ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20250402094540.3586683-5-mingo@kernel.org
2025-04-12 | x86/mm: Make use_/unuse_temporary_mm() non-static | Andy Lutomirski
This prepares them for use outside of the alternative machinery. The code is unchanged. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20250402094540.3586683-4-mingo@kernel.org
2025-04-11 | x86/cpuid: Remove obsolete CPUID(0x2) iteration macro | Ahmed S. Darwish
The CPUID(0x2) cache descriptors iterator at <cpuid/leaf_0x2_api.h>: for_each_leaf_0x2_desc() has no more call sites. Remove it. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250411070401.1358760-2-darwi@linutronix.de