summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)Author
3 daysMerge tag 'x86_sgx_for_6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel software guard extension (SGX) updates from Dave Hansen: "A couple of x86/sgx changes. The first one is a no-brainer to use the (simple) SHA-256 library. For the second one, some folks doing testing noticed that SGX systems under memory pressure were inducing fatal machine checks at pretty unnerving rates, despite the SGX code having _some_ awareness of memory poison. It turns out that the SGX reclaim path was not checking for poison _and_ it always accesses memory to copy it around. Make sure that poisoned pages are not reclaimed" * tag 'x86_sgx_for_6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sgx: Prevent attempts to reclaim poisoned pages x86/sgx: Use SHA-256 library API instead of crypto_shash API
5 daysMerge tag 'x86_mtrr_for_v6.16_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull mtrr update from Borislav Petkov: "A single change to verify the presence of fixed MTRR ranges before accessing the respective MSRs" * tag 'x86_mtrr_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mtrr: Check if fixed-range MTRRs exist in mtrr_save_fixed_ranges()
5 daysMerge tag 'x86_cache_for_v6.16_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: "Carve out the resctrl filesystem-related code into fs/resctrl/ so that multiple architectures can share the fs API for manipulating their respective hw resource control implementation. This is the second step in the work towards sharing the resctrl filesystem interface, the next one being plugging ARM's MPAM into the aforementioned fs API" * tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) MAINTAINERS: Add reviewers for fs/resctrl x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrl x86/resctrl: Always initialise rid field in rdt_resources_all[] x86/resctrl: Relax some asm #includes x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context() x86/resctrl: Squelch whitespace anomalies in resctrl core code x86/resctrl: Move pseudo lock prototypes to include/linux/resctrl.h x86/resctrl: Fix types in resctrl_arch_mon_ctx_{alloc,free}() stubs x86/resctrl: Move enum resctrl_event_id to resctrl.h x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl fs/resctrl: Add boiler plate for external resctrl code x86/resctrl: Add 'resctrl' to the title of the resctrl documentation x86/resctrl: Split trace.h x86/resctrl: Expand the width of domid by replacing mon_data_bits x86/resctrl: Add end-marker to the resctrl_event_id enum x86/resctrl: Move is_mba_sc() out of core.c x86/resctrl: Drop __init/__exit on assorted symbols x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount point x86/resctrl: Check all domains are offline in resctrl_exit() x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" ...
6 daysMerge tag 'x86-cleanups-2025-05-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc x86 cleanups: kernel-doc updates and a string API transition patch" * tag 'x86-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/power: hibernate: Fix W=1 build kernel-doc warnings x86/mm/pat: Fix W=1 build kernel-doc warning x86/CPU/AMD: Replace strcpy() with strscpy()
11 daysx86/bugs: Fix spectre_v2 mitigation default on IntelPawan Gupta
Commit 480e803dacf8 ("x86/bugs: Restructure spectre_v2 mitigation") inadvertently changed the spectre-v2 mitigation default from eIBRS to IBRS on Intel. While splitting the spectre_v2 mitigation in select/update/apply functions, eIBRS and IBRS selection logic was separated in select and update. This caused IBRS selection to not consider that eIBRS mitigation is already selected, fix it. Fixes: 480e803dacf8 ("x86/bugs: Restructure spectre_v2 mitigation") Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250520-eibrs-fix-v1-1-91bacd35ed09@linux.intel.com
12 daysx86/bugs: Restructure ITS mitigationDavid Kaplan
Restructure the ITS mitigation to use select/update/apply functions like the other mitigations. There is a particularly complex interaction between ITS and Retbleed as CDT (Call Depth Tracking) is a mitigation for both, and either its=stuff or retbleed=stuff will attempt to enable CDT. retbleed_update_mitigation() runs first and will check the necessary pre-conditions for CDT if either ITS or Retbleed stuffing is selected. If checks pass and ITS stuffing is selected, it will select stuffing for Retbleed as well. its_update_mitigation() runs after and will either select stuffing if retbleed stuffing was enabled, or fall back to the default (aligned thunks) if stuffing could not be enabled. Enablement of CDT is done exclusively in retbleed_apply_mitigation(). its_apply_mitigation() is only used to enable aligned thunks. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/20250516193212.128782-1-david.kaplan@amd.com
12 daysMerge tag 'v6.15-rc7' into x86/core, to pick up fixesIngo Molnar
Pick up build fixes from upstream to make this tree more testable. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-17Merge tag 'x86-urgent-2025-05-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix SEV-SNP kdump bugs - Update the email address of Alexey Makhalov in MAINTAINERS - Add the CPU feature flag for the Zen6 microarchitecture - Fix typo in system message * tag 'x86-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Remove duplicated word in warning message x86/CPU/AMD: Add X86_FEATURE_ZEN6 x86/sev: Make sure pages are not skipped during kdump x86/sev: Do not touch VMSA pages during SNP guest memory kdump MAINTAINERS: Update Alexey Makhalov's email address x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL macro
2025-05-17x86/bugs: Fix indentation due to ITS mergeBorislav Petkov (AMD)
No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-16x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrlJames Morse
Resctrl is a filesystem interface to hardware that provides cache allocation policy and bandwidth control for groups of tasks or CPUs. To support more than one architecture, resctrl needs to live in /fs/. Move the code that is concerned with the filesystem interface to /fs/resctrl. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-25-james.morse@arm.com
2025-05-16x86/resctrl: Always initialise rid field in rdt_resources_all[]James Morse
x86 has an array, rdt_resources_all[], of all possible resources. The for-each-resource walkers depend on the rid field of all resources being initialised. If the array ever grows due to another architecture adding a resource type that is not defined on x86, the for-each-resources walkers will loop forever. Initialise all the rid values in resctrl_arch_late_init() before any for-each-resource walker can be called. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-24-james.morse@arm.com
2025-05-16x86/resctrl: Relax some asm #includesDave Martin
checkpatch.pl identifies some direct #includes of asm headers that can be satisfied by including the corresponding <linux/...> header instead. Fix them. No intentional functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-23-james.morse@arm.com
2025-05-16x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context()Dave Martin
rdt_init_fs_context() sizes a typed allocation using an explicit sizeof(type) expression, which checkpatch.pl complains about. Since this code is about to be factored out and made generic, this is a good opportunity to fix the code to size the allocation based on the target pointer instead, to reduce the chance of future mis- maintenance. Fix it. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-22-james.morse@arm.com
2025-05-16x86/resctrl: Squelch whitespace anomalies in resctrl core codeDave Martin
checkpatch.pl complains about some whitespace anomalies in the resctrl core code. This doesn't matter, but since this code is about to be factored out and made generic, this is a good opportunity to fix these issues and so reduce future checkpatch fuzz. Fix them. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-21-james.morse@arm.com
2025-05-16x86/resctrl: Move the filesystem bits to headers visible to fs/resctrlJames Morse
Once the filesystem parts of resctrl move to fs/resctrl, it cannot rely on definitions in x86's internal.h. Move definitions in internal.h that need to be shared between the filesystem and architecture code to header files that fs/resctrl can include. Doing this separately means the filesystem code only moves between files of the same name, instead of having these changes mixed in too. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-17-james.morse@arm.com
2025-05-16fs/resctrl: Add boiler plate for external resctrl codeJames Morse
Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL for the common parts of the resctrl interface and make X86_CPU_RESCTRL select this. Adding an include of asm/resctrl.h to linux/resctrl.h allows the /fs/resctrl files to switch over to using this header instead. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-16-james.morse@arm.com
2025-05-16x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameterAhmed S. Darwish
The CPUID(0x2) descriptors iterator has been renamed from: for_each_leaf_0x2_entry() to: for_each_cpuid_0x2_desc() since it iterates over CPUID(0x2) cache and TLB "descriptors", not "entries". In the macro's x86/cpu call-site, rename the parameter denoting the parsed descriptor at each iteration from 'entry' to 'desc'. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-8-darwi@linutronix.de
2025-05-16x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameterAhmed S. Darwish
The CPUID(0x2) descriptors iterator has been renamed from: for_each_leaf_0x2_entry() to: for_each_cpuid_0x2_desc() since it iterates over CPUID(0x2) cache and TLB "descriptors", not "entries". In the macro's x86/cacheinfo call-site, rename the parameter denoting the parsed descriptor at each iteration from 'entry' to 'desc'. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-7-darwi@linutronix.de
2025-05-16x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()Ahmed S. Darwish
Rename the CPUID(0x2) register accessor function: cpuid_get_leaf_0x2_regs(regs) to: cpuid_leaf_0x2(regs) for consistency with other <cpuid/api.h> accessors that return full CPUID registers outputs like: cpuid_leaf(regs) cpuid_subleaf(regs) In the same vein, rename the CPUID(0x2) iteration macro: for_each_leaf_0x2_entry() to: for_each_cpuid_0x2_desc() to include "cpuid" in the macro name, and since what is iterated upon is CPUID(0x2) cache and TLB "descriptos", not "entries". Prefix an underscore to that iterator macro parameters, so that the newly renamed 'desc' parameter do not get mixed with "union leaf_0x2_regs :: desc[]" in the macro's implementation. Adjust all the affected call-sites accordingly. While at it, use "CPUID(0x2)" instead of "CPUID leaf 0x2" as this is the recommended style. No change in functionality intended. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-6-darwi@linutronix.de
2025-05-16x86/resctrl: Split trace.hJames Morse
trace.h contains all the tracepoints. After the move to /fs/resctrl, some of these will be left behind. All the pseudo_lock tracepoints remain part of the architecture. The lone tracepoint in monitor.c moves to /fs/resctrl. Split trace.h so that each C file includes a different trace header file. This means the trace header files are not modified when they are moved. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-14-james.morse@arm.com
2025-05-16x86/resctrl: Expand the width of domid by replacing mon_data_bitsJames Morse
MPAM platforms retrieve the cache-id property from the ACPI PPTT table. The cache-id field is 32 bits wide. Under resctrl, the cache-id becomes the domain-id, and is packed into the mon_data_bits union bitfield. The width of cache-id in this field is 14 bits. Expanding the union would break 32bit x86 platforms as this union is stored as the kernfs kn->priv pointer. This saved allocating memory for the priv data storage. The firmware on MPAM platforms have used the PPTT cache-id field to expose the interconnect's id for the cache, which is sparse and uses more than 14 bits. Use of this id is to enable PCIe direct cache injection hints. Using this feature with VFIO means the value provided by the ACPI table should be exposed to user-space. To support cache-id values greater than 14 bits, convert the mon_data_bits union to a structure. These are shared between control and monitor groups, and are allocated on first use. The list of allocated struct mon_data is free'd when the filesystem is umount()ed. Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-13-james.morse@arm.com
2025-05-16x86/resctrl: Add end-marker to the resctrl_event_id enumJames Morse
The resctrl_event_id enum gives names to the counter event numbers on x86. These are used directly by resctrl. To allow the MPAM driver to keep an array of these the size of the enum needs to be known. Add a 'num_events' enum entry which can be used to size an array. This is added to the enum to reduce conflicts with another series, which in turn requires get_arch_mbm_state() to have a default case. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-12-james.morse@arm.com
2025-05-16x86/resctrl: Move is_mba_sc() out of core.cJames Morse
is_mba_sc() is defined in core.c, but has no callers there. It does not access any architecture private structures. Move this to rdtgroup.c where the majority of callers are. This makes the move of the filesystem code to /fs/ cleaner. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-11-james.morse@arm.com
2025-05-15x86/resctrl: Drop __init/__exit on assorted symbolsJames Morse
Because ARM's MPAM controls are probed using MMIO, resctrl can't be initialised until enough CPUs are online to have determined the system-wide supported num_closid. Arm64 also supports 'late onlined secondaries', where only a subset of CPUs are online during boot. These two combine to mean the MPAM driver may not be able to initialise resctrl until user-space has brought 'enough' CPUs online. To allow MPAM to initialise resctrl after __init text has been free'd, remove all the __init markings from resctrl. The existing __exit markings cause these functions to be removed by the linker as it has never been possible to build resctrl as a module. MPAM has an error interrupt which causes the driver to reset and disable itself. Remove the __exit markings to allow the MPAM driver to tear down resctrl when an error occurs. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-10-james.morse@arm.com
2025-05-15x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount pointJames Morse
resctrl_exit() was intended for use when the 'resctrl' module was unloaded. resctrl can't be built as a module, and the kernfs helpers are not exported so this is unlikely to change. MPAM has an error interrupt which indicates the MPAM driver has gone haywire. Should this occur tasks could run with the wrong control values, leading to bad performance for important tasks. In this scenario the MPAM driver will reset the hardware, but it needs a way to tell resctrl that no further configuration should be attempted. In particular, moving tasks between control or monitor groups does not interact with the architecture code, so there is no opportunity for the arch code to indicate that the hardware is no-longer functioning. Using resctrl_exit() for this leaves the system in a funny state as resctrl is still mounted, but cannot be un-mounted because the sysfs directory that is typically used has been removed. Dave Martin suggests this may cause systemd trouble in the future as not all filesystems can be unmounted. Add calls to remove all the files and directories in resctrl, and remove the sysfs_remove_mount_point() call that leaves the system in a funny state. When triggered, this causes all the resctrl files to disappear. resctrl can be unmounted, but not mounted again. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-9-james.morse@arm.com
2025-05-15x86/resctrl: Check all domains are offline in resctrl_exit()James Morse
resctrl_exit() removes things like the resctrl mount point directory and unregisters the filesystem prior to freeing data structures that were allocated during resctrl_init(). This assumes that there are no online domains when resctrl_exit() is called. If any domain were online, the limbo or overflow handler could be scheduled to run. Add a check for any online control or monitor domains, and document that the architecture code is required to offline all monitor and control domains before calling resctrl_exit(). Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-8-james.morse@arm.com
2025-05-15x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"James Morse
resctrl_sched_in() loads the architecture specific CPU MSRs with the CLOSID and RMID values. This function was named before resctrl was split to have architecture specific code, and generic filesystem code. This function is obviously architecture specific, but does not begin with 'resctrl_arch_', making it the odd one out in the functions an architecture needs to support to enable resctrl. Rename it for consistency. This is purely cosmetic. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-7-james.morse@arm.com
2025-05-15x86/resctrl: Remove the limit on the number of CLOSIDAmit Singh Tomar
Resctrl allocates and finds free CLOSID values using the bits of a u32. This restricts the number of control groups that can be created by user-space. MPAM has an architectural limit of 2^16 CLOSID values, Intel x86 could be extended beyond 32 values. There is at least one MPAM platform which supports more than 32 CLOSID values. Replace the fixed size bitmap with calls to the bitmap API to allocate an array of a sufficient size. ffs() returns '1' for bit 0, hence the existing code subtracts 1 from the index to get the CLOSID value. find_first_bit() returns the bit number which does not need adjusting. [ morse: fixed the off-by-one in the allocator and the wrong not-found value. Removed the limit. Rephrase the commit message. ] Signed-off-by: Amit Singh Tomar <amitsinght@marvell.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-6-james.morse@arm.com
2025-05-15x86/resctrl: Optimize cpumask_any_housekeeping()Yury Norov [NVIDIA]
With the lack of cpumask_any_andnot_but(), cpumask_any_housekeeping() has to abuse cpumask_nth() functions. Update cpumask_any_housekeeping() to use the new cpumask_any_but() and cpumask_any_andnot_but(). These two functions understand RESCTRL_PICK_ANY_CPU, which simplifies cpumask_any_housekeeping() significantly. Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: James Morse <james.morse@arm.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-5-james.morse@arm.com
2025-05-15x86/sgx: Prevent attempts to reclaim poisoned pagesAndrew Zaborowski
TL;DR: SGX page reclaim touches the page to copy its contents to secondary storage. SGX instructions do not gracefully handle machine checks. Despite this, the existing SGX code will try to reclaim pages that it _knows_ are poisoned. Avoid even trying to reclaim poisoned pages. The longer story: Pages used by an enclave only get epc_page->poison set in arch_memory_failure() but they currently stay on sgx_active_page_list until sgx_encl_release(), with the SGX_EPC_PAGE_RECLAIMER_TRACKED flag untouched. epc_page->poison is not checked in the reclaimer logic meaning that, if other conditions are met, an attempt will be made to reclaim an EPC page that was poisoned. This is bad because 1. we don't want that page to end up added to another enclave and 2. it is likely to cause one core to shut down and the kernel to panic. Specifically, reclaiming uses microcode operations including "EWB" which accesses the EPC page contents to encrypt and write them out to non-SGX memory. Those operations cannot handle MCEs in their accesses other than by putting the executing core into a special shutdown state (affecting both threads with HT.) The kernel will subsequently panic on the remaining cores seeing the core didn't enter MCE handler(s) in time. Call sgx_unmark_page_reclaimable() to remove the affected EPC page from sgx_active_page_list on memory error to stop it being considered for reclaiming. Testing epc_page->poison in sgx_reclaim_pages() would also work but I assume it's better to add code in the less likely paths. The affected EPC page is not added to &node->sgx_poison_page_list until later in sgx_encl_release()->sgx_free_epc_page() when it is EREMOVEd. Membership on other lists doesn't change to avoid changing any of the lists' semantics except for sgx_active_page_list. There's a "TBD" comment in arch_memory_failure() about pre-emptive actions, the goal here is not to address everything that it may imply. This also doesn't completely close the time window when a memory error notification will be fatal (for a not previously poisoned EPC page) -- the MCE can happen after sgx_reclaim_pages() has selected its candidates or even *inside* a microcode operation (actually easy to trigger due to the amount of time spent in them.) The spinlock in sgx_unmark_page_reclaimable() is safe because memory_failure() runs in process context and no spinlocks are held, explicitly noted in a mm/memory-failure.c comment. Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: balrogg@gmail.com Cc: linux-sgx@vger.kernel.org Link: https://lore.kernel.org/r/20250508230429.456271-1-andrew.zaborowski@intel.com
2025-05-15x86/cpuid: Rename have_cpuid_p() to cpuid_feature()Ahmed S. Darwish
In order to let all the APIs under <cpuid/api.h> have a shared "cpuid_" namespace, rename have_cpuid_p() to cpuid_feature(). Adjust all call-sites accordingly. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-4-darwi@linutronix.de
2025-05-15x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID headerAhmed S. Darwish
The main CPUID header <asm/cpuid.h> was originally a storefront for the headers: <asm/cpuid/api.h> <asm/cpuid/leaf_0x2_api.h> Now that the latter CPUID(0x2) header has been merged into the former, there is no practical difference between <asm/cpuid.h> and <asm/cpuid/api.h>. Migrate all users to the <asm/cpuid/api.h> header, in preparation of the removal of <asm/cpuid.h>. Don't remove <asm/cpuid.h> just yet, in case some new code in -next started using it. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-3-darwi@linutronix.de
2025-05-13x86/CPU/AMD: Add X86_FEATURE_ZEN6Yazen Ghannam
Add a synthetic feature flag for Zen6. [ bp: Move the feature flag to a free slot and avoid future merge conflicts from incoming stuff. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250513204857.3376577-1-yazen.ghannam@amd.com
2025-05-13x86/bugs: Fix SRSO reporting on Zen1/2 with SMT disabledBorislav Petkov (AMD)
1f4bb068b498 ("x86/bugs: Restructure SRSO mitigation") does this: if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) { setup_force_cpu_cap(X86_FEATURE_SRSO_NO); srso_mitigation = SRSO_MITIGATION_NONE; return; } and, in particular, sets srso_mitigation to NONE. This leads to reporting Speculative Return Stack Overflow: Vulnerable on Zen2 machines. There's a far bigger confusion with what SRSO_NO means and how it is used in the code but this will be a matter of future fixes and restructuring to how the SRSO mitigation gets determined. Fix the reporting issue for now. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: David Kaplan <david.kaplan@amd.com> Link: https://lore.kernel.org/20250513110405.15872-1-bp@kernel.org
2025-05-13Merge commit 'its-for-linus-20250509-merge' into x86/core, to resolve conflictsIngo Molnar
Conflicts: Documentation/admin-guide/hw-vuln/index.rst arch/x86/include/asm/cpufeatures.h arch/x86/kernel/alternative.c arch/x86/kernel/cpu/bugs.c arch/x86/kernel/cpu/common.c drivers/base/cpu.c include/linux/cpu.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/platform' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/msr' into x86/core, to resolve conflictsIngo Molnar
Conflicts: arch/x86/boot/startup/sme.c arch/x86/coco/sev/core.c arch/x86/kernel/fpu/core.c arch/x86/kernel/fpu/xstate.c Semantic conflict: arch/x86/include/asm/sev-internal.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/microcode' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/fpu' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/cpu' into x86/core, to resolve conflictsIngo Molnar
Conflicts: arch/x86/kernel/cpu/bugs.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/boot' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/bugs' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-12x86/mtrr: Check if fixed-range MTRRs exist in mtrr_save_fixed_ranges()Jiaqing Zhao
When suspending, save_processor_state() calls mtrr_save_fixed_ranges() to save fixed-range MTRRs. On platforms without fixed-range MTRRs like the ACRN hypervisor which has removed fixed-range MTRR emulation, accessing these MSRs will trigger an unchecked MSR access error. Make sure fixed-range MTRRs are supported before access to prevent such error. Since mtrr_state.have_fixed is only set when MTRRs are present and enabled, checking the CPU feature flag in mtrr_save_fixed_ranges() is unnecessary. Fixes: 3ebad5905609 ("[PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when suspending") Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250509170633.3411169-2-jiaqing.zhao@linux.intel.com
2025-05-11Merge tag 'its-for-linus-20250509' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 ITS mitigation from Dave Hansen: "Mitigate Indirect Target Selection (ITS) issue. I'd describe this one as a good old CPU bug where the behavior is _obviously_ wrong, but since it just results in bad predictions it wasn't wrong enough to notice. Well, the researchers noticed and also realized that thus bug undermined a bunch of existing indirect branch mitigations. Thus the unusually wide impact on this one. Details: ITS is a bug in some Intel CPUs that affects indirect branches including RETs in the first half of a cacheline. Due to ITS such branches may get wrongly predicted to a target of (direct or indirect) branch that is located in the second half of a cacheline. Researchers at VUSec found this behavior and reported to Intel. Affected processors: - Cascade Lake, Cooper Lake, Whiskey Lake V, Coffee Lake R, Comet Lake, Ice Lake, Tiger Lake and Rocket Lake. Scope of impact: - Guest/host isolation: When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to direct branches in the guest. - Intra-mode using cBPF: cBPF can be used to poison the branch history to exploit ITS. Realigning the indirect branches and RETs mitigates this attack vector. - User/kernel: With eIBRS enabled user/kernel isolation is *not* impacted by ITS. - Indirect Branch Prediction Barrier (IBPB): Due to this bug indirect branches may be predicted with targets corresponding to direct branches which were executed prior to IBPB. This will be fixed in the microcode. Mitigation: As indirect branches in the first half of cacheline are affected, the mitigation is to replace those indirect branches with a call to thunk that is aligned to the second half of the cacheline. RETs that take prediction from RSB are not affected, but they may be affected by RSB-underflow condition. So, RETs in the first half of cacheline are also patched to a return thunk that executes the RET aligned to second half of cacheline" * tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftest/x86/bugs: Add selftests for ITS x86/its: FineIBT-paranoid vs ITS x86/its: Use dynamic thunks for indirect branches x86/ibt: Keep IBT disabled during alternative patching mm/execmem: Unify early execmem_cache behaviour x86/its: Align RETs in BHB clear sequence to avoid thunking x86/its: Add support for RSB stuffing mitigation x86/its: Add "vmexit" option to skip mitigation on some CPUs x86/its: Enable Indirect Target Selection mitigation x86/its: Add support for ITS-safe return thunk x86/its: Add support for ITS-safe indirect thunk x86/its: Enumerate Indirect Target Selection (ITS) bug Documentation: x86/bugs/its: Add ITS documentation
2025-05-11Merge tag 'ibti-hisory-for-linus-2025-05-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 IBTI mitigation from Dave Hansen: "Mitigate Intra-mode Branch History Injection via classic BFP programs This adds the branch history clearing mitigation to cBPF programs for x86. Intra-mode BHI attacks via cBPF a.k.a IBTI-History was reported by researchers at VUSec. For hardware that doesn't support BHI_DIS_S, the recommended mitigation is to run the short software sequence followed by the IBHF instruction after cBPF execution. On hardware that does support BHI_DIS_S, enable BHI_DIS_S and execute the IBHF after cBPF execution. The Indirect Branch History Fence (IBHF) is a new instruction that prevents indirect branch target predictions after the barrier from using branch history from before the barrier while BHI_DIS_S is enabled. On older systems this will map to a NOP. It is recommended to add this fence at the end of the cBPF program to support VM migration. This instruction is required on newer parts with BHI_NO to fully mitigate against these attacks. The current code disables the mitigation for anything running with the SYS_ADMIN capability bit set. The intention was not to waste time mitigating a process that has access to anything it wants anyway" * tag 'ibti-hisory-for-linus-2025-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bhi: Do not set BHI_DIS_S in 32-bit mode x86/bpf: Add IBHF call at end of classic BPF x86/bpf: Call branch history clearing sequence on exit
2025-05-09x86/its: Add support for RSB stuffing mitigationPawan Gupta
When retpoline mitigation is enabled for spectre-v2, enabling call-depth-tracking and RSB stuffing also mitigates ITS. Add cmdline option indirect_target_selection=stuff to allow enabling RSB stuffing mitigation. When retpoline mitigation is not enabled, =stuff option is ignored, and default mitigation for ITS is deployed. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09x86/its: Add "vmexit" option to skip mitigation on some CPUsPawan Gupta
Ice Lake generation CPUs are not affected by guest/host isolation part of ITS. If a user is only concerned about KVM guests, they can now choose a new cmdline option "vmexit" that will not deploy the ITS mitigation when CPU is not affected by guest/host isolation. This saves the performance overhead of ITS mitigation on Ice Lake gen CPUs. When "vmexit" option selected, if the CPU is affected by ITS guest/host isolation, the default ITS mitigation is deployed. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09x86/its: Enable Indirect Target Selection mitigationPawan Gupta
Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with eIBRS. It affects prediction of indirect branch and RETs in the lower half of cacheline. Due to ITS such branches may get wrongly predicted to a target of (direct or indirect) branch that is located in the upper half of the cacheline. Scope of impact =============== Guest/host isolation -------------------- When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to branches in the guest. Intra-mode ---------- cBPF or other native gadgets can be used for intra-mode training and disclosure using ITS. User/kernel isolation --------------------- When eIBRS is enabled user/kernel isolation is not impacted. Indirect Branch Prediction Barrier (IBPB) ----------------------------------------- After an IBPB, indirect branches may be predicted with targets corresponding to direct branches which were executed prior to IBPB. This is mitigated by a microcode update. Add cmdline parameter indirect_target_selection=off|on|force to control the mitigation to relocate the affected branches to an ITS-safe thunk i.e. located in the upper half of cacheline. Also add the sysfs reporting. When retpoline mitigation is deployed, ITS safe-thunks are not needed, because retpoline sequence is already ITS-safe. Similarly, when call depth tracking (CDT) mitigation is deployed (retbleed=stuff), ITS safe return thunk is not used, as CDT prevents RSB-underflow. To not overcomplicate things, ITS mitigation is not supported with spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy lfence;jmp mitigation on ITS affected parts anyways. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09x86/its: Enumerate Indirect Target Selection (ITS) bugPawan Gupta
ITS bug in some pre-Alderlake Intel CPUs may allow indirect branches in the first half of a cache line get predicted to a target of a branch located in the second half of the cache line. Set X86_BUG_ITS on affected CPUs. Mitigation to follow in later commits. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-06x86/bhi: Do not set BHI_DIS_S in 32-bit modePawan Gupta
With the possibility of intra-mode BHI via cBPF, complete mitigation for BHI is to use IBHF (history fence) instruction with BHI_DIS_S set. Since this new instruction is only available in 64-bit mode, setting BHI_DIS_S in 32-bit mode is only a partial mitigation. Do not set BHI_DIS_S in 32-bit mode so as to avoid reporting misleading mitigated status. With this change IBHF won't be used in 32-bit mode, also remove the CONFIG_X86_64 check from emit_spectre_bhb_barrier(). Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>