summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)Author
2024-12-31x86/microcode/AMD: Return bool from find_blobs_in_containers()Nikolay Borisov
Instead of open-coding the check for size/data move it inside the function and make it return a boolean indicating whether data was found or not. No functional changes. [ bp: Write @ret in find_blobs_in_containers() only on success. ] Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20241018155151.702350-2-nik.borisov@suse.com
2024-12-31x86/mce: Remove the redundant mce_hygon_feature_init()Qiuxu Zhuo
Get HYGON to directly call mce_amd_feature_init() and remove the redundant mce_hygon_feature_init(). Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20241212140103.66964-7-qiuxu.zhuo@intel.com
2024-12-31x86/mce: Convert family/model mixed checks to VFM-based checksQiuxu Zhuo
Convert family/model mixed checks to VFM-based checks to make the code more compact. Simplify. [ bp: Drop the "what" from the commit message - it should be visible from the diff alone. ] Suggested-by: Sohil Mehta <sohil.mehta@intel.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20241212140103.66964-6-qiuxu.zhuo@intel.com
2024-12-31x86/mce: Break up __mcheck_cpu_apply_quirks()Tony Luck
Split each vendor specific part into its own helper function. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20241212140103.66964-5-qiuxu.zhuo@intel.com
2024-12-30x86/mce: Make four functions return boolQiuxu Zhuo
Make those functions whose callers only care about success or failure return a boolean value for better readability. Also, update the call sites accordingly as the polarities of all the return values have been flipped. No functional changes. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20241212140103.66964-4-qiuxu.zhuo@intel.com
2024-12-30x86/mce/threshold: Remove the redundant this_cpu_dec_return()Qiuxu Zhuo
The 'storm' variable points to this_cpu_ptr(&storm_desc). Access the 'stormy_bank_count' field through the 'storm' to avoid calling this_cpu_*() on the same per-CPU variable twice. This minor optimization reduces the text size by 16 bytes. $ size threshold.o.* text data bss dec hex filename 1395 1664 0 3059 bf3 threshold.o.old 1379 1664 0 3043 be3 threshold.o.new No functional changes intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20241212140103.66964-3-qiuxu.zhuo@intel.com
2024-12-30x86/mce: Make several functions return boolQiuxu Zhuo
Make several functions that return 0 or 1 return a boolean value for better readability. No functional changes are intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20241212140103.66964-2-qiuxu.zhuo@intel.com
2024-12-30x86/bugs: Add SRSO_USER_KERNEL_NO supportBorislav Petkov (AMD)
If the machine has: CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1, it indicates the CPU is not subject to the SRSO vulnerability across user/kernel boundaries. have it fall back to IBPB on VMEXIT only, in the case it is going to run VMs: Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/20241202120416.6054-2-bp@kernel.org
2024-12-18Merge tag 'hyperv-fixes-signed-20241217' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Various fixes to Hyper-V tools in the kernel tree (Dexuan Cui, Olaf Hering, Vitaly Kuznetsov) - Fix a bug in the Hyper-V TSC page based sched_clock() (Naman Jain) - Two bug fixes in the Hyper-V utility functions (Michael Kelley) - Convert open-coded timeouts to secs_to_jiffies() in Hyper-V drivers (Easwar Hariharan) * tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: tools/hv: reduce resource usage in hv_kvp_daemon tools/hv: add a .gitignore file tools/hv: reduce resouce usage in hv_get_dns_info helper hv/hv_kvp_daemon: Pass NIC name to hv_get_dns_info as well Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet Drivers: hv: util: Don't force error code to ENODEV in util_probe() tools/hv: terminate fcopy daemon if read from uio fails drivers: hv: Convert open-coded timeouts to secs_to_jiffies() tools: hv: change permissions of NetworkManager configuration file x86/hyperv: Fix hv tsc page based sched_clock for hibernation tools: hv: Fix a complier warning in the fcopy uio daemon
2024-12-18x86/cpu: Make all all CPUID leaf names consistentDave Hansen
The leaf names are not consistent. Give them all a CPUID_LEAF_ prefix for consistency and vertical alignment. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Jiang <dave.jiang@intel.com> # for ioatdma bits Link: https://lore.kernel.org/all/20241213205040.7B0C3241%40davehans-spike.ostc.intel.com
2024-12-18x86/fpu: Move CPUID leaf definitions to common codeDave Hansen
Move the XSAVE-related CPUID leaf definitions to common code. Then, use the new definition to remove the last magic number from the CPUID level dependency table. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205037.43C57CDE%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Refresh DCA leaf reading codeDave Hansen
The DCA leaf number is also hard-coded in the CPUID level dependency table. Move its definition to common code and use it. While at it, fix up the naming and types in the probe code. All CPUID data is provided in 32-bit registers, not 'unsigned long'. Also stop referring to "level_9". Move away from test_bit() because the type is no longer an 'unsigned long'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205032.476A30FE%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Use MWAIT leaf definitionDave Hansen
The leaf-to-feature dependency array uses hard-coded leaf numbers. Use the new common header definition for the MWAIT leaf. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205029.5B055D6E%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Remove 'x86_cpu_desc' infrastructureDave Hansen
All the users of 'x86_cpu_desc' are gone. Zap it from the tree. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185133.AF0BF2BC%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id'Dave Hansen
The AMD erratum 1386 detection code uses and old style 'x86_cpu_desc' table. Replace it with 'x86_cpu_id' so the old style can be removed. I did not create a new helper macro here. The new table is certainly more noisy than the old and it can be improved on. But I was hesitant to create a new macro just for a single site that is only two ugly lines in the end. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185132.07555E1D%40davehans-spike.ostc.intel.com
2024-12-17x86/cpu: Expose only stepping min/max interfaceDave Hansen
The x86_match_cpu() infrastructure can match CPU steppings. Since there are only 16 possible steppings, the matching infrastructure goes all out and stores the stepping match as a bitmap. That means it can match any possible steppings in a single list entry. Fun. But it exposes this bitmap to each of the X86_MATCH_*() helpers when none of them really need a bitmap. It makes up for this by exporting a helper (X86_STEPPINGS()) which converts a contiguous stepping range into the bitmap which every single user leverages. Instead of a bitmap, have the main helper for this sort of thing (X86_MATCH_VFM_STEPS()) just take a stepping range. This ends up actually being even more compact than before. Leave the helper in place (renamed to __X86_STEPPINGS()) to make it more clear what is going on instead of just having a random GENMASK() in the middle of an already complicated macro. One oddity that I hit was this macro: X86_MATCH_VFM_STEPS(vfm, X86_STEPPING_MIN, max_stepping, issues) It *could* have been converted over to take a min/max stepping value for each entry. But that would have been a bit too verbose and would prevent the one oddball in the list (INTEL_COMETLAKE_L stepping 0) from sticking out. Instead, just have it take a *maximum* stepping and imply that the match is from 0=>max_stepping. This is functional for all the cases now and also retains the nice property of having INTEL_COMETLAKE_L stepping 0 stick out like a sore thumb. skx_cpuids[] is goofy. It uses the stepping match but encodes all possible steppings. Just use a normal, non-stepping match helper. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185129.65527B2A%40davehans-spike.ostc.intel.com
2024-12-17x86/cpu: Introduce new microcode matching helperDave Hansen
The 'x86_cpu_id' and 'x86_cpu_desc' structures are very similar and need to be consolidated. There is a microcode version matching function for 'x86_cpu_desc' but not 'x86_cpu_id'. Create one for 'x86_cpu_id'. This essentially just leverages the x86_cpu_id->driver_data field to replace the less generic x86_cpu_desc->x86_microcode_rev field. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185128.8F24EEFC%40davehans-spike.ostc.intel.com
2024-12-14x86/sev: Require the RMPREAD instruction after Zen4Tom Lendacky
Limit usage of the non-architectural RMP format to Zen3/Zen4 processors. The RMPREAD instruction, with architectural defined output, is available and should be used for RMP access beyond Zen4. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Reviewed-by: Ashish Kalra <ashish.kalra@amd.com> Link: https://lore.kernel.org/r/5be0093e091778a151266ea853352f62f838eb99.1733172653.git.thomas.lendacky@amd.com
2024-12-13x86: make get_cpu_vendor() accessible from Xen codeJuergen Gross
In order to be able to differentiate between AMD and Intel based systems for very early hypercalls without having to rely on the Xen hypercall page, make get_cpu_vendor() non-static. Refactor early_cpu_init() for the same reason by splitting out the loop initializing cpu_devs() into an externally callable function. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2024-12-12x86/resctrl: Add write option to "mba_MBps_event" fileTony Luck
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage of maximum possible, the user provides a MiB/s limit value. There is a file in each CTRL_MON group directory that shows the event currently in use. Allow writing that file to choose a different event. A user can choose any of the memory bandwidth monitoring events listed in /sys/fs/resctrl/info/L3_mon/mon_features independently for each CTRL_MON group by writing to each of the "mba_MBps_event" files. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-8-tony.luck@intel.com
2024-12-12x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directoriesTony Luck
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage of maximum possible, the user provides a MiB/s limit value. In preparation to allow the user to pick the memory bandwidth monitoring event used as input to the feedback loop, provide a file in each CTRL_MON group directory that shows the event currently in use. Note that this file is only visible when the "mba_MBps" mount option is in use. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-7-tony.luck@intel.com
2024-12-10x86/cpu: Fix typo in x86_match_cpu()'s docRaag Jadav
Fix typo in x86_match_cpu()'s description. [ bp: Massage commit message. ] Signed-off-by: Raag Jadav <raag.jadav@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20241030065804.407793-1-raag.jadav@intel.com
2024-12-10Merge branch 'linus' into x86/cleanups, to resolve conflictIngo Molnar
These two commits interact: upstream: 73da582a476e ("x86/cpu/topology: Remove limit of CPUs due to disabled IO/APIC") x86/cleanups: 13148e22c151 ("x86/apic: Remove "disablelapic" cmdline option") Resolve it. Conflicts: arch/x86/kernel/cpu/topology.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-12-10x86/apic: Remove "disablelapic" cmdline optionBorislav Petkov (AMD)
The convention is "no<something>" and there already is "nolapic". Drop the disable one. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20241202190011.11979-2-bp@kernel.org
2024-12-10x86/resctrl: Make mba_sc use total bandwidth if local is not supportedTony Luck
The default input measurement to the mba_sc feedback loop for memory bandwidth control when the user mounts with the "mba_MBps" option is the local bandwidth event. But some systems may not support a local bandwidth event. When local bandwidth event is not supported, check for support of total bandwidth and use that instead. Relax the mount option check to allow use of the "mba_MBps" option for systems when only total bandwidth monitoring is supported. Also update the error message. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-6-tony.luck@intel.com
2024-12-10x86/resctrl: Compute memory bandwidth for all supported eventsTony Luck
Switching between local and total memory bandwidth events as the input to the mba_sc feedback loop would be cumbersome and take effect slowly in the current implementation as the bandwidth is only known after two consecutive readings of the same event. Compute the bandwidth for all supported events. This doesn't add significant overhead and will make changing which event is used simple. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-5-tony.luck@intel.com
2024-12-10x86/resctrl: Modify update_mba_bw() to use per CTRL_MON group eventTony Luck
update_mba_bw() hard codes use of the memory bandwidth local event which prevents more flexible options from being deployed. Change this function to use the event specified in the rdtgroup that is being processed. Mount time checks for the "mba_MBps" option ensure that local memory bandwidth is enabled. So drop the redundant is_mbm_local_enabled() check. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-4-tony.luck@intel.com
2024-12-10x86/resctrl: Prepare for per-CTRL_MON group mba_MBps controlTony Luck
Resctrl uses local memory bandwidth event as input to the feedback loop when the mba_MBps mount option is used. This means that this mount option cannot be used on systems that only support monitoring of total bandwidth. Prepare to allow users to choose the input event independently for each CTRL_MON group by adding a global variable "mba_mbps_default_event" used to set the default event for each CTRL_MON group, and a new field "mba_mbps_event" in struct rdtgroup to track which event is used for each CTRL_MON group. Notes: 1) Both of these are only used when the user mounts the filesystem with the "mba_MBps" option. 2) Only check for support of local bandwidth event when initializing mba_mbps_default_event. Support for total bandwidth event can be added after other routines in resctrl have been updated to handle total bandwidth event. [ bp: Move mba_mbps_default_event extern into the arch header. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-3-tony.luck@intel.com
2024-12-09x86/resctrl: Introduce resctrl_file_fflags_init() to initialize fflagsBabu Moger
thread_throttle_mode_init() and mbm_config_rftype_init() both initialize fflags for resctrl files. Adding new files will involve adding another function to initialize the fflags. This can be simplified by adding a new function resctrl_file_fflags_init() and passing the file name and flags to be initialized. Consolidate fflags initialization into resctrl_file_fflags_init() and remove thread_throttle_mode_init() and mbm_config_rftype_init(). [ Tony: Drop __init attribute so resctrl_file_fflags_init() can be used at run time. ] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20241206163148.83828-2-tony.luck@intel.com
2024-12-09x86/resctrl: Use kthread_run_on_cpu()Frederic Weisbecker
Use the proper API instead of open coding it. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20240807160228.26206-3-frederic@kernel.org
2024-12-09x86/hyperv: Fix hv tsc page based sched_clock for hibernationNaman Jain
read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds). Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible. Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state(). Cc: stable@vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240917053917.76787-1-namjain@linux.microsoft.com>
2024-12-06x86/CPU/AMD: WARN when setting EFER.AUTOIBRS if and only if the WRMSR failsSean Christopherson
When ensuring EFER.AUTOIBRS is set, WARN only on a negative return code from msr_set_bit(), as '1' is used to indicate the WRMSR was successful ('0' indicates the MSR bit was already set). Fixes: 8cc68c9c9e92 ("x86/CPU/AMD: Make sure EFER[AIBRSE] is set") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/Z1MkNofJjt7Oq0G6@google.com Closes: https://lore.kernel.org/all/20241205220604.GA2054199@thelio-3990X
2024-12-06x86/cacheinfo: Delete global num_cache_leavesRicardo Neri
Linux remembers cpu_cachinfo::num_leaves per CPU, but x86 initializes all CPUs from the same global "num_cache_leaves". This is erroneous on systems such as Meteor Lake, where each CPU has a distinct num_leaves value. Delete the global "num_cache_leaves" and initialize num_leaves on each CPU. init_cache_level() no longer needs to set num_leaves. Also, it never had to set num_levels as it is unnecessary in x86. Keep checking for zero cache leaves. Such condition indicates a bug. [ bp: Cleanup. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org # 6.3+ Link: https://lore.kernel.org/r/20241128002247.26726-3-ricardo.neri-calderon@linux.intel.com
2024-12-06x86/paravirt: Remove the WBINVD callbackJuergen Gross
The pv_ops::cpu.wbinvd paravirt callback is a leftover of lguest times. Today it is no longer needed, as all users use the native WBINVD implementation. Remove the callback and rename native_wbinvd() to wbinvd(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241203071550.26487-1-jgross@suse.com
2024-12-06x86/cpufeatures: Free up unused feature bitsSohil Mehta
Linux defined feature bits X86_FEATURE_P3 and X86_FEATURE_P4 are not used anywhere. Commit f31d731e4467 ("x86: use X86_FEATURE_NOPL in alternatives") got rid of the last usage in 2008. Remove the related mappings and code. Just like all X86_FEATURE bits, the raw bit numbers can be exposed to userspace via MODULE_DEVICE_TABLE(). There is a very small theoretical chance of userspace getting confused if these bits got reassigned and changed logical meaning. But these bits were never used for a device table, so it's highly unlikely this will ever happen in practice. [ dhansen: clarify userspace visibility of these bits ] Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/20241107233000.2742619-1-sohil.mehta%40intel.com
2024-12-05x86/cpu/topology: Remove limit of CPUs due to disabled IO/APICFernando Fernandez Mancera
The rework of possible CPUs management erroneously disabled SMP when the IO/APIC is disabled either by the 'noapic' command line parameter or during IO/APIC setup. SMP is possible without IO/APIC. Remove the ioapic_is_disabled conditions from the relevant possible CPU management code paths to restore the orgininal behaviour. Fixes: 7c0edad3643f ("x86/cpu/topology: Rework possible CPU management") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241202145905.1482-1-ffmancera@riseup.net
2024-12-04x86/cpu: Add Lunar Lake to list of CPUs with a broken MONITOR implementationLen Brown
Under some conditions, MONITOR wakeups on Lunar Lake processors can be lost, resulting in significant user-visible delays. Add Lunar Lake to X86_BUG_MONITOR so that wake_up_idle_cpu() always sends an IPI, avoiding this potential delay. Reported originally here: https://bugzilla.kernel.org/show_bug.cgi?id=219364 [ dhansen: tweak subject ] Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/a4aa8842a3c3bfdb7fe9807710eef159cbf0e705.1731463305.git.len.brown%40intel.com
2024-12-04x86/mtrr: Rename mtrr_overwrite_state() to guest_force_mtrr_state()Kirill A. Shutemov
Rename the helper to better reflect its function. Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lore.kernel.org/all/20241202073139.448208-1-kirill.shutemov%40linux.intel.com
2024-12-02x86/topology: Introduce topology_logical_core_id()K Prateek Nayak
On x86, topology_core_id() returns a unique core ID within the PKG domain. Looking at match_smt() suggests that a core ID just needs to be unique within a LLC domain. For use cases such as the core RAPL PMU, there exists a need for a unique core ID across the entire system with multiple PKG domains. Introduce topology_logical_core_id() to derive a unique core ID across the system. Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: "Gautham R. Shenoy" <gautham.shenoy@amd.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Link: https://lore.kernel.org/r/20241115060805.447565-3-Dhananjay.Ugwekar@amd.com
2024-12-01Merge tag 'x86_urgent_for_v6.13_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a terminating zero end-element to the array describing AMD CPUs affected by erratum 1386 so that the matching loop actually terminates instead of going off into the weeds - Update the boot protocol documentation to mention the fact that the preferred address to load the kernel to is considered in the relocatable kernel case too - Flush the memory buffer containing the microcode patch after applying microcode on AMD Zen1 and Zen2, to avoid unnecessary slowdowns - Make sure the PPIN CPU feature flag is cleared on all CPUs if PPIN has been disabled * tag 'x86_urgent_for_v6.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Terminate the erratum_1386_microcode array x86/Documentation: Update algo in init_size description of boot protocol x86/microcode/AMD: Flush patch buffer mapping after application x86/mm: Carve out INVLPG inline asm for use by others x86/cpu: Fix PPIN initialization
2024-11-26x86/CPU/AMD: Terminate the erratum_1386_microcode arraySebastian Andrzej Siewior
The erratum_1386_microcode array requires an empty entry at the end. Otherwise x86_match_cpu_with_stepping() will continue iterate the array after it ended. Add an empty entry to erratum_1386_microcode to its end. Fixes: 29ba89f189528 ("x86/CPU/AMD: Improve the erratum 1386 workaround") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20241126134722.480975-1-bigeasy@linutronix.de
2024-11-25x86: fix off-by-one in access_ok()David Laight
When the size isn't a small constant, __access_ok() will call valid_user_address() with the address after the last byte of the user buffer. It is valid for a buffer to end with the last valid user address so valid_user_address() must allow accesses to the base of the guard page. [ This introduces an off-by-one in the other direction for the plain non-sized accesses, but since we have that guard region that is a whole page, those checks "allowing" accesses to that guard region don't really matter. The access will fault anyway, whether to the guard page or if the address has been masked to all ones - Linus ] Fixes: 86e6b1547b3d0 ("x86: fix user address masking non-canonical speculation issue") Signed-off-by: David Laight <david.laight@aculab.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-25x86/microcode/AMD: Flush patch buffer mapping after applicationBorislav Petkov (AMD)
Due to specific requirements while applying microcode patches on Zen1 and 2, the patch buffer mapping needs to be flushed from the TLB after application. Do so. If not, unnecessary and unnatural delays happen in the boot process. Reported-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com> Cc: <stable@kernel.org> # f1d84b59cbb9 ("x86/mm: Carve out INVLPG inline asm for use by others") Link: https://lore.kernel.org/r/ZyulbYuvrkshfsd2@antipodes
2024-11-25x86/cpu: Fix PPIN initializationTony Luck
On systems that enumerate PPIN (protected processor inventory number) using CPUID, but where the BIOS locked the MSR to prevent access /proc/cpuinfo reports "intel_ppin" feature as present on all logical CPUs except for CPU 0. This happens because ppin_init() uses x86_match_cpu() to determine whether PPIN is supported. When called on CPU 0 the test for locked PPIN MSR results in: clear_cpu_cap(c, info->feature); This clears the X86 FEATURE bit in boot_cpu_data. When other CPUs are brought online the x86_match_cpu() fails, and the PPIN FEATURE bit remains set for those other CPUs. Fix by using setup_clear_cpu_cap() instead of clear_cpu_cap() which force clears the FEATURE bit for all CPUS. Reported-by: Adeel Ashad <adeel.arshad@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241122234212.27451-1-tony.luck@intel.com
2024-11-22Merge tag 'x86_misc_for_6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Dave Hansen: "As usual for this branch, these are super random: a compile fix for some newish LLVM checks and making sure a Kconfig text reference to 'RSB' matches the normal definition: - Rework some CPU setup code to keep LLVM happy on 32-bit - Correct RSB terminology in Kconfig text" * tag 'x86_misc_for_6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Make sure flag_is_changeable_p() is always being used x86/bugs: Correct RSB terminology in Kconfig
2024-11-22Merge tag 'x86_sgx_for_6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sgx update from Dave Hansen: - Use vmalloc_array() instead of vmalloc() * tag 'x86_sgx_for_6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sgx: Use vmalloc_array() instead of vmalloc()
2024-11-19Merge tag 'x86-cleanups-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: - x86/boot: Remove unused function atou() (Dr. David Alan Gilbert) - x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() (Thorsten Blum) - x86/platform: Switch back to struct platform_driver::remove() (Uwe Kleine-König) * tag 'x86-cleanups-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Remove unused function atou() x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() x86/platform: Switch back to struct platform_driver::remove()
2024-11-19Merge tag 'x86-splitlock-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 splitlock updates from Ingo Molnar: - Move Split and Bus lock code to a dedicated file (Ravi Bangoria) - Add split/bus lock support for AMD (Ravi Bangoria) * tag 'x86-splitlock-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bus_lock: Add support for AMD x86/split_lock: Move Split and Bus lock code to a dedicated file
2024-11-19Merge tag 'perf-core-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Uprobes: - Add BPF session support (Jiri Olsa) - Switch to RCU Tasks Trace flavor for better performance (Andrii Nakryiko) - Massively increase uretprobe SMP scalability by SRCU-protecting the uretprobe lifetime (Andrii Nakryiko) - Kill xol_area->slot_count (Oleg Nesterov) Core facilities: - Implement targeted high-frequency profiling by adding the ability for an event to "pause" or "resume" AUX area tracing (Adrian Hunter) VM profiling/sampling: - Correct perf sampling with guest VMs (Colton Lewis) New hardware support: - x86/intel: Add PMU support for Intel ArrowLake-H CPUs (Dapeng Mi) Misc fixes and enhancements: - x86/intel/pt: Fix buffer full but size is 0 case (Adrian Hunter) - x86/amd: Warn only on new bits set (Breno Leitao) - x86/amd/uncore: Avoid a false positive warning about snprintf truncation in amd_uncore_umc_ctx_init (Jean Delvare) - uprobes: Re-order struct uprobe_task to save some space (Christophe JAILLET) - x86/rapl: Move the pmu allocation out of CPU hotplug (Kan Liang) - x86/rapl: Clean up cpumask and hotplug (Kan Liang) - uprobes: Deuglify xol_get_insn_slot/xol_free_insn_slot paths (Oleg Nesterov)" * tag 'perf-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) perf/core: Correct perf sampling with guest VMs perf/x86: Refactor misc flag assignments perf/powerpc: Use perf_arch_instruction_pointer() perf/core: Hoist perf_instruction_pointer() and perf_misc_flags() perf/arm: Drop unused functions uprobes: Re-order struct uprobe_task to save some space perf/x86/amd/uncore: Avoid a false positive warning about snprintf truncation in amd_uncore_umc_ctx_init perf/x86/intel: Do not enable large PEBS for events with aux actions or aux sampling perf/x86/intel/pt: Add support for pause / resume perf/core: Add aux_pause, aux_resume, aux_start_paused perf/x86/intel/pt: Fix buffer full but size is 0 case uprobes: SRCU-protect uretprobe lifetime (with timeout) uprobes: allow put_uprobe() from non-sleepable softirq context perf/x86/rapl: Clean up cpumask and hotplug perf/x86/rapl: Move the pmu allocation out of CPU hotplug uprobe: Add support for session consumer uprobe: Add data pointer to consumer handlers perf/x86/amd: Warn only on new bits set uprobes: fold xol_take_insn_slot() into xol_get_insn_slot() uprobes: kill xol_area->slot_count ...
2024-11-19Merge tag 'x86_cpu_for_v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Add a feature flag which denotes AMD CPUs supporting workload classification with the purpose of using such hints when making scheduling decisions - Determine the boost enumerator for each AMD core based on its type: efficiency or performance, in the cppc driver - Add the type of a CPU to the topology CPU descriptor with the goal of supporting and making decisions based on the type of the respective core - Add a feature flag to denote AMD cores which have heterogeneous topology and enable SD_ASYM_PACKING for those - Check microcode revisions before disabling PCID on Intel - Cleanups and fixlets * tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Remove redundant CONFIG_NUMA guard around numa_add_cpu() x86/cpu: Fix FAM5_QUARK_X1000 to use X86_MATCH_VFM() x86/cpu: Fix formatting of cpuid_bits[] in scattered.c x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit x86/amd: Use heterogeneous core topology for identifying boost numerator x86/cpu: Add CPU type to struct cpuinfo_topology x86/cpu: Enable SD_ASYM_PACKING for PKG domain on AMD x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix x86/mm: Don't disable PCID when INVLPG has been fixed by microcode