summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/fpu
AgeCommit message (Collapse)Author
2021-06-23x86/fpu: Rename and sanitize fpu__save/copy()Thomas Gleixner
Both function names are a misnomer. fpu__save() is actually about synchronizing the hardware register state into the task's memory state so that either coredump or a math exception handler can inspect the state at the time where the problem happens. The function guarantees to preserve the register state, while "save" is a common terminology for saving the current state so it can be modified and restored later. This is clearly not the case here. Rename it to fpu_sync_fpstate(). fpu__copy() is used to clone the current task's FPU state when duplicating task_struct. While the register state is a copy the rest of the FPU state is not. Name it accordingly and remove the really pointless @src argument along with the warning which comes along with it. Nothing can ever copy the FPU state of a non-current task. It's clearly just a consequence of arch_dup_task_struct(), but it makes no sense to proliferate that further. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121455.196727450@linutronix.de
2021-06-23x86/fpu/xstate: Sanitize handling of independent featuresThomas Gleixner
The copy functions for the independent features are horribly named and the supervisor and independent part is just overengineered. The point is that the supplied mask has either to be a subset of the independent features or a subset of the task->fpu.xstate managed features. Rewrite it so it checks for invalid overlaps of these areas in the caller supplied feature mask. Rename it so it follows the new naming convention for these operations. Mop up the function documentation. This allows to use that function for other purposes as well. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210623121455.004880675@linutronix.de
2021-06-23x86/fpu: Rename "dynamic" XSTATEs to "independent"Andy Lutomirski
The salient feature of "dynamic" XSTATEs is that they are not part of the main task XSTATE buffer. The fact that they are dynamically allocated is irrelevant and will become quite confusing when user math XSTATEs start being dynamically allocated. Rename them to "independent" because they are independent of the main XSTATE code. This is just a search-and-replace with some whitespace updates to keep things aligned. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/1eecb0e4f3e07828ebe5d737ec77dc3b708fad2d.1623388344.git.luto@kernel.org Link: https://lkml.kernel.org/r/20210623121454.911450390@linutronix.de
2021-06-23x86/fpu: Rename initstate copy functionsThomas Gleixner
Again this not a copy. It's restoring register state from kernel memory. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121454.816581630@linutronix.de
2021-06-23x86/fpu: Get rid of the FNSAVE optimizationThomas Gleixner
The FNSAVE support requires conditionals in quite some call paths because FNSAVE reinitializes the FPU hardware. If the save has to preserve the FPU register state then the caller has to conditionally restore it from memory when FNSAVE is in use. This also requires a conditional in context switch because the restore avoidance optimization cannot work with FNSAVE. As this only affects 20+ years old CPUs there is really no reason to keep this optimization effective for FNSAVE. It's about time to not optimize for antiques anymore. Just unconditionally FRSTOR the save content to the registers and clean up the conditionals all over the place. Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121454.617369268@linutronix.de
2021-06-23x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()Thomas Gleixner
A copy is guaranteed to leave the source intact, which is not the case when FNSAVE is used as that reinitilizes the registers. Save does not make such guarantees and it matches what this is about, i.e. to save the state for a later restore. Rename it to save_fpregs_to_fpstate(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121454.508853062@linutronix.de
2021-06-23x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()Thomas Gleixner
copy_uabi_from_user_to_xstate() and copy_uabi_from_kernel_to_xstate() are almost identical except for the copy function. Unify them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20210623121454.414215896@linutronix.de
2021-06-23x86/fpu: Rename xstate copy functions which are related to UABIThomas Gleixner
Rename them to reflect that these functions deal with user space format XSAVE buffers. copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate() copy_user_to_xstate() -> copy_sigframe_from_user_to_xstate() Again a clear statement that these functions deal with user space ABI. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121454.318485015@linutronix.de
2021-06-23x86/fpu: Rename fregs-related copy functionsThomas Gleixner
The function names for fnsave/fnrstor operations are horribly named and a permanent source of confusion. Rename: copy_kernel_to_fregs() to frstor() copy_fregs_to_user() to fnsave_to_user_sigframe() copy_user_to_fregs() to frstor_from_user_sigframe() so it's clear what these are doing. All these functions are really low level wrappers around the equally named instructions, so mapping to the documentation is just natural. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121454.223594101@linutronix.de
2021-06-23x86/fpu: Rename fxregs-related copy functionsThomas Gleixner
The function names for fxsave/fxrstor operations are horribly named and a permanent source of confusion. Rename: copy_fxregs_to_kernel() to fxsave() copy_kernel_to_fxregs() to fxrstor() copy_fxregs_to_user() to fxsave_to_user_sigframe() copy_user_to_fxregs() to fxrstor_from_user_sigframe() so it's clear what these are doing. All these functions are really low level wrappers around the equally named instructions, so mapping to the documentation is just natural. While at it, replace the static_cpu_has(X86_FEATURE_FXSR) with use_fxsr() to be consistent with the rest of the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121454.017863494@linutronix.de
2021-06-23x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()Thomas Gleixner
The function names for xsave[s]/xrstor[s] operations are horribly named and a permanent source of confusion. Rename: copy_xregs_to_user() to xsave_to_user_sigframe() copy_user_to_xregs() to xrstor_from_user_sigframe() so it's entirely clear what this is about. This is also a clear indicator of the potentially different storage format because this is user ABI and cannot use compacted format. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.924266705@linutronix.de
2021-06-23x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()Thomas Gleixner
The function names for xsave[s]/xrstor[s] operations are horribly named and a permanent source of confusion. Rename: copy_xregs_to_kernel() to os_xsave() copy_kernel_to_xregs() to os_xrstor() These are truly low level wrappers around the actual instructions XSAVE[OPT]/XRSTOR and XSAVES/XRSTORS with the twist that the selection based on the available CPU features happens with an alternative to avoid conditionals all over the place and to provide the best performance for hot paths. The os_ prefix tells that this is the OS selected mechanism. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.830239347@linutronix.de
2021-06-23x86/fpu: Get rid of copy_supervisor_to_kernel()Thomas Gleixner
If the fast path of restoring the FPU state on sigreturn fails or is not taken and the current task's FPU is active then the FPU has to be deactivated for the slow path to allow a safe update of the tasks FPU memory state. With supervisor states enabled, this requires to save the supervisor state in the memory state first. Supervisor states require XSAVES so saving only the supervisor state requires to reshuffle the memory buffer because XSAVES uses the compacted format and therefore stores the supervisor states at the beginning of the memory state. That's just an overengineered optimization. Get rid of it and save the full state for this case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.734561971@linutronix.de
2021-06-23x86/fpu: Cleanup arch_set_user_pkey_access()Thomas Gleixner
The function does a sanity check with a WARN_ON_ONCE() but happily proceeds when the pkey argument is out of range. Clean it up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.635764326@linutronix.de
2021-06-23x86/fpu: Get rid of using_compacted_format()Thomas Gleixner
This function is pointlessly global and a complete misnomer because it's usage is related to both supervisor state checks and compacted format checks. Remove it and just make the conditions check the XSAVES feature. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.425493349@linutronix.de
2021-06-23x86/fpu: Move fpu__write_begin() to regsetThomas Gleixner
The only usecase for fpu__write_begin is the set() callback of regset, so the function is pointlessly global. Move it to the regset code and rename it to fpu_force_restore() which is exactly decribing what the function does. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.328652975@linutronix.de
2021-06-23x86/fpu/regset: Move fpu__read_begin() into regsetThomas Gleixner
The function can only be used from the regset get() callbacks safely. So there is no reason to have it globally exposed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.234942936@linutronix.de
2021-06-23x86/fpu: Remove fpstate_sanitize_xstate()Thomas Gleixner
No more users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.124819167@linutronix.de
2021-06-23x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get()Thomas Gleixner
Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the FX state when XSAVE* is in use. This avoids to overwrite the FPU state buffer with fpstate_sanitize_xstate() which is error prone and duplicated code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.014441775@linutronix.de
2021-06-23x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get()Thomas Gleixner
Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the FX state when XSAVE* is in use. This avoids overwriting the FPU state buffer with fpstate_sanitize_xstate() which is error prone and duplicated code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.901736860@linutronix.de
2021-06-23x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()Thomas Gleixner
When xsave with init state optimization is used then a component's state in the task's xsave buffer can be stale when the corresponding feature bit is not set. fpregs_get() and xfpregs_get() invoke fpstate_sanitize_xstate() to update the task's xsave buffer before retrieving the FX or FP state. That's just duplicated code as copy_xstate_to_kernel() already handles this correctly. Add a copy mode argument to the function which allows to restrict the state copy to the FP and SSE features. Also rename the function to copy_xstate_to_uabi_buf() so the name reflects what it is doing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.805327286@linutronix.de
2021-06-23x86/fpu: Clean up fpregs_set()Andy Lutomirski
fpregs_set() has unnecessary complexity to support short or nonzero-offset writes and to handle the case in which a copy from userspace overwrites some of the target buffer and then fails. Support for partial writes is useless -- just require that the write has offset 0 and the correct size, and copy into a temporary kernel buffer to avoid clobbering the state if the user access fails. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.710467587@linutronix.de
2021-06-23x86/fpu: Fail ptrace() requests that try to set invalid MXCSR valuesAndy Lutomirski
There is no benefit from accepting and silently changing an invalid MXCSR value supplied via ptrace(). Instead, return -EINVAL on invalid input. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.613614842@linutronix.de
2021-06-23x86/fpu: Rewrite xfpregs_set()Andy Lutomirski
xfpregs_set() was incomprehensible. Almost all of the complexity was due to trying to support nonsensically sized writes or -EFAULT errors that would have partially or completely overwritten the destination before failing. Nonsensically sized input would only have been possible using PTRACE_SETREGSET on REGSET_XFP. Fortunately, it appears (based on Debian code search results) that no one uses that API at all, let alone with the wrong sized buffer. Failed user access can be handled more cleanly by first copying to kernel memory. Just rewrite it to require sensible input. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.504234607@linutronix.de
2021-06-23x86/fpu: Simplify PTRACE_GETREGS codeDave Hansen
ptrace() has interfaces that let a ptracer inspect a ptracee's register state. This includes XSAVE state. The ptrace() ABI includes a hardware-format XSAVE buffer for both the SETREGS and GETREGS interfaces. In the old days, the kernel buffer and the ptrace() ABI buffer were the same boring non-compacted format. But, since the advent of supervisor states and the compacted format, the kernel buffer has diverged from the format presented in the ABI. This leads to two paths in the kernel: 1. Effectively a verbatim copy_to_user() which just copies the kernel buffer out to userspace. This is used when the kernel buffer is kept in the non-compacted form which means that it shares a format with the ptrace ABI. 2. A one-state-at-a-time path: copy_xstate_to_kernel(). This is theoretically slower since it does a bunch of piecemeal copies. Remove the verbatim copy case. Speed probably does not matter in this path, and the vast majority of new hardware will use the one-state-at-a-time path anyway. This ensures greater testing for the "slow" path. This also makes enabling PKRU in this interface easier since a single path can be patched instead of two. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.408457100@linutronix.de
2021-06-23x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()Thomas Gleixner
Instead of masking out reserved bits, check them and reject the provided state as invalid if not zero. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.308388343@linutronix.de
2021-06-23x86/fpu: Sanitize xstateregs_set()Thomas Gleixner
xstateregs_set() operates on a stopped task and tries to copy the provided buffer into the task's fpu.state.xsave buffer. Any error while copying or invalid state detected after copying results in wiping the target task's FPU state completely including supervisor states. That's just wrong. The caller supplied invalid data or has a problem with unmapped memory, so there is absolutely no justification to corrupt the target state. Fix this with the following modifications: 1) If data has to be copied from userspace, allocate a buffer and copy from user first. 2) Use copy_kernel_to_xstate() unconditionally so that header checking works correctly. 3) Return on error without corrupting the target state. This prevents corrupting states and lets the caller deal with the problem it caused in the first place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.214903673@linutronix.de
2021-06-23x86/fpu: Limit xstate copy size in xstateregs_set()Thomas Gleixner
If the count argument is larger than the xstate size, this will happily copy beyond the end of xstate. Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.120741557@linutronix.de
2021-06-23x86/fpu: Move inlines where they belongThomas Gleixner
They are only used in fpstate_init() and there is no point to have them in a header just to make reading the code harder. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121452.023118522@linutronix.de
2021-06-23x86/fpu: Remove unused get_xsave_field_ptr()Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121451.915614415@linutronix.de
2021-06-23x86/fpu: Get rid of fpu__get_supported_xfeatures_mask()Thomas Gleixner
This function is really not doing what the comment advertises: "Find supported xfeatures based on cpu features and command-line input. This must be called after fpu__init_parse_early_param() is called and xfeatures_mask is enumerated." fpu__init_parse_early_param() does not exist anymore and the function just returns a constant. Remove it and fix the caller and get rid of further references to fpu__init_parse_early_param(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121451.816404717@linutronix.de
2021-06-23x86/fpu: Make xfeatures_mask_all __ro_after_initThomas Gleixner
Nothing has to modify this after init. But of course there is code which unconditionally masks xfeatures_mask_all on CPU hotplug. This goes unnoticed during boot hotplug because at that point the variable is still RW mapped. This is broken in several ways: 1) Masking this in post init CPU hotplug means that any modification of this state goes unnoticed until actual hotplug happens. 2) If that ever happens then these bogus feature bits are already populated all over the place and the system is in inconsistent state vs. the compacted XSTATE offsets. If at all then this has to panic the machine because the inconsistency cannot be undone anymore. Make this a one-time paranoia check in xstate init code and disable xsave when this happens. Reported-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121451.712803952@linutronix.de
2021-06-23x86/fpu: Mark various FPU state variables __ro_after_initThomas Gleixner
Nothing modifies these after booting. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20210623121451.611751529@linutronix.de
2021-06-23x86/fpu: Fix copy_xstate_to_kernel() gap handlingThomas Gleixner
The gap handling in copy_xstate_to_kernel() is wrong when XSAVES is in use. Using init_fpstate for copying the init state of features which are not set in the xstate header is only correct for the legacy area, but not for the extended features area because when XSAVES is in use then init_fpstate is in compacted form which means the xstate offsets which are used to copy from init_fpstate are not valid. Fortunately, this is not a real problem today because all extended features in use have an all-zeros init state, but it is wrong nevertheless and with a potentially dynamically sized init_fpstate this would result in an access outside of the init_fpstate. Fix this by keeping track of the last copied state in the target buffer and explicitly zero it when there is a feature or alignment gap. Use the compacted offset when accessing the extended feature space in init_fpstate. As this is not a functional issue on older kernels this is intentionally not tagged for stable. Fixes: b8be15d58806 ("x86/fpu/xstate: Re-enable XSAVES") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121451.294282032@linutronix.de
2021-06-23Merge x86/urgent into x86/fpuBorislav Petkov
Pick up dependent changes which either went mainline (x86/urgent is based on -rc7 and that contains them) as urgent fixes and the current x86/urgent branch which contains two more urgent fixes, so that the bigger FPU rework can base off ontop. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-06-22x86/fpu: Make init_fpstate correct with optimized XSAVEThomas Gleixner
The XSAVE init code initializes all enabled and supported components with XRSTOR(S) to init state. Then it XSAVEs the state of the components back into init_fpstate which is used in several places to fill in the init state of components. This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because those use the init optimization and skip writing state of components which are in init state. So init_fpstate.xsave still contains all zeroes after this operation. There are two ways to solve that: 1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when XSAVES is enabled because XSAVES uses compacted format. 2) Save the components which are known to have a non-zero init state by other means. Looking deeper, #2 is the right thing to do because all components the kernel supports have all-zeroes init state except the legacy features (FP, SSE). Those cannot be hard coded because the states are not identical on all CPUs, but they can be saved with FXSAVE which avoids all conditionals. Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with a BUILD_BUG_ON() which reminds developers to validate that a newly added component has all zeroes init state. As a bonus remove the now unused copy_xregs_to_kernel_booting() crutch. The XSAVE and reshuffle method can still be implemented in the unlikely case that components are added which have a non-zero init state and no other means to save them. For now, FXSAVE is just simple and good enough. [ bp: Fix a typo or two in the text. ] Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.587311343@linutronix.de
2021-06-22x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()Thomas Gleixner
sanitize_restored_user_xstate() preserves the supervisor states only when the fx_only argument is zero, which allows unprivileged user space to put supervisor states back into init state. Preserve them unconditionally. [ bp: Fix a typo or two in the text. ] Fixes: 5d6b6a6f9b5c ("x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.438635017@linutronix.de
2021-06-10x86/fpu: Reset state for all signal restore failuresThomas Gleixner
If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the function just returns but does not clear the FPU state as it does for all other fatal failures. Clear the FPU state for these failures as well. Fixes: 72a671ced66d ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87mtryyhhz.ffs@nanos.tec.linutronix.de
2021-06-09x86/fpu: Add address range checks to copy_user_to_xstate()Andy Lutomirski
copy_user_to_xstate() uses __copy_from_user(), which provides a negligible speedup. Fortunately, both call sites are at least almost correct. __fpu__restore_sig() checks access_ok() with xstate_sigframe_size() length and ptrace regset access uses fpu_user_xstate_size. These should be valid upper bounds on the length, so, at worst, this would cause spurious failures and not accesses to kernel memory. Nonetheless, this is far more fragile than necessary and none of these callers are in a hotpath. Use copy_from_user() instead. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Link: https://lkml.kernel.org/r/20210608144346.140254130@linutronix.de
2021-06-09x86/fpu: Invalidate FPU state after a failed XRSTOR from a user bufferAndy Lutomirski
Both Intel and AMD consider it to be architecturally valid for XRSTOR to fail with #PF but nonetheless change the register state. The actual conditions under which this might occur are unclear [1], but it seems plausible that this might be triggered if one sibling thread unmaps a page and invalidates the shared TLB while another sibling thread is executing XRSTOR on the page in question. __fpu__restore_sig() can execute XRSTOR while the hardware registers are preserved on behalf of a different victim task (using the fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but modify the registers. If this happens, then there is a window in which __fpu__restore_sig() could schedule out and the victim task could schedule back in without reloading its own FPU registers. This would result in part of the FPU state that __fpu__restore_sig() was attempting to load leaking into the victim task's user-visible state. Invalidate preserved FPU registers on XRSTOR failure to prevent this situation from corrupting any state. [1] Frequent readers of the errata lists might imagine "complex microarchitectural conditions". Fixes: 1d731e731c4c ("x86/fpu: Add a fastpath to __fpu__restore_sig()") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144345.758116583@linutronix.de
2021-06-09x86/fpu: Prevent state corruption in __fpu__restore_sig()Thomas Gleixner
The non-compacted slowpath uses __copy_from_user() and copies the entire user buffer into the kernel buffer, verbatim. This means that the kernel buffer may now contain entirely invalid state on which XRSTOR will #GP. validate_user_xstate_header() can detect some of that corruption, but that leaves the onus on callers to clear the buffer. Prior to XSAVES support, it was possible just to reinitialize the buffer, completely, but with supervisor states that is not longer possible as the buffer clearing code split got it backwards. Fixing that is possible but not corrupting the state in the first place is more robust. Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate() which validates the XSAVE header contents before copying the actual states to the kernel. copy_user_to_xstate() was previously only called for compacted-format kernel buffers, but it works for both compacted and non-compacted forms. Using it for the non-compacted form is slower because of multiple __copy_from_user() operations, but that cost is less important than robust code in an already slow path. [ Changelog polished by Dave Hansen ] Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates") Reported-by: syzbot+2067e764dbcd10721e2e@syzkaller.appspotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144345.611833074@linutronix.de
2021-06-03x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()Thomas Gleixner
While digesting the XSAVE-related horrors which got introduced with the supervisor/user split, the recent addition of ENQCMD-related functionality got on the radar and turned out to be similarly broken. update_pasid(), which is only required when X86_FEATURE_ENQCMD is available, is invoked from two places: 1) From switch_to() for the incoming task 2) Via a SMP function call from the IOMMU/SMV code #1 is half-ways correct as it hacks around the brokenness of get_xsave_addr() by enforcing the state to be 'present', but all the conditionals in that code are completely pointless for that. Also the invocation is just useless overhead because at that point it's guaranteed that TIF_NEED_FPU_LOAD is set on the incoming task and all of this can be handled at return to user space. #2 is broken beyond repair. The comment in the code claims that it is safe to invoke this in an IPI, but that's just wishful thinking. FPU state of a running task is protected by fregs_lock() which is nothing else than a local_bh_disable(). As BH-disabled regions run usually with interrupts enabled the IPI can hit a code section which modifies FPU state and there is absolutely no guarantee that any of the assumptions which are made for the IPI case is true. Also the IPI is sent to all CPUs in mm_cpumask(mm), but the IPI is invoked with a NULL pointer argument, so it can hit a completely unrelated task and unconditionally force an update for nothing. Worse, it can hit a kernel thread which operates on a user space address space and set a random PASID for it. The offending commit does not cleanly revert, but it's sufficient to force disable X86_FEATURE_ENQCMD and to remove the broken update_pasid() code to make this dysfunctional all over the place. Anything more complex would require more surgery and none of the related functions outside of the x86 core code are blatantly wrong, so removing those would be overkill. As nothing enables the PASID bit in the IA32_XSS MSR yet, which is required to make this actually work, this cannot result in a regression except for related out of tree train-wrecks, but they are broken already today. Fixes: 20f0afd1fb3d ("x86/mmu: Allocate/free a PASID") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87mtsd6gr9.ffs@nanos.tec.linutronix.de
2021-05-19x86/signal: Introduce helpers to get the maximum signal frame sizeChang S. Bae
Signal frames do not have a fixed format and can vary in size when a number of things change: supported XSAVE features, 32 vs. 64-bit apps, etc. Add support for a runtime method for userspace to dynamically discover how large a signal stack needs to be. Introduce a new variable, max_frame_size, and helper functions for the calculation to be used in a new user interface. Set max_frame_size to a system-wide worst-case value, instead of storing multiple app-specific values. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Len Brown <len.brown@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H.J. Lu <hjl.tools@gmail.com> Link: https://lkml.kernel.org/r/20210518200320.17239-3-chang.seok.bae@intel.com
2021-03-18x86: Fix various typos in commentsIngo Molnar
Fix ~144 single-word typos in arch/x86/ code comments. Doing this in a single commit should reduce the churn. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
2021-01-29x86/fpu/xstate: Use sizeof() instead of a constantYejune Deng
Use sizeof() instead of a constant in fpstate_sanitize_xstate(). Remove use of the address of the 0th array element of ->st_space and ->xmm_space which is equivalent to the array address itself: No code changed: # arch/x86/kernel/fpu/xstate.o: text data bss dec hex filename 9694 899 4 10597 2965 xstate.o.before 9694 899 4 10597 2965 xstate.o.after md5: 5a43fc70bad8e2a1784f67f01b71aabb xstate.o.before.asm 5a43fc70bad8e2a1784f67f01b71aabb xstate.o.after.asm [ bp: Massage commit message. ] Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210122071925.41285-1-yejune.deng@gmail.com
2021-01-21x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize stateAndy Lutomirski
Currently, requesting kernel FPU access doesn't distinguish which parts of the extended ("FPU") state are needed. This is nice for simplicity, but there are a few cases in which it's suboptimal: - The vast majority of in-kernel FPU users want XMM/YMM/ZMM state but do not use legacy 387 state. These users want MXCSR initialized but don't care about the FPU control word. Skipping FNINIT would save time. (Empirically, FNINIT is several times slower than LDMXCSR.) - Code that wants MMX doesn't want or need MXCSR initialized. _mmx_memcpy(), for example, can run before CR4.OSFXSR gets set, and initializing MXCSR will fail because LDMXCSR generates an #UD when the aforementioned CR4 bit is not set. - Any future in-kernel users of XFD (eXtended Feature Disable)-capable dynamic states will need special handling. Add a more specific API that allows callers to specify exactly what they want. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Krzysztof Piotr Olędzki <ole@ans.pl> Link: https://lkml.kernel.org/r/aff1cac8b8fc7ee900cf73e8f2369966621b053f.1611205691.git.luto@kernel.org
2020-10-12Merge tag 'x86_fpu_for_v5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Borislav Petkov: - Allow clearcpuid= to accept multiple bits (Arvind Sankar) - Move clearcpuid= parameter handling earlier in the boot, away from the FPU init code and to a generic location (Mike Hommey) * tag 'x86_fpu_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Handle FPU-related and clearcpuid command line arguments earlier x86/fpu: Allow multiple bits in clearcpuid= parameter
2020-09-22x86/fpu: Handle FPU-related and clearcpuid command line arguments earlierMike Hommey
FPU initialization handles them currently. However, in the case of clearcpuid=, some other early initialization code may check for features before the FPU initialization code is called. Handling the argument earlier allows the command line to influence those early initializations. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200921215638.37980-1-mh@glandium.org
2020-09-17x86/mmu: Allocate/free a PASIDFenghua Yu
A PASID is allocated for an "mm" the first time any thread binds to an SVA-capable device and is freed from the "mm" when the SVA is unbound by the last thread. It's possible for the "mm" to have different PASID values in different binding/unbinding SVA cycles. The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is propagated to a per-thread PASID MSR for all threads within the mm through IPI, context switch, or inherited. This is done to ensure that a running thread has the right PASID in the MSR matching the mm's PASID. [ bp: s/SVM/SVA/g; massage. ] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com
2020-09-17x86/fpu/xstate: Add supervisor PASID state for ENQCMDYu-cheng Yu
The ENQCMD instruction reads a PASID from the IA32_PASID MSR. The MSR is stored in the task's supervisor XSAVE* PASID state and is context-switched by XSAVES/XRSTORS. [ bp: Add (in-)definite articles and massage. ] Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-6-git-send-email-fenghua.yu@intel.com