path: root/tools/arch/x86/lib
2025-06-16tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'Arnaldo Carvalho de Melo
Also add SYM_PIC_ALIAS() to tools/perf/util/include/linux/linkage.h. This is to get the changes from: 419cbaf6a56a6e4b ("x86/boot: Add a bunch of PIC aliases") That addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aEry7L3fibwIG5au@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-06x86/insn: Stop decoding i64 instructions in x86-64 mode at opcodeMasami Hiramatsu (Google)
In commit 2e044911be75 ("x86/traps: Decode 0xEA instructions as #UD") FineIBT started using 0xEA as an invalid instruction, like UD2. But the insn decoder always returns the length of the "0xea" instruction as 7 because it does not check the (i64) superscript. The x86 instruction decoder should also decode 0xEA on x86-64 as a one-byte invalid instruction, by honoring the "(i64)" superscript tag. Stop decoding instructions which have the (i64) superscript but not the (o64) superscript at the opcode stage in 64-bit mode, and skip the remaining fields. With this change, insn_decoder_test reports 0xea as being 1 byte long on x86-64 (the -y option means 64-bit): $ printf "0:\tea\t\n" | insn_decoder_test -y -v insn_decoder_test: success: Decoded and checked 1 instructions Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/174580490000.388420.5225447607417115496.stgit@devnote2
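For reference, the (i64) superscript comes straight from the opcode map: the far-JMP entry for 0xEA in arch/x86/lib/x86-opcode-map.txt looks roughly like this (excerpt from memory, shown for illustration only):

  ea: JMP Ap (i64)

Before this change the decoder ignored the (i64) tag in 64-bit mode and kept decoding the Ap (6-byte far pointer) operand, hence the bogus 7-byte length.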
2025-05-06x86/insn: Fix opcode map (!REX2) superscript tagsMasami Hiramatsu (Google)
Commit: 159039af8c07 ("x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map") added the (!REX2) superscript with a space, but the correct format requires ',' for concatenation with other superscript tags. Add ',' to generate correct insn attribute tables. I confirmed it with the following command: arch/x86/lib/x86-opcode-map.txt | grep e8 | head -n 1 [0xe8] = INAT_MAKE_IMM(INAT_IMM_VWORD32) | INAT_FORCE64 | INAT_NO_REX2, Fixes: 159039af8c07 ("x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/174580489027.388420.15539375184727726142.stgit@devnote2
2025-04-24x86/insn: Fix CTEST instruction decodingKirill A. Shutemov
insn_decoder_test found a problem with decoding APX CTEST instructions: Found an x86 instruction decoder bug, please report this. ffffffff810021df 62 54 94 05 85 ff ctestneq objdump says 6 bytes, but insn_get_length() says 5 It happens because x86-opcode-map.txt doesn't specify arguments for the instruction, so the decoder doesn't expect to see a ModRM byte. Fixes: 690ca3a3067f ("x86/insn: Add support for APX EVEX instructions to the opcode map") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org # v6.10+ Link: https://lore.kernel.org/r/20250423065815.2003231-1-kirill.shutemov@linux.intel.com
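The fix lives in the opcode map rather than in the decoder logic: once an entry carries operand forms, the decoder knows a ModRM byte (and whatever it implies) follows. A hypothetical before/after for the map 4 entry involved, with operands and annotations shown only for illustration (not a verbatim copy of x86-opcode-map.txt):

  85: CTESTSCC (ev)              # no operands listed -> no ModRM expected
  85: CTESTSCC Ev,Gv (ev)        # operand forms present -> ModRM is decoded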
2025-04-10tools headers: Update the arch/x86/lib/memset_64.S copy with the kernel sourcesNamhyung Kim
To pick up the changes in: 2981557cb0408e14 x86,kcfi: Fix EXPORT_SYMBOL vs kCFI That required adding a copy of include/linux/cfi_types.h and checking it in tools/perf/check-headers.sh. Addressing this perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Please see tools/include/uapi/README for further details. Acked-by: Ingo Molnar <mingo@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: x86@kernel.org Link: https://lore.kernel.org/r/20250410001125.391820-11-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28tools/x86: Fix linux/unaligned.h include path in lib/insn.cIan Rogers
tools/arch/x86/include/linux doesn't exist, but the build works by virtue of a -I include path. When building with Bazel, this fails. Use angle brackets to include unaligned.h so there isn't an invalid relative include. Fixes: 5f60d5f6bbc1 ("move asm/unaligned.h to linux/unaligned.h") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20250225193600.90037-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
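As an illustration of the kind of change (the exact original include line is not reproduced here), it amounts to switching from a quoted include that only resolves thanks to a -I flag to an angle-bracket include resolved through the normal search path:

  /* before: "works" only because of a -I, there is no such relative path */
  #include "linux/unaligned.h"
  /* after: resolved via the include search path */
  #include <linux/unaligned.h>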
2024-10-02move asm/unaligned.h to linux/unaligned.hAl Viro
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-05-02x86/insn: Add support for APX EVEX instructions to the opcode mapAdrian Hunter
To support APX functionality, the EVEX prefix is used to: - promote legacy instructions - promote VEX instructions - add new instructions Promoted VEX instructions require no extra annotation because the opcodes do not change and the permissive nature of the instruction decoder already allows them to have an EVEX prefix. Promoted legacy instructions and new instructions are placed in map 4 which has not been used before. Create a new table for map 4 and add APX instructions. Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add support for APX EVEX to the instruction decoder logic". SCALABLE instructions must be represented in both no-prefix (NP) and 66 prefix forms. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-9-adrian.hunter@intel.com
2024-05-02x86/insn: Add support for APX EVEX to the instruction decoder logicAdrian Hunter
Intel Advanced Performance Extensions (APX) extends the EVEX prefix to support: - extended general purpose registers (EGPRs) i.e. r16 to r31 - Push-Pop Acceleration (PPX) hints - new data destination (NDD) register - suppress status flags writes (NF) of common instructions - new instructions Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. The extended EVEX prefix does not need amended instruction decoder logic, except in one area. Some instructions are defined as SCALABLE which means the EVEX.W bit and EVEX.pp bits are used to determine operand size. Specifically, if an instruction is SCALABLE and EVEX.W is zero, then EVEX.pp value 0 (representing no prefix NP) means default operand size, whereas EVEX.pp value 1 (representing 66 prefix) means operand size override i.e. 16 bits Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and amend the logic appropriately. Amend the awk script that generates the attribute tables from the opcode map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com
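A minimal sketch of the SCALABLE operand-size rule described above; the helper and the EVEX.W=1 branch are assumptions for illustration, not kernel code:

  /* Operand size, in bits, of a SCALABLE (es) instruction. */
  static int scalable_operand_bits(int evex_w, int evex_pp)
  {
          if (evex_w)
                  return 64;              /* assumed: W=1 selects 64-bit operands */
          return evex_pp == 1 ? 16 : 32;  /* pp=1 (66 prefix): size override to 16-bit;
                                             pp=0 (NP): default operand size (32-bit) */
  }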
2024-05-02x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode mapAdrian Hunter
Support for REX2 has been added to the instruction decoder logic and the awk script that generates the attribute tables from the opcode map. Add REX2 prefix byte (0xD5) to the opcode map. Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2. Add JMPABS to the opcode map and add annotation (REX2) to identify that it has a mandatory REX2 prefix. A separate opcode attribute table is not needed at this time because JMPABS has the same attribute encoding as the MOV instruction that it shares an opcode with i.e. INAT_MOFFSET. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-7-adrian.hunter@intel.com
2024-05-02x86/insn: Add support for REX2 prefix to the instruction decoder logicAdrian Hunter
Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31. The REX2 prefix is effectively an extended version of the REX prefix. REX2 and EVEX are also used with PUSH/POP instructions to provide a Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to fast-forward register data between matching PUSH and POP instructions. REX2 is valid only with opcodes in maps 0 and 1. Similar extension for other maps is provided by the EVEX prefix, covered in a separate patch. Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used for a new 64-bit absolute direct jump instruction JMPABS. Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify opcodes (only JMPABS) that require a mandatory REX2 prefix (INAT_REX2_VARIANT). Amend logic to read the REX2 prefix and get the opcode attribute for the map number (0 or 1) encoded in the REX2 prefix. Amend the awk script that generates the attribute tables from the opcode map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)" as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com
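A rough sketch of what reading the prefix involves; the payload bit position is an assumption (the M0 map-select bit as described in the APX spec), not verified against the kernel's macros:

  /* REX2 is the byte 0xD5 followed by one payload byte; one payload bit (M0)
   * selects whether the next opcode byte is looked up in map 0 or map 1. */
  #define REX2_PREFIX   0xd5
  #define REX2_MAP_BIT  0x80   /* assumed position of the M0 map-select bit */

  static int rex2_opcode_map(const unsigned char *insn_bytes)
  {
          if (insn_bytes[0] != REX2_PREFIX)
                  return -1;                              /* not a REX2-prefixed insn */
          return (insn_bytes[1] & REX2_MAP_BIT) ? 1 : 0;  /* map number: 0 or 1 */
  }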
2024-05-02x86/insn: Add misc new Intel instructionsAdrian Hunter
The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Add instructions documented in Intel Architecture Instruction Set Extensions and Future Features Programming Reference March 2024 319433-052, that have not been added yet: AADD AAND AOR AXOR CMPccXADD PBNDKB RDMSRLIST URDMSR UWRMSR VBCSTNEBF162PS VBCSTNESH2PS VCVTNEEBF162PS VCVTNEEPH2PS VCVTNEOBF162PS VCVTNEOPH2PS VCVTNEPS2BF16 VPDPB[SU,UU,SS]D[,S] VPDPW[SU,US,UU]D[,S] VPMADD52HUQ VPMADD52LUQ VSHA512MSG1 VSHA512MSG2 VSHA512RNDS2 VSM3MSG1 VSM3MSG2 VSM3RNDS2 VSM4KEY4 VSM4RNDS4 WRMSRLIST TCMMIMFP16PS TCMMRLFP16PS TDPFP16PS PREFETCHIT1 PREFETCHIT0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-5-adrian.hunter@intel.com
2024-05-02x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDSAdrian Hunter
The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Intel Architecture Instruction Set Extensions and Future Features manual number 319433-044 of May 2021, documented VEX versions of instructions VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them listed as EVEX only. Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX or EVEX prefix. Fixes: 0153d98f2dd6 ("x86/insn: Add misc instructions to x86 instruction decoder") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-4-adrian.hunter@intel.com
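Conceptually the opcode-map change looks like this; the entry below is illustrative only (operand letters and prefix tags not verified against x86-opcode-map.txt), the point being that dropping the (ev) tag lets the instruction decode with either a VEX or an EVEX prefix:

  50: vpdpbusd Vx,Hx,Wx (66),(ev)    # before: EVEX-only
  50: vpdpbusd Vx,Hx,Wx (66)         # after: VEX or EVEX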
2024-05-02x86/insn: Fix PUSH instruction in x86 instruction decoder opcode mapAdrian Hunter
The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size only i.e. (d64). That was based on Intel SDM Opcode Map. However that is contradicted by the Instruction Set Reference section for PUSH in the same manual. Remove 64-bit operand size only annotation from opcode 0x68 PUSH instruction. Example: $ cat pushw.s .global _start .text _start: pushw $0x1234 mov $0x1,%eax # system call number (sys_exit) int $0x80 $ as -o pushw.o pushw.s $ ld -s -o pushw pushw.o $ objdump -d pushw | tail -4 0000000000401000 <.text>: 401000: 66 68 34 12 pushw $0x1234 401004: b8 01 00 00 00 mov $0x1,%eax 401009: cd 80 int $0x80 $ perf record -e intel_pt//u ./pushw [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB perf.data ] Before: $ perf script --insn-trace=disasm Warning: 1 instruction trace errors pushw 10349 [000] 10586.869237014: 401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) pushw $0x1234 pushw 10349 [000] 10586.869237014: 401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %al, (%rax) pushw 10349 [000] 10586.869237014: 401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %cl, %ch pushw 10349 [000] 10586.869237014: 40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb $0x2e, (%rax) instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction After: $ perf script --insn-trace=disasm pushw 10349 [000] 10586.869237014: 401000 [unknown] (./pushw) pushw $0x1234 pushw 10349 [000] 10586.869237014: 401004 [unknown] (./pushw) movl $1, %eax Fixes: eb13296cfaf6 ("x86: Instruction decoder API") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-3-adrian.hunter@intel.com
2024-05-02x86/insn: Add Key Locker instructions to the opcode mapChang S. Bae
The x86 instruction decoder needs to know these new instructions that are going to be used in the crypto library as well as the x86 core code. Add the following: LOADIWKEY: Load a CPU-internal wrapping key. ENCODEKEY128: Wrap a 128-bit AES key to a key handle. ENCODEKEY256: Wrap a 256-bit AES key to a key handle. AESENC128KL: Encrypt a 128-bit block of data using a 128-bit AES key indicated by a key handle. AESENC256KL: Encrypt a 128-bit block of data using a 256-bit AES key indicated by a key handle. AESDEC128KL: Decrypt a 128-bit block of data using a 128-bit AES key indicated by a key handle. AESDEC256KL: Decrypt a 128-bit block of data using a 256-bit AES key indicated by a key handle. AESENCWIDE128KL: Encrypt 8 128-bit blocks of data using a 128-bit AES key indicated by a key handle. AESENCWIDE256KL: Encrypt 8 128-bit blocks of data using a 256-bit AES key indicated by a key handle. AESDECWIDE128KL: Decrypt 8 128-bit blocks of data using a 128-bit AES key indicated by a key handle. AESDECWIDE256KL: Decrypt 8 128-bit blocks of data using a 256-bit AES key indicated by a key handle. The detail can be found in Intel Software Developer Manual. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240502105853.5338-2-adrian.hunter@intel.com
2024-03-11Merge tag 'x86-asm-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull x86 asm updates from Ingo Molnar: "Two changes to simplify the x86 decoder logic a bit" * tag 'x86-asm-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/insn: Directly assign x86_64 state in insn_init() x86/insn: Remove superfluous checks from instruction decoding routines
2024-03-11Merge tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull x86 FRED support from Thomas Gleixner: "Support for x86 Fast Return and Event Delivery (FRED). FRED is a replacement for IDT event delivery on x86 and addresses most of the technical nightmares which IDT exposes: 1) Exception cause registers like CR2 need to be manually preserved in nested exception scenarios. 2) Hardware interrupt stack switching is suboptimal for nested exceptions as the interrupt stack mechanism rewinds the stack on each entry which requires a massive effort in the low level entry of #NMI code to handle this. 3) No hardware distinction between entry from kernel or from user which makes establishing kernel context more complex than it needs to be especially for unconditionally nestable exceptions like NMI. 4) NMI nesting caused by IRET unconditionally reenabling NMIs, which is a problem when the perf NMI takes a fault when collecting a stack trace. 5) Partial restore of ESP when returning to a 16-bit segment. 6) Limitation of the vector space which can cause vector exhaustion on large systems. 7) Inability to differentiate NMI sources. FRED addresses these shortcomings by: 1) An extended exception stack frame which the CPU uses to save exception cause registers. This ensures that the meta information for each exception is preserved on stack and avoids the extra complexity of preserving it in software. 2) Hardware interrupt stack switching is non-rewinding if a nested exception uses the current interrupt stack. 3) The entry points for kernel and user context are separate and GS BASE handling which is required to establish kernel context for per CPU variable access is done in hardware. 4) NMIs are now nesting protected. They are only reenabled on the return from NMI. 5) FRED guarantees full restore of ESP. 6) FRED does not put a limitation on the vector space by design because it uses central entry points for kernel and user space and the CPU stores the entry type (exception, trap, interrupt, syscall) on the entry stack along with the vector number. The entry code has to demultiplex this information, but this removes the vector space restriction. The first hardware implementations will still have the current restricted vector space because lifting this limitation requires further changes to the local APIC. 7) FRED stores the vector number and meta information on stack which allows having more than one NMI vector in future hardware when the required local APIC changes are in place. The series implements the initial FRED support by: - Reworking the existing entry and IDT handling infrastructure to accommodate the alternative entry mechanism. - Expanding the stack frame to accommodate the extra 16 bytes FRED requires to store context and meta information - Providing FRED specific C entry points for events which have information pushed to the extended stack frame, e.g. #PF and #DB. - Providing FRED specific C entry points for #NMI and #MCE - Implementing the FRED specific ASM entry points and the C code to demultiplex the events - Providing detection and initialization mechanisms and the necessary tweaks in context switching, GS BASE handling etc. The FRED integration aims for maximum code reuse vs the existing IDT implementation to the extent possible, and the deviations in hot paths like context switching are handled with alternatives to minimize the impact. 
The low level entry and exit paths are separate due to the extended stack frame and the hardware based GS BASE switching and therefore have no impact on IDT based systems. It has been extensively tested on existing systems and on the FRED simulation and as of now there are no outstanding problems" * tag 'x86-fred-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) x86/fred: Fix init_task thread stack pointer initialization MAINTAINERS: Add a maintainer entry for FRED x86/fred: Fix a build warning with allmodconfig due to 'inline' failing to inline properly x86/fred: Invoke FRED initialization code to enable FRED x86/fred: Add FRED initialization functions x86/syscall: Split IDT syscall setup code into idt_syscall_init() KVM: VMX: Call fred_entry_from_kvm() for IRQ/NMI handling x86/entry: Add fred_entry_from_kvm() for VMX to handle IRQ/NMI x86/entry/calling: Allow PUSH_AND_CLEAR_REGS being used beyond actual entry code x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user x86/fred: Let ret_from_fork_asm() jmp to asm_fred_exit_user when FRED is enabled x86/traps: Add sysvec_install() to install a system interrupt handler x86/fred: FRED entry/exit and dispatch code x86/fred: Add a machine check entry stub for FRED x86/fred: Add a NMI entry stub for FRED x86/fred: Add a debug fault entry stub for FRED x86/idtentry: Incorporate definitions/declarations of the FRED entries x86/fred: Make exc_page_fault() work for FRED x86/fred: Allow single-step trap and NMI when starting a new task x86/fred: No ESPFIX needed when FRED is enabled ...
2024-02-22x86/insn: Directly assign x86_64 state in insn_init()Nikolay Borisov
No point in checking again as this was already done by the caller. Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240222111636.2214523-3-nik.borisov@suse.com
2024-02-22x86/insn: Remove superfluous checks from instruction decoding routinesNikolay Borisov
It's pointless to check whether a particular part of an instruction has been decoded before calling the routine responsible for decoding it, as this check is duplicated in the routines themselves. Streamline the code by removing the superfluous checks. No functional difference. Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240222111636.2214523-2-nik.borisov@suse.com
2024-01-31x86/opcode: Add ERET[US] instructions to the x86 opcode mapH. Peter Anvin (Intel)
ERETU returns from an event handler while making a transition to ring 3, and ERETS returns from an event handler while staying in ring 0. Add instruction opcodes used by ERET[US] to the x86 opcode map; opcode numbers are per FRED spec v5.0. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Shan Kang <shan.kang@intel.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20231205105030.8698-10-xin3.li@intel.com
2024-01-30tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'Arnaldo Carvalho de Melo
This is to get the changes from: 94ea9c05219518ef ("x86/headers: Replace #include <asm/export.h> with #include <linux/export.h>") 10f4c9b9a33b7df0 ("x86/asm: Fix build of UML with KASAN") That addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Link: https://lore.kernel.org/lkml/ZbkIKpKdNqOFdMwJ@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-25x86/cpufeatures,opcode,msr: Add the WRMSRNS instruction supportXin Li
WRMSRNS is an instruction that behaves exactly like WRMSR, with the only difference being that it is not a serializing instruction by default. Under certain conditions, WRMSRNS may replace WRMSR to improve performance. Add its CPU feature bit, opcode to the x86 opcode map, and an always inline API __wrmsrns() to embed WRMSRNS into the code. Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Shan Kang <shan.kang@intel.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231205105030.8698-2-xin3.li@intel.com
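For illustration, a sketch of what such an always-inline wrapper could look like; this is an assumed shape, not a copy of the kernel's __wrmsrns(). WRMSRNS is encoded as 0f 01 c6 and, like WRMSR, takes the MSR index in ECX and the 64-bit value in EDX:EAX:

  static __always_inline void wrmsrns_sketch(unsigned int msr, unsigned long long val)
  {
          asm volatile(".byte 0x0f, 0x01, 0xc6"   /* WRMSRNS */
                       : /* no outputs */
                       : "c" (msr), "a" ((unsigned int)val), "d" ((unsigned int)(val >> 32))
                       : "memory");
  }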
2023-05-17tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'Arnaldo Carvalho de Melo
This is to get the changes from: 68674f94ffc9dddc ("x86: don't use REP_GOOD or ERMS for small memory copies") 20f3337d350c4e1b ("x86: don't use REP_GOOD or ERMS for small memory clearing") This also makes the 'perf bench mem' files stop referring to the erms versions that went away with the above patches. That addresses these perf tools build warnings: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-10Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linuxLinus Torvalds
Pull perf tools fixes from Arnaldo Carvalho de Melo: - Add Adrian Hunter to MAINTAINERS as a perf tools reviewer - Sync various tools/ copies of kernel headers with the kernel sources, this time trying to avoid first merging with upstream to then update but instead copy from upstream so that a merge is avoided and the end result after merging this pull request is the one expected, tools/perf/check-headers.sh (mostly) happy, fewer warnings while building tools/perf/ - Fix counting when an initial delay is configured, by setting perf_attr.enable_on_exec when starting workloads from the perf command line - Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject --buildid-all' when that record comes with a build-id, otherwise we end up not being able to resolve symbols - Don't use comma as the CSV output separator in the "stat+csv_output" test, as comma can appear on some tests as a modifier for an event, use @ instead, ditto for the JSON linter test - The offcpu test was looking for some bits being set on task_struct->prev_state without masking other bits not important for this specific 'perf test', fix it * tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer tools headers UAPI: Sync linux/perf_event.h with the kernel sources tools headers x86 cpufeatures: Sync with the kernel sources tools include UAPI: Sync linux/vhost.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources tools include UAPI: Synchronize linux/fcntl.h with the kernel sources tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources tools headers UAPI: Sync linux/prctl.h with the kernel sources tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench' perf stat: Fix counting when initial delay configured tools headers svm: Sync svm headers with the kernel sources perf test: Avoid counting commas in json linter perf tests stat+csv_output: Switch CSV separator to @ perf inject: Fix --buildid-all not to eat up MMAP2 tools arch x86: Sync the msr-index.h copy with the kernel sources perf test: Fix offcpu test prev_state check
2023-03-03tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'Arnaldo Carvalho de Melo
We also continue with SYM_TYPED_FUNC_START() in util/include/linux/linkage.h and with an exception in tools/perf/check_headers.sh's diff check to ignore the include cfi_types.h line when checking if the kernel original files drifted from the copies we carry. This is to get the changes from: 69d4c0d3218692ff ("entry, kasan, x86: Disallow overriding mem*() functions") That addresses these perf tools build warnings: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/ZAH%2FjsioJXGIOrkf@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-12x86/opcode: Add the LKGS instruction to x86-opcode-mapH. Peter Anvin (Intel)
Add the instruction opcode used by LKGS to x86-opcode-map. Opcode number is per public FRED draft spec v3.0. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230112072032.35626-3-xin3.li@intel.com
2022-10-25tools headers: Update the copy of x86's memcpy_64.S used in 'perf bench'Arnaldo Carvalho de Melo
We also need to add SYM_TYPED_FUNC_START() to util/include/linux/linkage.h and update tools/perf/check_headers.sh to ignore the include cfi_types.h line when checking if the kernel original files drifted from the copies we carry. This is to get the changes from: ccace936eec7b805 ("x86: Add types to indirectly called assembly functions") Addressing these tools/perf build warnings: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/lkml/Y1f3VRIec9EBgX6F@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-21Merge tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull misc x86 updates from Borislav Petkov: - Add support for a couple of new insn sets to the insn decoder: AVX512-FP16, AMX, other misc insns. - Update VMware-specific MAINTAINERS entries * tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Mark VMware mailing list entries as email aliases MAINTAINERS: Add Zack as maintainer of vmmouse driver MAINTAINERS: Update maintainers for paravirt ops and VMware hypervisor interface x86/insn: Add AVX512-FP16 instructions to the x86 instruction decoder perf/tests: Add AVX512-FP16 instructions to x86 instruction decoder test x86/insn: Add misc instructions to x86 instruction decoder perf/tests: Add misc instructions to the x86 instruction decoder test x86/insn: Add AMX instructions to the x86 instruction decoder perf/tests: Add AMX instructions to x86 instruction decoder test
2022-02-22x86: clean up symbol aliasingMark Rutland
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those to simplify the definition of function aliases across arch/x86. For clarity, where there are multiple annotations such as EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For example, where a function has a name and an alias which are both exported, this is organised as: SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) EXPORT_SYMBOL(func) SYM_FUNC_ALIAS(alias, func) EXPORT_SYMBOL(alias) Where there are only aliases and no exports or other annotations, I have not bothered with line spacing, e.g. SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) SYM_FUNC_ALIAS(alias, func) The tools/perf/ copies of memcpy_64.S and memset_64.S are updated likewise to avoid the build system complaining these are mismatched: | Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' | diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S | Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' | diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-01-23x86/insn: Add AVX512-FP16 instructions to the x86 instruction decoderAdrian Hunter
The x86 instruction decoder is used for both kernel instructions and user space instructions (e.g. uprobes, perf tools Intel PT), so it is good to update it with new instructions. Add AVX512-FP16 instructions to x86 instruction decoder. Note the EVEX map field is extended by 1 bit, and most instructions are in map 5 and map 6. Reference: Intel AVX512-FP16 Architecture Specification June 2021 Revision 1.0 Document Number: 347407-001US Example using perf tools' x86 instruction decoder test: $ perf test -v "x86 instruction decoder" |& grep vfcmaddcph | head -2 Decoded ok: 62 f6 6f 48 56 cb vfcmaddcph %zmm3,%zmm2,%zmm1 Decoded ok: 62 f6 6f 48 56 8c c8 78 56 34 12 vfcmaddcph 0x12345678(%eax,%ecx,8),%zmm2,%zmm1 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20211202095029.2165714-7-adrian.hunter@intel.com
2022-01-23x86/insn: Add misc instructions to x86 instruction decoderAdrian Hunter
x86 instruction decoder is used for both kernel instructions and user space instructions (e.g. uprobes, perf tools Intel PT), so it is good to update it with new instructions. Add instructions to x86 instruction decoder: User Interrupt clui senduipi stui testui uiret Prediction history reset hreset Serialize instruction execution serialize TSX suspend load address tracking xresldtrk xsusldtrk Reference: Intel Architecture Instruction Set Extensions and Future Features Programming Reference May 2021 Document Number: 319433-044 Example using perf tools' x86 instruction decoder test: $ perf test -v "x86 instruction decoder" |& grep -i hreset Decoded ok: f3 0f 3a f0 c0 00 hreset $0x0 Decoded ok: f3 0f 3a f0 c0 00 hreset $0x0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20211202095029.2165714-5-adrian.hunter@intel.com
2022-01-23x86/insn: Add AMX instructions to the x86 instruction decoderAdrian Hunter
The x86 instruction decoder is used for both kernel instructions and user space instructions (e.g. uprobes, perf tools Intel PT), so it is good to update it with new instructions. Add AMX instructions to the x86 instruction decoder. Reference: Intel Architecture Instruction Set Extensions and Future Features Programming Reference May 2021 Document Number: 319433-044 Example using perf tools' x86 instruction decoder test: $ INSN='ldtilecfg\|sttilecfg\|tdpbf16ps\|tdpbssd\|' $ INSN+='tdpbsud\|tdpbusd\|'tdpbuud\|tileloadd\|' $ INSN+='tileloaddt1\|tilerelease\|tilestored\|tilezero' $ perf test -v "x86 instruction decoder" |& grep -i $INSN Decoded ok: c4 e2 78 49 04 c8 ldtilecfg (%rax,%rcx,8) Decoded ok: c4 c2 78 49 04 c8 ldtilecfg (%r8,%rcx,8) Decoded ok: c4 e2 79 49 04 c8 sttilecfg (%rax,%rcx,8) Decoded ok: c4 c2 79 49 04 c8 sttilecfg (%r8,%rcx,8) Decoded ok: c4 e2 7a 5c d1 tdpbf16ps %tmm0,%tmm1,%tmm2 Decoded ok: c4 e2 7b 5e d1 tdpbssd %tmm0,%tmm1,%tmm2 Decoded ok: c4 e2 7a 5e d1 tdpbsud %tmm0,%tmm1,%tmm2 Decoded ok: c4 e2 79 5e d1 tdpbusd %tmm0,%tmm1,%tmm2 Decoded ok: c4 e2 78 5e d1 tdpbuud %tmm0,%tmm1,%tmm2 Decoded ok: c4 e2 7b 4b 0c c8 tileloadd (%rax,%rcx,8),%tmm1 Decoded ok: c4 c2 7b 4b 14 c8 tileloadd (%r8,%rcx,8),%tmm2 Decoded ok: c4 e2 79 4b 0c c8 tileloaddt1 (%rax,%rcx,8),%tmm1 Decoded ok: c4 c2 79 4b 14 c8 tileloaddt1 (%r8,%rcx,8),%tmm2 Decoded ok: c4 e2 78 49 c0 tilerelease Decoded ok: c4 e2 7a 4b 0c c8 tilestored %tmm1,(%rax,%rcx,8) Decoded ok: c4 c2 7a 4b 14 c8 tilestored %tmm2,(%r8,%rcx,8) Decoded ok: c4 e2 7b 49 c0 tilezero %tmm0 Decoded ok: c4 e2 7b 49 f8 tilezero %tmm7 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20211202095029.2165714-3-adrian.hunter@intel.com
2022-01-13tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'Arnaldo Carvalho de Melo
To bring in the change made in this cset: f94909ceb1ed4bfd ("x86: Prepare asm files for straight-line-speculation") It silences these perf tools build warnings, no change in the tools: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S The code generated was checked before and after using 'objdump -d /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o', no changes. Cc: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-06x86/insn: Use get_unaligned() instead of memcpy()Borislav Petkov
Use get_unaligned() instead of memcpy() to access potentially unaligned memory, which, when accessed through a pointer, leads to undefined behavior. get_unaligned() describes much better what is happening there anyway even if memcpy() does the job. In addition, since perf tool builds with -Werror, it would fire with: util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix': tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed] 10 | const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ because -Werror=packed would complain if the packed attribute would have no effect on the layout of the structure. In this case, that is intentional so disable the warning only for that compilation unit. That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> No functional changes. Fixes: 5ba1071f7554 ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic
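A short example of the pattern involved; the function name here is made up for illustration, get_unaligned() is the helper the commit switches to (provided by asm-generic/unaligned.h in trees of that era, linux/unaligned.h today):

  /* Read a 32-bit value from a possibly unaligned instruction byte stream. */
  static unsigned int read_u32_example(const unsigned char *p)
  {
          return get_unaligned((const unsigned int *)p);
  }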
2021-09-24x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accessesNumfor Mbiziwo-Tiapo
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as these are forms of undefined behavior: "A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined." (from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) These problems were identified using the undefined behavior sanitizer (ubsan) with the tools version of the code and perf test. [ bp: Massage commit message. ] Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-05-10tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'Arnaldo Carvalho de Melo
To bring in the change made in this cset: 5e21a3ecad1500e3 ("x86/alternative: Merge include files") This just silences these perf tools build warnings, no change in the tools: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Borislav Petkov <bp@suse.de> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-17tools/insn: Restore the relative include paths for cross buildingBorislav Petkov
Building perf on ppc causes: In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:15: util/intel-pt-decoder/../../../arch/x86/lib/insn.c:14:10: fatal error: asm/inat.h: No such file or directory 14 | #include <asm/inat.h> /*__ignore_sync_check__ */ | ^~~~~~~~~~~~ Restore the relative include paths so that the compiler can find the headers. Fixes: 93281c4a9657 ("x86/insn: Add an insn_decode() API") Reported-by: Ian Rogers <irogers@google.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Ian Rogers <irogers@google.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lkml.kernel.org/r/20210317150858.02b1bbc8@canb.auug.org.au
2021-03-15x86/insn: Make insn_complete() staticBorislav Petkov
... and move it above the only place it is used. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-22-bp@alien8.de
2021-03-15x86/insn: Add an insn_decode() APIBorislav Petkov
Users of the instruction decoder should use this to decode instruction bytes. For that, have the insn*() helpers return an int value to denote success/failure. When there's an error fetching the next insn byte and the insn falls short, return -ENODATA to denote that. While at it, make insn_get_opcode() stricter as to whether what has been seen so far is a valid insn, and bail out if not. Copy linux/kconfig.h for the tools version of the decoder so that it can use IS_ENABLED(). Also, cast the INSN_MODE_KERN dummy define value to (enum insn_mode) for tools use of the decoder because perf tool builds with -Werror and errors out with -Werror=sign-compare otherwise. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-5-bp@alien8.de
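A minimal usage sketch of the API described above; the wrapper function is illustrative, while insn_decode(), INSN_MODE_64 and the int return convention are the ones this commit introduces:

  #include <asm/insn.h>

  /* Decode one instruction from a byte buffer and report its length. */
  static int show_insn_length(const unsigned char *buf, int buf_len)
  {
          struct insn insn;
          int ret;

          ret = insn_decode(&insn, buf, buf_len, INSN_MODE_64);
          if (ret < 0)
                  return ret;        /* e.g. -ENODATA when the insn falls short */

          return insn.length;
  }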
2021-03-15x86/insn: Add a __ignore_sync_check__ markerBorislav Petkov
Add an explicit __ignore_sync_check__ marker which will be used to mark lines which are supposed to be ignored by file synchronization check scripts, its advantage being that it explicitly denotes such lines in the code. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-4-bp@alien8.de
2021-03-15x86/insn: Add @buf_len param to insn_init() kernel-doc commentBorislav Petkov
It wasn't documented so add it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-3-bp@alien8.de
2021-01-13x86/insn: Fix vector instruction decoding on big endian cross-compilesVasily Gorbik
Running the instruction decoder posttest on an s390 host with an x86 target with allyesconfig shows errors. Instructions used in a couple of kernel objects could not be correctly decoded on a big endian system. insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 5 insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this. insn_decoder_test: warning: ffffffff831eb4e1: 62 d1 fd 48 7f 04 24 vmovdqa64 %zmm0,(%r12) insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6 insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this. insn_decoder_test: warning: ffffffff831eb4e8: 62 51 fd 48 7f 44 24 01 vmovdqa64 %zmm8,0x40(%r12) insn_decoder_test: warning: objdump says 8 bytes, but insn_get_length() says 6 This is because in a few places instruction field bytes are set directly, with further usage of "value". To address that, introduce and use an insn_set_byte() helper, which correctly updates "value" on big endian systems. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
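A rough sketch of the idea; the struct layout and helper below are simplified assumptions for illustration, not the kernel's exact definitions. Instruction fields keep both a per-byte view and an aggregate "value", so byte stores must go through a helper that keeps the two views consistent regardless of host endianness:

  struct insn_field_sketch {
          unsigned int value;       /* aggregate value of the field */
          unsigned char bytes[4];   /* the field's bytes in instruction order */
  };

  static void insn_field_set_byte(struct insn_field_sketch *f, int n, unsigned char v)
  {
          f->bytes[n] = v;
          /* Recompute the aggregate instead of aliasing bytes[] and value through
           * a union, which silently breaks on big-endian hosts. */
          f->value = f->bytes[0] | (f->bytes[1] << 8) |
                     (f->bytes[2] << 16) | ((unsigned int)f->bytes[3] << 24);
  }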
2021-01-13x86/insn: Support big endian cross-compilesMartin Schwidefsky
The x86 instruction decoder code is shared across the kernel source and the tools. Currently objtool seems to be the only needed build tool that breaks x86 cross-compilation on big endian systems. Make the x86 instruction decoder build host-endianness agnostic to support x86 cross-compilation, and enable objtool to implement endianness awareness for big endian architecture support. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Co-developed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-11-12tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'Arnaldo Carvalho de Melo
To bring in the change made in this cset: 4d6ffa27b8e5116c ("x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S") 6dcc5627f6aec4cb ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*") I needed to define SYM_FUNC_START_LOCAL() as SYM_L_GLOBAL as mem{cpy,set}_{orig,erms} are used by 'perf bench'. This silences these perf tools build warnings: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Fangrui Song <maskray@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-06x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()Dan Williams
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-07-03tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'Arnaldo Carvalho de Melo
To bring in the change made in this cset: e3a9e681adb7 ("x86/entry: Fixup bad_iret vs noinstr") This doesn't cause any functional changes to tooling, just a rebuild. Addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-26x86/insn: Add Control-flow Enforcement (CET) instructions to the opcode mapYu-cheng Yu
Add the following CET instructions to the opcode map: INCSSP: Increment Shadow Stack pointer (SSP). RDSSP: Read SSP into a GPR. SAVEPREVSSP: Use the "previous ssp" token at the top of the current Shadow Stack (SHSTK) to create a "restore token" on the previous (outgoing) SHSTK. RSTORSSP: Restore from a "restore token" to SSP. WRSS: Write to kernel-mode SHSTK (kernel-mode instruction). WRUSS: Write to user-mode SHSTK (kernel-mode instruction). SETSSBSY: Verify the "supervisor token" pointed to by MSR_IA32_PL0_SSP, set the token busy, and then set the Shadow Stack pointer (SSP) to the value of MSR_IA32_PL0_SSP. CLRSSBSY: Verify the "supervisor token" and clear its busy bit. ENDBR64/ENDBR32: Mark a valid 64/32 bit control transfer endpoint. Detailed information on the CET instructions can be found in the Intel Software Developer's Manual. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200204171425.28073-2-yu-cheng.yu@intel.com
2020-01-22x86/decoder: Add TEST opcode to Group3-2Masami Hiramatsu
Add the TEST opcode to Group3-2 reg=001b, the same as Group3-1 has. Commit 12a78d43de76 ("x86/decoder: Add new TEST instruction pattern") added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did not add f7 XX/001/XXX (Group 3-2). Actually, this TEST opcode variant (ModRM.reg /1) is not described in the Intel SDM Vol2 but in the AMD64 Architecture Programmer's Manual Vol.3, Appendix A.2 Table A-6, ModRM.reg Extensions for the Primary Opcode Map. Without this fix, Randy found a warning reported by insn_decoder_test related to this issue, as below. HOSTCC arch/x86/tools/insn_decoder_test HOSTCC arch/x86/tools/insn_sanity TEST posttest arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this. arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx) arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2 arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures TEST posttest arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c) To fix this error, add the TEST opcode according to AMD64 APM Vol.3. [ bp: Massage commit message. ] Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
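For illustration, the group tables in x86-opcode-map.txt give one entry per ModRM.reg value; conceptually the fix fills the /1 slot of Group 3-2 with the same TEST form the /0 slot already has. The excerpt below is illustrative only, not a verbatim copy of the map:

  GrpTable: Grp3_2
  0: TEST Ev,Iz
  1: TEST Ev,Iz
  2: NOT Ev
  3: NEG Ev
  ...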
2019-12-02perf bench: Update the copies of x86's mem{cpy,set}_64.SArnaldo Carvalho de Melo
And update linux/linkage.h, which requires in turn that we make these files switch from ENTRY()/ENDPROC() to SYM_FUNC_START()/SYM_FUNC_END(): tools/perf/arch/arm64/tests/regs_load.S tools/perf/arch/arm/tests/regs_load.S tools/perf/arch/powerpc/tests/regs_load.S tools/perf/arch/x86/tests/regs_load.S We also need to switch SYM_FUNC_START_LOCAL() to SYM_FUNC_START() for the functions used directly by 'perf bench', and update tools/perf/check_headers.sh to ignore those changes when checking if the kernel original files drifted from the copies we carry. This is to get the changes from: 6dcc5627f6ae ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*") ef1e03152cb0 ("x86/asm: Make some functions local") e9b9d020c487 ("x86/asm: Annotate aliases") And address these tools/perf build warnings: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-tay3l8x8k11p7y3qcpqh9qh5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-26x86/insn: Add some more Intel instructions to the opcode mapAdrian Hunter
Add to the opcode map the following instructions: v4fmaddps v4fmaddss v4fnmaddps v4fnmaddss vaesdec vaesdeclast vaesenc vaesenclast vcvtne2ps2bf16 vcvtneps2bf16 vdpbf16ps gf2p8affineinvqb vgf2p8affineinvqb gf2p8affineqb vgf2p8affineqb gf2p8mulb vgf2p8mulb vp2intersectd vp2intersectq vp4dpwssd vp4dpwssds vpclmulqdq vpcompressb vpcompressw vpdpbusd vpdpbusds vpdpwssd vpdpwssds vpexpandb vpexpandw vpopcntb vpopcntd vpopcntq vpopcntw vpshldd vpshldq vpshldvd vpshldvq vpshldvw vpshldw vpshrdd vpshrdq vpshrdvd vpshrdvq vpshrdvw vpshrdw vpshufbitqmb For information about the instructions, refer Intel SDM May 2019 (325462-070US) and Intel Architecture Instruction Set Extensions May 2019 (319433-037). The instruction decoding can be tested using the perf tools' "x86 instruction decoder - new instructions" test e.g. $ perf test -v "new " 2>&1 | grep -i 'v4fmaddps' Decoded ok: 62 f2 7f 48 9a 20 v4fmaddps (%eax),%zmm0,%zmm4 Decoded ok: 62 f2 7f 48 9a a4 c8 78 56 34 12 v4fmaddps 0x12345678(%eax,%ecx,8),%zmm0,%zmm4 Decoded ok: 62 f2 7f 48 9a 20 v4fmaddps (%rax),%zmm0,%zmm4 Decoded ok: 67 62 f2 7f 48 9a 20 v4fmaddps (%eax),%zmm0,%zmm4 Decoded ok: 62 f2 7f 48 9a a4 c8 78 56 34 12 v4fmaddps 0x12345678(%rax,%rcx,8),%zmm0,%zmm4 Decoded ok: 67 62 f2 7f 48 9a a4 c8 78 56 34 12 v4fmaddps 0x12345678(%eax,%ecx,8),%zmm0,%zmm4 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20191125125044.31879-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>